

Detection of head and neck cancer based on longitudinal changes in serum protein abundance

Ju Yeon Lee1,2, Tujin Shi1, Vladislav A. Petyuk1, Athena A. Schepmoes1, Thomas L. Fillmore1, Yi-Ting Wang1, Wayne Cardoni3, George Coppit3, Shiv Srivastava4,5, Joseph F. Goodman6, Craig D. Shriver4,5, Tao Liu1,#, Karin D. Rodland1,7,#

1Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington

2Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju, Republic of Korea

3Frederick Regional Health System, Frederick, Maryland

4Murtha Cancer Center Research Program, Department of Surgery, Uniformed Services University of the Health Sciences, Bethesda, Maryland

5John P. Murtha Cancer Center, Uniformed Services University and Walter Reed National Military Medical Center, Bethesda, Maryland

6Division of Otolaryngology, George Washington University, Washington, District of Columbia

7Department of Cell Developmental and Cancer Biology, Oregon Health and Science University, Portland, Oregon

Running title: Longitudinal serum biomarkers for early detection of OPSCC

Abbreviation list:

HNSCC: head and neck squamous cell carcinoma

OPSCC: oropharyngeal squamous cell carcinoma

HPV: human papilloma virus

DODSR: Department of Defense Serum Repository

LC: liquid chromatography

SRM: selective reaction monitoring

MS: mass spectrometry

#Corresponding authors:

Tao Liu, Pacific Northwest National Laboratory, 3335 Innovation Avenue, P.O. Box 999, MSIN K8-98, Richland, WA 99354. Phone: 509-371-6346; E-mail: [email protected]

Karin D. Rodland, Pacific Northwest National Laboratory, 3335 Innovation Avenue, P.O. Box 999, MSIN K8-98, Richland, WA 99354. Phone: 509-430-4668; E-mail: [email protected]

The authors declare no potential conflicts of interest.

Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited. Author Manuscript Published OnlineFirst on June 12, 2020; DOI: 10.1158/1055-9965.EPI-20-0192

Downloaded from http://cebp.aacrjournals.org/ on November 25, 2020. © 2020 American Association for Cancer Research.


Abstract

Background: Approximately 85% of the United States military active duty population is male

and less than 50 years of age, with elevated levels of known risk factors for oropharyngeal

squamous cell carcinoma (OPSCC), including smoking, excessive use of alcohol, greater

numbers of sexual partners, and elevated prevalence of human papilloma virus (HPV). Given the

recent rise in incidence of OPSCC related to HPV, the Department of Defense Serum

Repository provides an unparalleled resource for longitudinal studies of OPSCC in the military

for the identification of early detection biomarkers.

Methods: We identified 175 patients diagnosed with OPSCC with 175 matched healthy controls

and retrieved a total of 978 serum samples drawn at the time of diagnosis, 2 and 4 years prior to

diagnosis, and 2 years after diagnosis. Following immunoaffinity depletion, serum samples were

analyzed by targeted proteomics assays for multiplexed quantification of a panel of 146

candidate protein biomarkers from the curated literature.

Results: Using a Random Forest machine learning approach, we derived a 13-protein signature

that distinguishes cases versus controls based on longitudinal changes in serum protein

concentration. The abundance of each of the 13 proteins remained constant over time in control

subjects. The area under the curve for the derived Random Forest classifier was 0.90.

Conclusions: This 13-protein classifier is highly promising for detection of OPSCC prior to

overt symptoms.

Impact: Use of longitudinal samples has significant potential to identify biomarkers for

detection and risk stratification.





Introduction

Head and neck squamous cell carcinomas (HNSCC) can arise from any tissue of the

upper aerodigestive tract (e.g., mouth, nose, pharynx, larynx, sinuses or salivary glands)1. It

ranks among the top-5 most prevalent cancers worldwide with approximately 835,000 incident

cases and 431,000 deaths estimated in 20182. Traditionally, HNSCC is a disease associated with

tobacco and alcohol use3; however, infection of the oropharynx with human papilloma virus

(HPV) is rapidly emerging as a significant new risk factor for oropharyngeal-specific HNSCC

and results in a disease clinically different from traditional HNSCC3. HPV-associated

oropharyngeal squamous cell carcinoma (OPSCC) has become a significant health issue in the

US military population4.

Early-stage HNSCC, including OPSCC, generally responds well to therapy, but two

thirds of new OPSCC cases are first diagnosed at advanced stage III or IV with lymph node

metastases3, when the overall five-year-survival rate is only ~50%3,5. This statistic has not

improved significantly for decades (except for in HPV-associated disease), and recurrence is a

major concern for stage III/IV HNSCC associated with elevated mortality5. Therefore, there is an

urgent need for effective biomarkers for early detection, risk stratification, and therapeutic

prognosis of HNSCC.

Previous attempts to identify biomarkers for early detection of HNSCC in biofluids

(saliva, serum, and/or plasma) have produced hundreds of candidate protein biomarkers6-9, but

there are currently no FDA-approved biomarkers for early detection of HNSCC. Common issues

with previous studies include limited ability to analyze the number of clinical samples required

to avoid over-fitting, and the reliance on comparisons between cases and controls at a single

(diagnostic) timepoint, resulting in substantial overlap in the observed protein abundances





between cases and controls. The focus on comparison at the time of diagnosis may further limit

utility for early detection, as tumor burden is already significant at the time of diagnosis.

Longitudinal samples from the same individual over time allow each patient to serve as his/her

own control, enabling the detection of early changes within individual patient physiology as well

as alleviating across-patient heterogeneity and providing lower variation.

The Department of Defense Serum Repository (DODSR) was initiated in 1989 and

comprises serum samples from active and reserve military personnel drawn at enlistment,

biennially for routine HIV testing, and pre- and post-deployment throughout the service members’

participation in the Military Health System, accompanied by the service member’s electronic

health records. Previous studies have used the DODSR to investigate infectious diseases,

autoimmune diseases, multiple sclerosis, multiple myeloma and other cancers such as lymphoma

and breast and testicular cancers10. Thus, the DODSR represents a unique resource for

longitudinal studies of cancer risk, progression, and response to therapy, especially for head and

neck cancers.

OPSCC represents an ideal target for further study, both because of the growing affected

demographic, with potential effects on military readiness4, and because of the growing

understanding of cancer genomics related to HPV-associated and non-associated cancers.

Candidate biomarkers for OPSCC reported in the literature are generally based on small

opportunistic studies and have not been evaluated simultaneously in a large clinical cohort.

Reviews of the literature confirmed multiple proteins known to be mutated in head and neck

cancer, with a specific subset seen in HPV-positive oropharynx cancer, outlined in the 2015

article by Hayes et al11. A recent study suggested several targets that can be seen in early

oropharyngeal cancer12. Herein we employed an unbiased multiplexed mass spectrometry (MS)-based

targeted proteomics platform for precise identification of serum protein biomarkers,

selected from a rich resource of reported HNSCC biomarkers, and tested in a large cohort of

longitudinal DODSR serum specimens for early detection of HNSCC.

Materials and Methods

DODSR serum specimens

All DODSR serum samples used in this study were collected between 2003 and 2013, and

processed, aliquoted and stored using established protocols adopted by the DODSR13. We

identified 3,160 cases that fit the OPSCC primary site and diagnosis date requirements, and this

number was reduced to 175 subjects who met the Active Duty requirement. We analyzed at least

two and up to four serum samples from each case. The reference specimen, designated “Dx”, is the

routine DODSR serum sample drawn closest to the time of the initial OPSCC diagnosis for cases

(within 1 year following the diagnosis date). The DODSR was searched for the routine blood

draw prior to the OPSCC diagnosis (PreB), and the routine blood draw before that one (PreA).

On average, this represented 4±1 years prior to diagnosis (PreA) and 2±1 years prior to diagnosis

(PreB), but the interval is not exact, due to logistic issues such as deployment. When available,

the routine blood draw subsequent to diagnosis was also retrieved (PostD). Corresponding

samples were also obtained from 175 healthy controls, matched by 1) age at the time of

diagnosis, 2) gender, and 3) time of the blood draw (within 1 year of each case’s specimen). In

total there were 978 serum samples. The experimental design and overall distribution of serum

collection timepoints per case and control groups are shown in Figure 1 and Table 1; Table S1

describes the clinical information of the serum samples in more detail. The average patient age

was 45 (ranging from 21 to 64) with 171 males and 4 females. This distribution reflects both the





difference in incidence of OPSCC in males and females, and the distribution of males and

females in the military population. Diagnoses were limited to ICD-O codes corresponding to

oropharyngeal primary tumors, irrespective of HPV status. Of the 175 patients, HPV status was

confirmed in 35 cases; 18 were confirmed HPV-positive and 17 were confirmed HPV-negative

(Table S1).

Selection of candidate OPSCC biomarkers for targeted proteomics

Candidate targets were selected by searching PubMed and Web of Science for

publications containing the keywords ‘biomarkers, HNSCC, oral cancer, oropharyngeal cancer,

head and neck cancer’ and curating the results using a genescraper R package14. A total of 277

candidate biomarkers associated with either HNSCC specifically (131) or identified as general

cancer-associated proteins (146) were identified in the curated literature. Studies that focused on

serum, saliva, and tissue biomarkers in the diagnosis of HNSCC compared with controls were

considered. Thirty-six studies were identified, and 209 proteins were initially selected as OPSCC

candidate biomarkers (Table S2). Members of six signaling pathways (i.e., cell cycle

deregulation by HPV, TGF-β pathway, MAPK pathway, PI3K/AKT/mTOR pathway, NF-κB

pathway, and Wnt pathway) reported to be closely related to OPSCC were also included. These

candidate markers were further evaluated by three clinician co-authors with expertise in head and

neck cancer. The clinical experts also added protein targets directly related to HPV pathogenesis

in OPSCC. We next evaluated the detectability of these protein markers in human serum or

plasma based on a recent in-depth plasma proteome dataset15 and unpublished in-house plasma

proteome datasets, and found 166 proteins that were detected in either or both plasma proteome

datasets. Candidate biomarkers were further down-selected in a stepwise workflow that included





both expert assessment of clinical relevance and actual MS detectability of the proteins in a

mixture of the DODSR serum samples. The final panel consisted of 146 candidate protein

biomarkers and targeted proteomic assays were developed for analysis in the large cohort of

DODSR samples using liquid chromatography coupled to selective reaction monitoring (LC-SRM).

Detailed description of SRM assay development and characterization can be found in

Supplementary Materials. The final assay conditions for scheduled LC-SRM analysis of the

146 candidate protein biomarkers are provided in Table S3.

Preparation of serum samples

The serum samples were subjected to immunoaffinity depletion for removal of 14 high

abundance proteins, followed by protein digestion, sample cleanup, and the addition of heavy

isotope labeled internal peptide standards for targeted proteomics analysis (Figure 2 and

Supplementary Materials). Two types of unblinded quality control (QC) were implemented to

validate the stability and reproducibility of the entire workflow: 1) external QC with the use of

commercially available serum (purchased from Biochemed Services, Winchester, VA), and 2)

internal QC with spiking of two exogenous bovine and yeast proteins into each immunoaffinity

depleted serum sample including the external QC serum samples. Detailed descriptions of the

external and internal QC standards and procedures and the detailed process for sample

randomization and order of analysis are described in Supplementary Materials.

LC-SRM measurements and data analysis

LC-SRM analysis was performed on a Waters nanoACQUITY UPLC system interfaced

to a Thermo Scientific TSQ Altis triple quadrupole mass spectrometer for simultaneous





quantification of a total of 308 peptides from the 146 candidate proteins and 2 exogenous

internal standard proteins across 978 serum samples in the retention time scheduled SRM mode.

Detailed descriptions of the LC-SRM analysis can be found in Supplementary Materials.

SRM data were analyzed using Skyline software. Peak detection and integration were

determined16 based on (1) the same retention time and (2) approximately the same relative SRM peak

intensity ratios across multiple transitions between light (L) peptides and heavy (H) peptide

standards. The L/H peak area ratios were automatically calculated by Skyline software17 and

exported to a csv file. All data were manually inspected to ensure correct peak detection and

accurate integration. Signal to noise ratio (S/N) was calculated by the peak apex intensity over

the mean background noise in a retention time region of ±15 s for the target peptides. The

background noise levels were conservatively estimated by visually inspecting chromatographic

peak regions. If the S/N ratios of light peptides were <3 or there was significant interference, the

target was reported as “Not Analyzed” using the abbreviation “NA”. If the endogenous light

peptide could not be detected, resulting in an L/H ratio of “zero” due to the lack of endogenous

light peptide signal, the target was designated as “Not Detectable” (ND). Data labeled as NA and

ND were not used for the statistical analysis. There were no significant outliers in this analysis,

as determined by Principal Component Analysis; specifically, there were no outliers attributable

to a high number of missing values (Figure S1). All the annotated Skyline output files have been

uploaded to Panorama website and are available via https://panoramaweb.org/1MV2jg.url.
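
As an illustration of the S/N rule and the NA/ND labels described above, the following Python sketch labels a single light-peptide measurement. It is not the authors' procedure, in which background noise was estimated by visual inspection of the chromatograms in Skyline; the function name, its arguments, and the idea of passing the background intensities in as a list are assumptions made for this example.

```python
from statistics import mean


def label_measurement(apex_intensity, background, light_detected=True,
                      interference=False, sn_threshold=3.0):
    """Return 'ND', 'NA', or 'OK' for one light-peptide measurement.

    apex_intensity: peak apex intensity of the light peptide.
    background: background intensities near the peak (hypothetical input;
                the paper estimates noise in a +/-15 s retention time window).
    """
    if not light_detected:
        # No endogenous light signal: the L/H ratio would be zero.
        return "ND"
    noise = mean(background)
    # Guard against a zero noise estimate.
    sn = apex_intensity / noise if noise > 0 else float("inf")
    if sn < sn_threshold or interference:
        # Low S/N or significant interference: excluded from statistics.
        return "NA"
    return "OK"


if __name__ == "__main__":
    print(label_measurement(300.0, [90, 100, 110]))
    print(label_measurement(250.0, [90, 100, 110]))
    print(label_measurement(0.0, [90, 100, 110], light_detected=False))
```

Measurements labeled NA or ND would then be dropped before statistical analysis, as stated in the text.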

Data processing and statistical analysis

The L/H ratios were log2-transformed and zero-centered, followed by batch correction

using ComBat18, and peptide-to-protein rollup using RRollup19. Quality of protein quantification





was estimated based on variances across clinical DODSR samples and external QC serum

replicate samples (detailed information on QC serum samples can be found in Supplementary

Material). Total variance represents both technical noise, due to variability in measurement, and

biological noise, due to biological differences between samples (i.e., inter-individual variability).

The variance measured in the technical replicates of the QC samples provides a good estimate of

technical noise. Thus, the ratio of the experimental sample variance to the QC sample variance

provides a reasonable estimate of the S/N. By eliminating any proteins from further analysis with

a S/N <2, we ensured that most of the contribution to protein abundance differences from sample

to sample was due to biological differences rather than technical noise. This resulted in 86

proteins passing the threshold.
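
The processing and filtering steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: batch correction with ComBat and protein rollup with RRollup are omitted, the function names are hypothetical, and centering each series across samples and the use of the sample variance are assumptions.

```python
import math
from statistics import mean, variance


def log2_zero_center(ratios):
    """Log2-transform positive L/H ratios and center them at zero."""
    logs = [math.log2(r) for r in ratios]
    center = mean(logs)
    return [x - center for x in logs]


def variance_ratio(sample_values, qc_values):
    """Estimate S/N as variance across study samples over variance across
    technical replicates of the QC serum."""
    qc_var = variance(qc_values)
    if qc_var == 0:
        raise ValueError("QC variance is zero; the ratio is undefined")
    return variance(sample_values) / qc_var


def passes_filter(sample_values, qc_values, threshold=2.0):
    """Keep a protein unless its estimated S/N is below the threshold."""
    return variance_ratio(sample_values, qc_values) >= threshold


if __name__ == "__main__":
    print(log2_zero_center([1.0, 2.0, 4.0]))
    print(passes_filter([-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]))
```

A ratio of 2 means that, under this estimate, total variance in the study samples is at least twice the technical variance seen in the QC replicates.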

For statistical significance testing we used protein relative abundances at individual

timepoints (PreA, PreB, Dx and PostD). Additionally, we used the longitudinal changes between

the timepoints (PreA→PreB, PreB→Dx, and PreA→Dx). The PostD timepoint was not used for

longitudinal analysis as it reflects treatment effect, rather than early detection. The t-tests were

based on the limma R package20, which uses an empirical Bayes approach for better estimation of

protein variances. The tests were two-tailed and unpaired. To control for multiplicity of

hypothesis testing, the p-values (< 0.05 significance level) were adjusted using the Benjamini-Hochberg

approach21, which is likely to be conservative as it does not account for the correlation

structure of the data22.
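
For readers unfamiliar with the Benjamini-Hochberg adjustment, a minimal pure-Python sketch is given below. It shows only the p-value adjustment step; the moderated t-tests from limma are not reproduced, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # are monotone in the rank of the raw p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.20]
    adj = benjamini_hochberg(raw)
    significant = [a <= 0.05 for a in adj]
    print(adj, significant)
```

A protein is then called significant when its adjusted p-value is at or below the chosen level (0.05 in this study).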

A Random Forest approach was used to develop classifiers between case and control

groups. Features used for classification were based on protein relative abundances at each

individual timepoint (PreA, PreB, Dx and PostD) and the longitudinal changes observed

comparing the different timepoints: PreA→PreB, PreA→Dx, and PreB→Dx. Age at diagnosis





and sex of a subject were also considered as covariates. Prediction accuracy was estimated using

leave-one-out cross validation (LOOCV). During each round of LOOCV, we selected proteins

out of the entire set of 86 using the “boruta” algorithm (Boruta R package23) on N-1 samples.

Then we trained a Random Forest classifier model (randomForest R package24) using relative

abundances or log2 longitudinal change of selected proteins as features. For some analyses we

augmented protein data with age at diagnosis and sex as additional predictive features. The

trained Random Forest model then was used to compute the probability that the hold-out sample

belongs to the case class. Accuracy of the classifier was assessed using area under the curve

(AUC) (ROCR R package25) composed from LOOCV predictions. The proteins selected into the

models in >50% of LOOCV rounds were carried into the final suggested signature.
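
The cross-validation scheme above can be sketched as follows. The key point is that feature selection is repeated inside every round on the N-1 training samples, so the held-out sample never influences which proteins are chosen. The callables select_features and fit_predict are hypothetical placeholders for the Boruta selection and Random Forest training steps, which are not reimplemented here; AUC is computed from the pooled held-out scores with the rank-based (Mann-Whitney) identity rather than with ROCR.

```python
from collections import Counter


def loocv(X, y, select_features, fit_predict):
    """Leave-one-out cross-validation with per-round feature selection.

    X: list of dicts (feature name -> value), y: list of 0/1 labels.
    select_features(train_X, train_y) -> list of feature names.
    fit_predict(train_X, train_y, held_out, features) -> case score.
    """
    scores, selections = [], []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        features = select_features(train_X, train_y)  # N-1 samples only
        scores.append(fit_predict(train_X, train_y, X[i], features))
        selections.append(features)
    return scores, selections


def auc_from_scores(labels, scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties 0.5)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def consistent_features(selections, min_fraction=0.5):
    """Features selected in more than min_fraction of the rounds."""
    counts = Counter(f for sel in selections for f in set(sel))
    n_rounds = len(selections)
    return sorted(f for f, c in counts.items() if c / n_rounds > min_fraction)
```

With real data, fit_predict would wrap a trained classifier that returns the probability of the case class, and consistent_features would yield the proteins carried into the final signature.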

Results

Multiplex quantification of candidate markers in the DODSR serum samples

Robust targeted proteomics assays were established for multiplex quantification of 146

candidate protein biomarkers across 489 serum samples from 175 OPSCC cases and the 489

serum samples from 175 matched controls. Table S4 provides the quantitation results for all 146

candidates and the 2 exogenous QC proteins across the entire set of 978 DODSR serum samples

and the external QC serum samples. The dispersion of SRM signals for surrogate peptides

derived from the 146 protein markers indicated relatively stable measurements (CV: 25.41%)

across all the external QC samples while the same measurement in the DODSR samples (both

cases and controls) showed much higher variation (CV: 42.81%) (Figure S2A, top panels),

reflecting good reproducibility of the entire workflow. This was further supported by highly

stable SRM signals for surrogate peptides derived from two internal QC proteins across all the





OPSCC (CV: 14.27%) and external QC samples (CV: 16.09%) (Figure S2A, bottom panels);

note that the internal QC proteins were added after the immunoaffinity depletion step (Figure

2), hence having even lower variation compared to that in the target protein measurements. As

expected, there is no statistical difference for the two internal QC proteins between case and control

groups (Figure S2B).

Identification of proteomic classifiers for OPSCC

Relative protein abundance obtained from the L/H ratios of surrogate peptides across

different serum samples were first analyzed by t-test for the differentially abundant proteins

(Table S5). Between cases and controls, the numbers of statistically significant (adjusted p-values

≤0.05) proteins for static timepoints were 24 and 19 for the Dx and PostD timepoints,

respectively; none of the proteins in the PreA and PreB timepoints were statistically significant.

For the longitudinal abundance changes, the numbers of statistically significant proteins were 13

and 8 for PreB→Dx and PreA→Dx comparisons, respectively (Table S5). All of the proteins

consistently selected for Dx or PreB→Dx classification models were significant in t-test. We

were unable to detect any statistically significant differences between confirmed HPV-positive

and confirmed HPV-negative cases, in part due to the limited number of patients with confirmed

HPV status. Similarly, we were unable to detect any statistically significant differences

correlating with smoking status.

Attempts to use the slope of protein abundance changes computed using actual sample

draw dates to compare between case and control groups were complicated by the variability in

the timing between the actual clinical diagnosis and the timing of the Dx blood draw. Thus, we

decided to model PreB versus Dx as two discrete states.
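
Treating PreB and Dx as two discrete states, the longitudinal feature for one protein in one subject reduces to a difference of log2 relative abundances. A minimal sketch, assuming abundances are already log2-transformed and stored per timepoint in a dictionary (both the data layout and the function name are assumptions):

```python
def longitudinal_change(abundance, earlier="PreB", later="Dx"):
    """Log2 change between two timepoints for one subject and one protein.

    abundance: dict mapping timepoint name -> log2 relative abundance.
    Returns None when either timepoint is missing (for example, NA or ND).
    """
    before = abundance.get(earlier)
    after = abundance.get(later)
    if before is None or after is None:
        return None
    # A difference on the log2 scale corresponds to a fold change.
    return after - before


if __name__ == "__main__":
    subject = {"PreA": 0.2, "PreB": 1.0, "Dx": 2.5}
    print(longitudinal_change(subject))
    print(longitudinal_change(subject, "PreA", "Dx"))
```

Because the difference is taken within a subject, a stable between-subject offset in a protein's baseline level cancels out.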





A Random Forest machine learning approach was used to classify between cases and

controls based on relative abundances at each timepoint and longitudinal changes over

time. The AUCs for individual timepoints were 0.44 for PreA, 0.46 for PreB, 0.76 for Dx, and

0.73 for PostD (Figure 3A). For the longitudinal changes, the AUC values were 0.48 for

PreA→PreB, 0.80 for PreA→Dx, and 0.90 for PreB→Dx (Figure 3B). Since we were selecting

features independently for each round of LOOCV, we needed to evaluate how consistently

a given protein was selected into the Random Forest model (Figure 4). For the best performing

classification based on PreB→Dx change (AUC 0.90), 8 proteins (SPARC, SERPINE1,

SERPIND1, SELE, LRG1, HPX, GSN, CP) were selected into the model in 100% of rounds.

Two more proteins (CTSH and CKM) were selected in >97% of rounds. Three proteins (AHSG,

SAA4, CD44) had minor contributions and were selected in <20% of rounds (Figure 4). We

carried forward the ten proteins that were consistently selected into the classification models (passing the

50% threshold) for further signature validation. Of the models based on just one

timepoint, the Dx timepoint (closest to diagnosis) was the most predictive (AUC 0.76). Proteins

consistently (>50% LOOCV iterations) selected into the classification models based on Dx

abundance (SPARC, PKM, LRG1, GSN, CKM, HPX, MDH1, KNG1, SERPINE1) had

substantial overlap with consistent proteins from PreB→Dx models (Figure 4). Six proteins

(SPARC, SERPINE1, LRG1, GSN, CKM, HPX) are common between PreB→Dx and Dx

models; four proteins (CP, CTSH, SELE, SERPIND1) are unique to PreB→Dx models; three

proteins (KNG1, MDH1, PKM) are unique to Dx models. In total we found 13 proteins that are

consistently contributing to classification models based on longitudinal or cross-sectional data.

All of the proteins consistently selected for Dx and PreB→Dx classification models were

significant in t-test (see Table S5 and Figure 4). Out of the common six proteins, four (SPARC,





SERPINE1, GSN, CKM) are lower in cases compared to controls at the Dx timepoint, and two (LRG1

and HPX) are higher. The directionality of abundance change between PreB and Dx timepoints

is also the same. The correlation structure of protein abundances at the Dx timepoint and PreB→Dx

changes is shown in Figure S3. The overlap between proteins significant in t-test and

selected for Random Forest classifier is substantial, but not 100%.
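
The overlap counts in this section can be checked with simple set arithmetic. The sketch below uses only the protein lists quoted in the text (the ten proteins consistently selected for the longitudinal PreB-to-Dx models and the nine listed for the single-timepoint model); the variable names are illustrative.

```python
# Proteins selected in 100% of rounds, plus CTSH and CKM (>97% of rounds).
longitudinal = {
    "SPARC", "SERPINE1", "SERPIND1", "SELE", "LRG1",
    "HPX", "GSN", "CP", "CTSH", "CKM",
}

# Proteins consistently selected for the single-timepoint model.
single_timepoint = {
    "SPARC", "PKM", "LRG1", "GSN", "CKM",
    "HPX", "MDH1", "KNG1", "SERPINE1",
}

common = longitudinal & single_timepoint
only_longitudinal = longitudinal - single_timepoint
only_single = single_timepoint - longitudinal
signature = longitudinal | single_timepoint

print(sorted(common))
print(sorted(only_longitudinal))
print(sorted(only_single))
print(len(signature))
```

The counts (six shared, four and three unique, thirteen in total) match the figures given in the text.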

Table S6 presents the analytical performance of the SRM assays including the lower

limit of quantitation and measurement reproducibility for the 10 proteins consistently detected in

over 97% of OPSCC samples from the longitudinal change PreB→Dx. The overall achievable

reproducibility of the measurements over time for these 10 proteins was mostly <20% in CV.

The details of these experiments are provided in Supplementary Materials.

We evaluated the stability of abundance of the 13 proteins which are consistently

contributing to classification models based on longitudinal or cross-sectional data in the 36

control samples that have all four timepoints available. To test for changes over time we applied

regression analysis of protein abundance vs. actual date of blood draw with the subject being

modeled as a fixed effect. None of the proteins has a statistically significant association of

abundance with time in the control group (Table S7). Additionally, we computed coefficient of

variation as a measure of temporal stability. In the control group the coefficients of variation

over time for the 13 proteins were in the range 6-36% (Figure S4).
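
A minimal sketch of the coefficient of variation used as a stability measure is given below. Whether the authors computed it on linear or log-scale abundances is not stated here, so the example simply assumes positive, linear-scale values and the sample standard deviation; the function name is hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    if len(values) < 2:
        raise ValueError("at least two values are required")
    center = mean(values)
    if center == 0:
        raise ValueError("CV is undefined for a zero mean")
    return stdev(values) / center * 100.0


if __name__ == "__main__":
    # Hypothetical abundances of one protein in one control subject
    # at the PreA, PreB, Dx and PostD draws.
    print(coefficient_of_variation([95.0, 100.0, 105.0, 100.0]))
```

Applied per protein to each control subject with all four timepoints, this gives the kind of temporal stability summary reported in Figure S4.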

Considerations for patient stratification

Besides protein abundances, we considered age at diagnosis, sex and HPV status as

covariates in the analysis of stratification factors. Only a minority of case subjects had a

confirmed HPV diagnosis. Specifically, 17 cases were HPV-negative, 18 HPV-positive; 140 had





no diagnosis. Thus, most likely due to low power, no statistical difference was detected in

protein abundances between HPV-positive and HPV-negative cases. Given the male:female sex

ratio of the military population, the distribution of sex as a confounding factor was very skewed.

Out of 175 case/control pairs, only 4 pairs are female. Thus, including sex as a confounding

factor into the linear model or Random Forest classifier training had no effect on the results. Age

at diagnosis had no statistically significant effect on the results of differential abundance testing

and was not selected as a predictive feature by the Boruta algorithm in the Random Forest machine

learning approach.

Discussion

Many HNSCC protein biomarkers have been reported in tissues, saliva, and

serum/plasma6-9, but none of them has been translated into clinical practice. A contributing factor

in this failure to validate and convert candidates into robust, FDA-approved biomarkers is the

relatively small set of clinical samples analyzed. MS-based targeted proteomics is highly

promising to bridge the gap between discovery and validation phases because it allows

quantification of hundreds of proteins simultaneously with high specificity and reproducibility,

without the need to generate affinity reagents. In this study, a stepwise approach was used to

transition from the curated literature to the finalized list based on detectability using LC-SRM

assays in real samples; the final 146 protein markers were measured by LC-SRM across 978

serum samples from OPSCC cases and matched controls. The robustness of the entire LC-SRM

workflow was monitored using two types of QCs to confirm precise quantification of target

proteins in the large cohort of DODSR serum samples. All these results demonstrated that the

entire LC-SRM workflow is robust for reliable quantification of the 146 protein biomarkers





across the 978 serum samples.

An important distinction of this study, compared to most biomarker discovery studies, is

the emphasis on longitudinal comparisons across timepoints, as compared to a focus on

differential protein expression between cases and controls at one single timepoint. Our

longitudinal study using the DODSR samples allowed us to identify biomarkers that differed

within the same individual across multiple timepoints. This approach has significant advantages

over conventional biomarker studies in terms of precise identification of biomarkers for early

detection of OPSCC and effectively addressing issues with human heterogeneity. Statistical

analysis of SRM data from the 978 DODSR serum samples supports this conclusion: classifiers

comparing cases and controls at a single timepoint gave AUC values of 0.76 and 0.73 for the

Dx and PostD timepoints, respectively, compared to an AUC of 0.90 for the classifier based on

longitudinal SRM data comparing PreB→Dx within the same individual. For the PreA→Dx

comparison, the AUC value is 0.80, suggesting that changes may be discernible as early as 4

years prior to diagnosis, but may not be sufficiently robust for clinical utility. An important

observation in this study is that abundance of these proteins in the classifier was extremely stable

over time in the control samples, indicating that a strategy of comparing biomarker abundance

measurements to the patient’s own serum protein profile over time may have significant clinical

utility. The observation of statistically significant changes across the longitudinal OPSCC serum

samples from the same individual appears to be a reliable indicator of true abundance changes,

which further validated our biomarker discovery.

Among the best classifiers for PreA→Dx and PreB→Dx, there are seven overlapping

proteins: hemopexin (HPX), ferritin light chain (FTL), leucine-rich alpha-2-glycoprotein

(LRG1), plasminogen activator inhibitor 1 (SERPINE1), creatine kinase M-type (CKM), gelsolin





(GSN), and ceruloplasmin (CP). This suggests that they are more robust than the other protein

markers and highly promising for early detection of OPSCC. For example, increased expression

levels of FTL and ferritin heavy chain (FTH1) have been observed in HNSCC tumor and

HNSCC tumor with lymph node metastasis tissue samples using the chemiluminescent

immunoassay method26. Leucine-rich alpha-2-glycoprotein was downregulated in HNSCC

tissues regardless of the grade of the tumor27, and plasminogen activator inhibitor 1 was detected

in the secretome of head and neck/oral squamous cell carcinoma cell lines28. The level of

creatine kinases such as creatine kinase M-type is a useful criterion for several diseases related to

HNSCC29,30. Gelsolin was one of five proteins on a diagnostic panel derived from comparing 61

oral squamous cell carcinoma (OSCC) and 58 control saliva samples31. Hemopexin was

demonstrated to be differentially expressed between lymph-node metastatic and non-metastatic

OSCC serum samples31. Ceruloplasmin was up-regulated in plasma samples of hypopharyngeal

squamous cell carcinoma32.

Another important aspect of this study is the power of the combination of multiplexed

targeted proteomics and high-quality longitudinal clinical specimens for unbiased precise

prioritization of protein biomarkers starting with a large biomarker list curated from the

literature. The multiplexed biomarker discovery workflow can be easily implemented with

simultaneous quantification of ~500 candidate proteins for any type of cancer by full utilization

of the existing biomarker resource without the need to conduct additional discovery studies.

Furthermore, targeted proteomics is more effective than global proteomics for deep biomarker

discovery in serum due to the low concentration of potential biomarkers and the wide dynamic

concentration range in human serum or plasma. However, targeted proteomic studies may miss

other important OPSCC biomarkers because they rely on an existing list of protein





biomarkers from the literature rather than an unbiased, in-depth global proteomics analysis with

extensive sample fractionation.

One important consideration in the generalization of these results is a comparison of the

characteristics of our study population to the general population. To the best of our knowledge,

there have been no published studies directly comparing the incidence rate or prevalence of

OPSCC in the general US population and the US military populations. However, it is relevant to

point out that the US military population has a higher incidence of high-risk HPV serotypes,

specifically HPV-16, than the general population, and also a higher prevalence of other risk

factors including smoking and alcohol use (4, 33). The study reported here is a retrospective study of

all available OPSCC cases in the DOD serum repository, and thus reflects the incidence of

OPSCC in the military population, both male and female. However, females appear to be

significantly under-represented in this study, compared to the expected incidence in females in

the general population (approximately one-quarter that in males) (34). This appears to reflect the

under-representation of females in the military population (15%) (35), rather than any differences in

incidence within female members of the military. Thus, it is likely that this study will be most

applicable to screening of high-risk males in the general population. It is worth noting that we

saw no statistically significant changes when we eliminated females from the analysis.
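
The under-representation argument above can be checked with simple arithmetic. The sketch below is illustrative only: it takes the two figures quoted in the text (females are 15% of the military population, and female incidence is approximately one-quarter of male incidence) and assumes, hypothetically, that the same incidence ratio holds in the military and that the general population is 50% female. The function name is ours, not part of the study.

```python
def expected_female_share(female_fraction, female_to_male_rate_ratio):
    """Expected share of cases that are female, given the female fraction of
    the population and the female-to-male incidence rate ratio."""
    female_cases = female_fraction * female_to_male_rate_ratio
    male_cases = (1.0 - female_fraction) * 1.0  # male rate used as the reference
    return female_cases / (female_cases + male_cases)


# 15% female and a one-quarter rate ratio are the figures quoted in the text.
military = expected_female_share(0.15, 0.25)
# A 50% female general population is an assumption made for illustration.
general = expected_female_share(0.50, 0.25)

print(f"expected female share of cases, military: {military:.1%}")
print(f"expected female share of cases, general:  {general:.1%}")
```

Under these assumptions, roughly 4% of military cases would be expected to be female, against 20% in the general population. A low female count in the cohort is therefore what the population structure alone would predict.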

In conclusion, access to high-quality serum sample cohorts from the DODSR enabled

targeted proteomics discovery of promising biomarkers for early detection of OPSCC based on

longitudinal data from the same patient. Random Forest analysis of abundance data for 146

candidate proteins simultaneously measured by robust SRM assays across 978 DODSR serum

samples indicated that comparisons across time in the same patient are superior to single

timepoint comparisons across patients. While direct comparison of protein abundances at the Dx




timepoint for cases and controls showed statistically significant differences in candidate

biomarker abundance with AUC >0.70, analysis of longitudinal samples provided substantially

improved AUC values of 0.90 for the PreBDx classifier and 0.80 for the PreADx classifier.

This improvement reflects the observation that OPSCC patients displayed statistically significant changes in

protein abundance as early as two years prior to diagnosis, effectively captured in the 13-protein

classifier, but the abundance of these proteins was constant over time in the matched controls.
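
The advantage of within-patient comparisons can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the Random Forest classifiers used in the study. It models a single hypothetical protein whose baseline level varies widely between subjects but is stable within a subject, with a modest shift in cases before diagnosis. All numerical values (baseline spread, shift, measurement noise, group size) are assumptions chosen for illustration.

```python
import random


def auc(case_scores, control_scores):
    """Probability that a random case scores above a random control
    (ties count as one half); equivalent to the area under the ROC curve."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def simulate(n_per_group=200, seed=0):
    """Return (single-timepoint AUC, within-subject change AUC) for one
    simulated protein. All parameters are hypothetical."""
    rng = random.Random(seed)

    def subject(is_case):
        baseline = rng.gauss(10.0, 2.0)       # large between-subject spread
        shift = 1.0 if is_case else 0.0       # change in cases only
        early = baseline + rng.gauss(0.0, 0.3)          # earlier draw
        late = baseline + shift + rng.gauss(0.0, 0.3)   # later draw
        return early, late

    cases = [subject(True) for _ in range(n_per_group)]
    controls = [subject(False) for _ in range(n_per_group)]

    # Single timepoint: compare the later measurement across subjects.
    auc_single = auc([late for _, late in cases],
                     [late for _, late in controls])
    # Longitudinal: compare each subject's own change between draws.
    auc_delta = auc([late - early for early, late in cases],
                    [late - early for early, late in controls])
    return auc_single, auc_delta


auc_single, auc_delta = simulate()
print(f"single-timepoint AUC:      {auc_single:.2f}")
print(f"within-subject change AUC: {auc_delta:.2f}")
```

Subtracting each subject's earlier measurement removes the between-subject baseline spread, so the same shift separates cases from controls much more cleanly. This is the intuition behind using longitudinal samples from the same patient.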

While the use of the longitudinal DODSR serum samples enabled the identification of

these highly promising markers for early detection of OPSCC, there may be some limitations on

the generalizability of this study. First, the military population is both younger and more

predominantly male than the general population at risk of OPSCC. Conversely, the military

population has an increased incidence of known risk factors for OPSCC, including increased

tobacco and alcohol use and an increased incidence of HPV infection (33). p16 has

become a universal surrogate marker for HPV-driven OPSCC; however, because the serum

samples date back to 2003, the majority of the cohort was not tested for HPV.

The 34 reported cases were tested with in-situ hybridization for low- and high-risk strains

of HPV; nevertheless, based on demographics alone this cohort would be considered to have a

high prevalence of HPV-driven OPSCC, perhaps with intermediate risk factors due to

concomitant tobacco use (36). Absence of HPV-specific markers such as E6 and E7 has been

documented elsewhere and postulated to be a marker of more advanced disease (12).

Another possibility is that the presence of antibodies to E6 and E7 viral proteins may not correlate

with the actual proteins, if they are expressed transiently and subsequently degraded. Specific

data on smoking and alcohol use within this cohort are also incomplete, so there is insufficient

power to analyze these confounding variables separately. Variations in the time interval between




clinical diagnosis and obtaining the Dx serum sample may have introduced variability related to

treatment effects, but this variation appears to be captured in the 95% confidence interval.

Validation in a large independent clinical cohort is needed to determine the general utility of these markers in

clinical practice, and thus justify the use of longitudinal serum analyses as a screening strategy

for the general population.




Disclosure of Potential Conflicts of Interest

There are no potential conflicts of interest to disclose.

Authors' Contributions

Conception and design: J. Goodman, C.D. Shriver, T. Liu, K.D. Rodland.

Development of methodology: J.Y. Lee, V.A. Petyuk.

Acquisition of data: J.Y. Lee, T. Shi, A.A. Schepmoes, T.L. Fillmore.

Analysis and interpretation of data: J.Y. Lee, T. Shi, V.A. Petyuk, Y.-T. Wang, W. Cardoni, G.

Coppit, J. Goodman, C.D. Shriver, T. Liu, K.D. Rodland.

Writing, review, and/or revision of the manuscript: J.Y. Lee, T. Shi, V.A. Petyuk, J. Goodman,

C.D. Shriver, T. Liu, K.D. Rodland.

Administrative, technical, or material support: T.L. Fillmore, S. Srivastava.

Study supervision: C.D. Shriver, T. Liu, K.D. Rodland.

Acknowledgments

The authors would like to thank the clinical and laboratory staff at the Uniformed Services

University of the Health Sciences and Pacific Northwest National Laboratory (PNNL). Portions

of the research were performed in the Environmental Molecular Sciences Laboratory

(grid.436923.9), a U.S. Department of Energy (DOE) Office of Biological and Environmental

Research national scientific user facility on the PNNL campus. PNNL is a multiprogram national

laboratory operated by Battelle for the DOE under contract DE-AC05-76RL01830. The contents

of this publication are the sole responsibility of the author(s) and do not necessarily reflect the

views, opinions or policies of Uniformed Services University of the Health Sciences, The Henry




M. Jackson Foundation for the Advancement of Military Medicine, Inc., the Department of

Defense or the Departments of the Army, Navy, or Air Force. Mention of trade names,

commercial products, or organizations does not imply endorsement by the U.S. Government.

Grant Support

This work was supported by Federal Award No. HU0001-16-2-0014 (Subaward No. 3879 to

K.D. Rodland and T. Liu).




References

1. Marcu LG, Yeoh E. A review of risk factors and genetic alterations in head and neck

carcinogenesis and implications for current and future approaches to treatment. J Cancer Res

Clin Oncol. 2009;135: 1303-1314.

2. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics

2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185

countries. CA Cancer J Clin. 2018;68: 394-424.

3. Leemans CR, Braakhuis BJ, Brakenhoff RH. The molecular biology of head and neck cancer.

Nat Rev Cancer. 2011;11: 9-22.

4. Feinstein AJ, Shay SG, Chang E, Lewis MS, Wang MB. Treatment outcomes in veterans with

HPV-positive head and neck cancer. Am J Otolaryngol. 2017;38: 188-192.

5. Schaaij-Visser TB, Brakenhoff RH, Leemans CR, Heck AJ, Slijper M. Protein biomarker

discovery for head and neck cancer. J Proteomics. 2010;73: 1790-1803.

6. Li SX, Yang YQ, Jin LJ, Cai ZG, Sun Z. Detection of survivin, carcinoembryonic antigen and

ErbB2 level in oral squamous cell carcinoma patients. Cancer Biomark. 2016;17: 377-382.

7. Allegra E, Trapasso S, La Boria A, et al. Prognostic role of salivary CD44sol levels in the

follow-up of laryngeal carcinomas. J Oral Pathol Med. 2014;43: 276-281.

8. Pereira LH, Reis IM, Reategui EP, et al. Risk Stratification System for Oral Cancer Screening.

Cancer Prev Res (Phila). 2016;9: 445-455.

9. Hsiao YC, Chi LM, Chien KY, et al. Development of a Multiplexed Assay for Oral Cancer

Candidate Biomarkers Using Peptide Immunoaffinity Enrichment and Targeted Mass

Spectrometry. Mol Cell Proteomics. 2017;16: 1829-1849.

10. Perdue CL, Cost AA, Rubertone MV, Lindler LE, Ludwig SL. Description and utilization of

the United States department of defense serum repository: a review of published studies,

1985-2012. PLoS One. 2015;10: e0114857.

11. Hayes DN, Van Waes C, Seiwert TY. Genetic Landscape of Human Papillomavirus-

Associated Head and Neck Cancer and Comparison to Tobacco-Related Tumors. J Clin

Oncol. 2015;33: 3227-3234.

12. Tuhkuri A, Saraswat M, Makitie A, et al. Patients with early-stage oropharyngeal cancer can

be identified with label-free serum proteomics. Br J Cancer. 2018;119: 200-212.




13. Perdue CL, Eick-Cost AA, Rubertone MV. A Brief Description of the Operation of the DoD

Serum Repository. Mil Med. 2015;180: 10-12.

14. https://github.com/evanamartin/genescraper.

15. Keshishian H, Burgess MW, Specht H, et al. Quantitative, multiplexed workflow for deep

analysis of human blood plasma and biomarker discovery by mass spectrometry. Nat Protoc.

2017;12: 1683-1701.

16. Song E, Gao Y, Wu C, et al. Targeted proteomic assays for quantitation of proteins identified

by proteogenomic analysis of ovarian cancer. Sci Data. 2017;4: 170091.

17. MacLean B, Tomazela DM, Shulman N, et al. Skyline: an open source document editor for

creating and analyzing targeted proteomics experiments. Bioinformatics. 2010;26: 966-968.

18. Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using

empirical Bayes methods. Biostatistics. 2007;8: 118-127.

19. Polpitiya AD, Qian WJ, Jaitly N, et al. DAnTE: a statistical tool for quantitative analysis of

-omics data. Bioinformatics. 2008;24: 1556-1558.

20. Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for

RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43: e47.

21. Benjamini Y, Hochberg Y. Controlling the false discovery rate - a practical and powerful

approach to multiple testing. J. Royal Statist. Soc., Series B. 1995;57: 289-300.

22. Stevens JR, Al Masud A, Suyundikov A. A comparison of multiple testing adjustment

methods with block-correlation positively-dependent tests. PLoS One. 2017;12: e0176124.

23. Kursa MB, Rudnicki WR. Feature Selection with the Boruta Package. J Stat Softw. 2010;36: 1-13.

24. https://cran.r-project.org/web/packages/randomForest/index.html.

25. https://cran.r-project.org/web/packages/ROCR/index.html.

26. Hu Z, Wang L, Han Y, et al. Ferritin: A potential serum marker for lymph node metastasis in

head and neck squamous cell carcinoma. Oncol Lett. 2019;17: 314-322.

27. Wang Y, Chen C, Hua Q, et al. Downregulation of leucine-rich-alpha-2-glycoprotein 1

expression is associated with the tumorigenesis of head and neck squamous cell carcinoma.

Oncol Rep. 2017;37: 1503-1510.

28. Ralhan R, Masui O, Desouza LV, Matta A, Macha M, Siu KW. Identification of proteins

secreted by head and neck cancer cell lines using LC-MS/MS: Strategy for discovery of

candidate serological biomarkers. Proteomics. 2011;11: 2363-2376.




29. Yamauchi K, Kogashiwa Y, Nagafuji H, Kohno N. Head and neck cancer with

dermatomyositis: a report of two clinical cases. Int J Otolaryngol. 2010;2010: 401825.

30. Cinamon U. Exceptionally elevated creatine kinase levels in a laryngectomized patient:

hypothyroid myopathy. J Laryngol Otol. 2004;118: 651-652.

31. Chen YT, Chen HW, Wu CF, et al. Development of a Multiplexed Liquid Chromatography

Multiple-Reaction-Monitoring Mass Spectrometry (LC-MRM/MS) Method for Evaluation of

Salivary Proteins as Oral Cancer Biomarkers. Mol Cell Proteomics. 2017;16: 799-811.

32. Tian WD, Li JZ, Hu SW, et al. Proteomic identification of alpha-2-HS-glycoprotein as a

plasma biomarker of hypopharyngeal squamous cell carcinoma. Int J Clin Exp Pathol.

2015;8: 9021-9031.

33. Agan BK, Macalino GE, Nsouli-Maktabi H, et al. Human papillomavirus seroprevalence

among men entering military service and seroincidence after ten years of service. MSMR.

2013;20: 21-24.

34. https://www.cdc.gov/cancer/hpv/statistics/cases.htm.

35. https://www.cfr.org/article/demographics-us-military.

36. Gillison ML, D'Souza G, Westra W, et al. Distinct risk factor profiles for human

papillomavirus type 16-positive and human papillomavirus type 16-negative head and neck

cancers. J Natl Cancer Inst. 2008;100: 407-420.




Table 1. Summary of the overall distribution of serum collection timepoints per case and control

groups.

Samples                  PreA          PreB          Dx               PostD
Target time              -4 years      -2 years      within 1 year    +2 years
Window                   (±1 year)     (±1 year)     (+1 year)        (±1 year)

Cases
  # of samples           158           157           77               97
  Years, min             -3.0          -1.1          0                1.1
  Years, avg ± S.D.      -4.0 ± 0.5    -2.0 ± 0.4    0.4 ± 0.3        1.9 ± 0.5
  Years, max             -4.9          -3.0          1.0              2.9

Controls (±1 year)
  # of samples           158           157           77               97
  Years, min             -2.9          -0.4          -0.9             0.4
  Years, avg ± S.D.      -4.1 ± 0.5    -2.1 ± 0.6    0.4 ± 0.6        1.9 ± 0.7
  Years, max             -5.7          -3.5          1.8              3.2




Figure Legends:

Figure 1. The experimental design of the OPSCC study using serum samples from the DODSR.

Histograms representing number of subjects (Y axis) over time of sample collection (X axis).

Cases were binned into four timepoints for the comparative analysis. Controls were matched to

cases based on age at the time of diagnosis and the availability of serum samples matched on

time of blood draw. Each bin represents a 2-month period.

Figure 2. Experimental workflow for identification of serum protein biomarkers for early

detection of OPSCC using multiplexed targeted proteomics. Both internal and external QCs were

used throughout the LC-SRM analysis to ensure robust, high-quality proteomics analysis.

Figure 3. The receiver operating characteristic (ROC) curves of the Random Forest classifiers for the single

timepoints (A) and for longitudinal changes between different timepoints (B). Both the AUC and

the ROC curve are provided with 95% confidence intervals for each comparison between the

case and control groups.

Figure 4. The frequency with which the differential proteins were selected into the Random

Forest classifiers for the single timepoints (top row) and for longitudinal changes between

different timepoints (bottom row).
















Published OnlineFirst June 12, 2020. Cancer Epidemiol Biomarkers Prev. Ju Yeon Lee, Tujin Shi, Vladislav A. Petyuk, et al. Detection of head and neck cancer based on longitudinal changes in serum protein abundance.

Updated version: Access the most recent version of this article at doi: 10.1158/1055-9965.EPI-20-0192

Supplementary Material: Access the most recent supplemental material at http://cebp.aacrjournals.org/content/suppl/2020/06/12/1055-9965.EPI-20-0192.DC1

Author Manuscript: Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.
