Detection of head and neck cancer based on longitudinal changes in serum protein
abundance
Ju Yeon Lee1,2, Tujin Shi1, Vladislav A. Petyuk1, Athena A. Schepmoes1, Thomas L. Fillmore1,
Yi-Ting Wang1, Wayne Cardoni3, George Coppit3, Shiv Srivastava4,5, Joseph F. Goodman6,
Craig D. Shriver4,5, Tao Liu1,#, Karin D. Rodland1,7,#
1Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington
2Research Center for Bioconvergence Analysis, Korea Basic Science Institute, Cheongju, Republic of Korea
3Frederick Regional Health System, Frederick, Maryland
4Murtha Cancer Center Research Program, Department of Surgery, Uniformed Services University of the Health Sciences, Bethesda, Maryland
5John P. Murtha Cancer Center, Uniformed Services University and Walter Reed National Military Medical Center, Bethesda, Maryland
6Division of Otolaryngology, George Washington University, Washington, District of Columbia
7Department of Cell Developmental and Cancer Biology, Oregon Health and Science University, Portland, Oregon
Running title: Longitudinal serum biomarkers for early detection of OPSCC
Abbreviation list:
HNSCC: head and neck squamous cell carcinoma
OPSCC: oropharyngeal squamous cell carcinoma
HPV: human papilloma virus
DODSR: Department of Defense Serum Repository
LC: liquid chromatography
SRM: selective reaction monitoring
MS: mass spectrometry
#Corresponding authors:
Tao Liu, Pacific Northwest National Laboratory, 3335 Innovation Avenue, P.O. Box 999, MSIN
K8-98, Richland, WA 99354. Phone: 509-371-6346; E-mail: [email protected]
Karin D. Rodland, Pacific Northwest National Laboratory, 3335 Innovation Avenue, P.O. Box
999, MSIN K8-98, Richland, WA 99354. Phone: 509-430-4668. E-mail:
The authors declare no potential conflicts of interest.
Downloaded from cebp.aacrjournals.org on November 25, 2020. © 2020 American Association for Cancer Research.
Author Manuscript Published OnlineFirst on June 12, 2020; DOI: 10.1158/1055-9965.EPI-20-0192. Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.
Abstract
Background: Approximately 85% of the United States military active duty population is male
and less than 50 years of age, with elevated levels of known risk factors for oropharyngeal
squamous cell carcinoma (OPSCC), including smoking, excessive use of alcohol, greater
numbers of sexual partners, and elevated prevalence of human papilloma virus (HPV). Given the
recent rise in incidence of HPV-related OPSCC, the Department of Defense Serum
Repository provides an unparalleled resource for longitudinal studies of OPSCC in the military
for the identification of early detection biomarkers.
Methods: We identified 175 patients diagnosed with OPSCC with 175 matched healthy controls
and retrieved a total of 978 serum samples drawn at the time of diagnosis, 2 and 4 years prior to
diagnosis, and 2 years after diagnosis. Following immunoaffinity depletion, serum samples were
analyzed by targeted proteomics assays for multiplexed quantification of a panel of 146
candidate protein biomarkers from the curated literature.
Results: Using a Random Forest machine learning approach, we derived a 13-protein signature
that distinguishes cases versus controls based on longitudinal changes in serum protein
concentration. The abundances of each of the 13 proteins remain constant over time in control
subjects. The area under the curve for the derived Random Forest classifier was 0.90.
Conclusions: This 13-protein classifier is highly promising for detection of OPSCC prior to
overt symptoms.
Impact: Use of longitudinal samples has significant potential to identify biomarkers for
detection and risk stratification.
Introduction
Head and neck squamous cell carcinomas (HNSCC) can arise from any tissue of the
upper aerodigestive tract (e.g., mouth, nose, pharynx, larynx, sinuses or salivary glands) [1]. HNSCC
ranks among the five most prevalent cancers worldwide, with approximately 835,000 incident
cases and 431,000 deaths estimated in 2018 [2]. Traditionally, HNSCC is a disease associated with
tobacco and alcohol use [3]; however, infection of the oropharynx with human papilloma virus
(HPV) is rapidly emerging as a significant new risk factor for oropharyngeal-specific HNSCC
and results in a disease clinically different from traditional HNSCC [3]. HPV-associated
oropharyngeal squamous cell carcinoma (OPSCC) has become a significant health issue in the
US military population [4].
Early-stage HNSCC, including OPSCC, generally responds well to therapy, but two-thirds
of new OPSCC cases are first diagnosed at advanced stage III or IV with lymph node
metastases [3], when the overall five-year survival rate is only ~50% [3, 5]. This statistic has not
improved significantly for decades (except in HPV-associated disease), and recurrence is a
major concern for stage III/IV HNSCC associated with elevated mortality [5]. Therefore, there is an
urgent need for effective biomarkers for early detection, risk stratification, and therapeutic
prognosis of HNSCC.
Previous attempts to identify biomarkers for early detection of HNSCC in biofluids
(saliva, serum, and/or plasma) have produced hundreds of candidate protein biomarkers [6-9], but
there are currently no FDA-approved biomarkers for early detection of HNSCC. Common issues
with previous studies include limited ability to analyze the number of clinical samples required
to avoid over-fitting, and the reliance on comparisons between cases and controls at a single
(diagnostic) timepoint, resulting in substantial overlap in the observed protein abundances
between cases and controls. The focus on comparison at the time of diagnosis may further limit
utility for early detection, as tumor burden is already significant at the time of diagnosis.
Longitudinal samples from the same individual over time allow each patient to serve as his/her
own control, enabling the detection of early changes within individual patient physiology as well
as alleviating across-patient heterogeneity and providing lower variation.
The Department of Defense Serum Repository (DODSR) was initiated in 1989 and
comprises serum samples from active and reserve military personnel drawn at enlistment,
biennially for routine HIV testing, and pre- and post-deployment throughout the service members’
participation in the Military Health System, accompanied by the service members’ electronic
health records. Previous studies have used the DODSR to investigate infectious diseases,
autoimmune diseases, multiple sclerosis, multiple myeloma and other cancers such as lymphoma
and breast and testicular cancers [10]. Thus, the DODSR represents a unique resource for
longitudinal studies of cancer risk, progression, and response to therapy, especially for head and
neck cancers.
OPSCC represents an ideal target for further study, both because of the growing
demographic with potential effects on military readiness [4] and because of the growing
understanding of cancer genomics related to HPV-associated and non-associated cancers.
Candidate biomarkers for OPSCC reported in the literature are generally based on small
opportunistic studies and have not been evaluated simultaneously in a large clinical cohort.
Reviews of the literature confirmed multiple proteins known to be mutated in head and neck
cancer, with a specific subset seen in HPV-positive oropharynx cancer, outlined in the 2015
article by Hayes et al. [11]. A recent study suggested several targets that can be seen in early
oropharyngeal cancer [12]. Herein we employed an unbiased multiplexed mass spectrometry (MS)-based
targeted proteomics platform for precise identification of serum protein biomarkers,
selected from a rich resource of reported HNSCC biomarkers, and tested in a large cohort of
longitudinal DODSR serum specimens for early detection of HNSCC.
Materials and Methods
DODSR serum specimens
All DODSR serum samples used in this study were collected between 2003 and 2013, and
processed, aliquoted and stored using established protocols adopted by the DODSR [13]. We
identified 3,160 cases that fit the OPSCC primary site and diagnosis date requirements, and this
number was reduced to 175 subjects who met the Active Duty requirement. We analyzed at least
two and up to four serum samples from each case. The reference specimen, designated “Dx”, is the
routine DODSR serum sample drawn closest to the time of the initial OPSCC diagnosis for cases
(within 1 year following the diagnosis date). The DODSR was searched for the routine blood
draw prior to the OPSCC diagnosis (PreB), and the routine blood draw before that one (PreA).
On average, this represented 4 ± 1 years prior to diagnosis (PreA) and 2 ± 1 years prior to diagnosis
(PreB), but the interval is not exact, due to logistic issues such as deployment. When available,
the routine blood draw subsequent to diagnosis was also retrieved (PostD). Corresponding
samples were also obtained from 175 healthy controls, matched by 1) age at the time of
diagnosis, 2) gender, and 3) time of the blood draw (within 1 year of each case’s specimen). In
total there were 978 serum samples. The experimental design and overall distribution of serum
collection timepoints per case and control groups is shown in Figure 1 and Table 1; Table S1
describes the clinical information of the serum samples in more detail. The average patient age
was 45 (ranging from 21 to 64) with 171 males and 4 females. This distribution reflects both the
difference in incidence of OPSCC in males and females, and the distribution of males and
females in the military population. Diagnoses were limited to ICD-O codes corresponding to
oropharyngeal primary tumors, irrespective of HPV status. Of the 175 patients, HPV status was
confirmed in 35 cases; 18 were confirmed HPV-positive and 17 were confirmed HPV-negative
(Table S1).
Selection of candidate OPSCC biomarkers for targeted proteomics
Candidate targets were selected by searching PubMed and Web of Science for
publications containing the keywords ‘biomarkers, HNSCC, oral cancer, oropharyngeal cancer,
head and neck cancer’ and curating the results using a genescraper R package [14]. A total of 277
candidate biomarkers associated with either HNSCC specifically (131) or identified as general
cancer-associated proteins (146) were identified in the curated literature. Studies that focused on
serum, saliva, and tissue biomarkers in the diagnosis of HNSCC compared with controls were
considered. Thirty-six studies were identified, and 209 proteins were initially selected as OPSCC
candidate biomarkers (Table S2). Members of six signaling pathways (i.e., cell cycle
deregulation by HPV, TGF-β pathway, MAPK pathway, PI3K/AKT/mTOR pathway, NF-κB
pathway, and Wnt pathway) reported to be closely related to OPSCC were also included. These
candidate markers were further evaluated by three clinician co-authors with expertise in head and
neck cancer. The clinical experts also added protein targets directly related to HPV pathogenesis
in OPSCC. We next evaluated the detectability of these protein markers in human serum or
plasma based on a recent in-depth plasma proteome dataset [15] and unpublished in-house plasma
proteome datasets, and found 166 proteins that were detected in either or both plasma proteome
datasets. Candidate biomarkers were further down-selected in a stepwise workflow that included
both expert assessment of clinical relevance and actual MS detectability of the proteins in a
mixture of the DODSR serum samples. The final panel consisted of 146 candidate protein
biomarkers, and targeted proteomic assays were developed for analysis in the large cohort of
DODSR samples using liquid chromatography coupled to selective reaction monitoring (LC-SRM).
A detailed description of SRM assay development and characterization can be found in
Supplementary Materials. The final assay conditions for scheduled LC-SRM analysis of the
146 candidate protein biomarkers are provided in Table S3.
Preparation of serum samples
The serum samples were subjected to immunoaffinity depletion for removal of 14 high
abundance proteins, followed by protein digestion, sample cleanup, and the addition of heavy
isotope labeled internal peptide standards for targeted proteomics analysis (Figure 2 and
Supplementary Materials). Two types of unblinded quality control (QC) were implemented to
validate the stability and reproducibility of the entire workflow: 1) external QC with the use of
commercially available serum (purchased from Biochemed Services, Winchester, VA), and 2)
internal QC with spiking of two exogenous bovine and yeast proteins into each immunoaffinity
depleted serum sample, including the external QC serum samples. Detailed descriptions of the
external and internal QC standards and procedures and the detailed process for sample
randomization and order of analysis are provided in Supplementary Materials.
LC-SRM measurements and data analysis
LC-SRM analysis was performed on a Waters nanoACQUITY UPLC system interfaced
to a Thermo Scientific TSQ Altis triple quadrupole mass spectrometer for simultaneous
quantification of a total of 308 peptides from the 146 candidate proteins and 2 exogenous
internal standard proteins across 978 serum samples in the retention time scheduled SRM mode.
Detailed descriptions of the LC-SRM analysis can be found in Supplementary Materials.
SRM data were analyzed using Skyline software. Peak detection and integration were
determined [16] based on (1) same retention time and (2) approximately the same relative SRM peak
intensity ratios across multiple transitions between light (L) peptides and heavy (H) peptide
standards. The L/H peak area ratios were automatically calculated by Skyline software [17] and
exported to a csv file. All data were manually inspected to ensure correct peak detection and
accurate integration. Signal-to-noise ratio (S/N) was calculated as the peak apex intensity divided by
the mean background noise in a retention time region of ±15 s for the target peptides. The
background noise levels were conservatively estimated by visually inspecting chromatographic
peak regions. If the S/N ratios of light peptides were <3 or there was significant interference, the
target was reported as “Not Analyzed” using the abbreviation “NA”. If the endogenous light
peptide could not be detected, resulting in an L/H ratio of “zero” due to the lack of endogenous
light peptide signal, the target was designated as “Not Detectable” (ND). Data labeled as NA and
ND were not used for the statistical analysis. There were no significant outliers in this analysis,
as determined by Principal Component Analysis; specifically, there were no outliers attributable
to a high number of missing values (Figure S1). All the annotated Skyline output files have been
uploaded to the Panorama website and are available via https://panoramaweb.org/1MV2jg.url.
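The S/N and labeling rules described above can be illustrated with a minimal Python sketch. It is not the authors' Skyline-based procedure, which relied on manual inspection. The function names, the argument layout, and the `interference` flag are hypothetical, and the background intensities are assumed to have already been extracted from the ±15 s retention time window.

```python
from statistics import mean

SN_THRESHOLD = 3.0  # light peptides with S/N below this are labeled "NA" in the text


def signal_to_noise(apex_intensity, background_intensities):
    """Peak apex intensity divided by the mean background noise."""
    noise = mean(background_intensities)
    if noise <= 0:
        raise ValueError("background noise must be positive")
    return apex_intensity / noise


def label_measurement(light_area, heavy_area, apex_intensity,
                      background_intensities, interference=False):
    """Return the L/H ratio, or 'ND' / 'NA' following the rules in the text."""
    if light_area == 0:
        return "ND"  # no endogenous light signal, so the L/H ratio would be zero
    sn = signal_to_noise(apex_intensity, background_intensities)
    if interference or sn < SN_THRESHOLD:
        return "NA"  # too noisy or interfered; excluded from statistics
    return light_area / heavy_area
```

Values returned as "NA" or "ND" would then be dropped before statistical analysis, as stated above.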
Data processing and statistical analysis
The L/H ratios were log2-transformed and zero-centered, followed by batch correction
using ComBat [18] and peptide-to-protein rollup using RRollup [19]. Quality of protein quantification
was estimated based on variances across clinical DODSR samples and external QC serum
replicate samples (detailed information on QC serum samples can be found in Supplementary
Material). Total variance represents both technical noise, due to variability in measurement, and
biological noise, due to biological differences between samples (i.e., inter-individual variability).
The variance measured in the technical replicates of the QC samples provides a good estimate of
technical noise. Thus, the ratio of the experimental sample variance to the QC sample variance
provides a reasonable estimate of the S/N. By eliminating from further analysis any proteins with
an S/N <2, we ensured that most of the contribution to protein abundance differences from sample
to sample was due to biological differences rather than technical noise. This resulted in 86
proteins passing the threshold.
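As an illustration of this variance-ratio filter, the following Python sketch computes, for each protein, the ratio of the variance across study samples to the variance across QC replicates and keeps proteins at or above the cutoff. The dictionary layout and function names are assumptions made for the example, and the input values are assumed to be log2 relative abundances.

```python
from statistics import variance

SN_CUTOFF = 2.0  # proteins with a variance ratio below this are dropped


def variance_ratio(sample_values, qc_values):
    """Total variance in study samples over technical variance in QC replicates."""
    return variance(sample_values) / variance(qc_values)


def filter_proteins(sample_data, qc_data, cutoff=SN_CUTOFF):
    """Keep proteins whose sample-to-QC variance ratio is at least `cutoff`."""
    return [protein for protein in sample_data
            if variance_ratio(sample_data[protein], qc_data[protein]) >= cutoff]
```

A protein whose variance in the study samples is no larger than in the QC replicates carries little biological signal under this criterion and would be removed.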
For statistical significance testing we used protein relative abundances at individual
timepoints (PreA, PreB, Dx and PostD). Additionally, we used the longitudinal changes between
the timepoints (PreA→PreB, PreB→Dx, and PreA→Dx). The PostD timepoint was not used for
longitudinal analysis as it reflects treatment effect, rather than early detection. The t-tests were
based on the limma R package [20], which uses an empirical Bayes approach for better estimation of
protein variances. The tests were two-tailed and unpaired. To control for multiplicity of
hypothesis testing, the p-values (< 0.05 significance level) were adjusted using the
Benjamini-Hochberg approach [21], which is likely to be conservative as it does not account for the
correlation structure of the data [22].
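For readers unfamiliar with the Benjamini-Hochberg adjustment, a minimal Python sketch of the step-up procedure follows. The study itself used R packages; this standalone function is only an illustration and its name is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A protein would then be called significant when its adjusted p-value falls at or below the chosen level (0.05 in this study).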
A Random Forest approach was used to develop classifiers between case and control
groups. Features used for classification were based on protein relative abundances at each
individual timepoint (PreA, PreB, Dx and PostD) and the longitudinal changes observed
comparing the different timepoints: PreA→PreB, PreA→Dx, and PreB→Dx. Age at diagnosis
and sex of a subject were also considered as covariates. Prediction accuracy was estimated using
leave-one-out cross validation (LOOCV). During each round of LOOCV, we selected proteins
out of the entire set of 86 using the “boruta” algorithm (Boruta R package [23]) on N-1 samples.
Then we trained a Random Forest classifier model (randomForest R package [24]) using relative
abundances or log2 longitudinal change of selected proteins as features. For some analyses we
augmented protein data with age at diagnosis and sex as additional predictive features. The
trained Random Forest model then was used to compute the probability that the hold-out sample
belongs to the case class. Accuracy of the classifier was assessed using area under the curve
(AUC) (ROCR R package [25]) composed from LOOCV predictions. The proteins selected into the
models in >50% of LOOCV rounds were carried into the final suggested signature.
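The structure of this procedure (feature selection repeated inside every LOOCV round, one pooled AUC from the hold-out predictions, and a >50% selection-frequency rule) can be sketched in plain Python. To keep the example self-contained, Boruta and Random Forest are replaced by deliberately simple stand-ins: a top-k selector based on class-mean differences and a midpoint-distance score. All function names and the parameter `k` are hypothetical, and rows of `X` are assumed to hold per-subject features such as log2 PreB→Dx changes.

```python
from collections import Counter
from statistics import mean


def select_features(X, y, k):
    """Stand-in selector: top-k features by absolute class-mean difference."""
    ranked = []
    for j in range(len(X[0])):
        case_mean = mean(row[j] for row, lab in zip(X, y) if lab == 1)
        ctrl_mean = mean(row[j] for row, lab in zip(X, y) if lab == 0)
        ranked.append((abs(case_mean - ctrl_mean), j, case_mean, ctrl_mean))
    ranked.sort(reverse=True)
    return [(j, cm, ct) for _, j, cm, ct in ranked[:k]]


def score(sample, selected):
    """Stand-in classifier: signed distance from the class midpoints."""
    total = 0.0
    for j, case_mean, ctrl_mean in selected:
        direction = 1.0 if case_mean >= ctrl_mean else -1.0
        total += direction * (sample[j] - (case_mean + ctrl_mean) / 2)
    return total


def auc(scores, labels):
    """Fraction of case/control pairs ranked correctly (ties count 0.5)."""
    cases = [s for s, lab in zip(scores, labels) if lab == 1]
    ctrls = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(1.0 if c > d else 0.5 if c == d else 0.0
               for c in cases for d in ctrls)
    return wins / (len(cases) * len(ctrls))


def loocv(X, y, k=2):
    """Leave-one-out CV with feature selection redone in every round."""
    scores, counts = [], Counter()
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]  # the hold-out sample never informs selection
        y_train = y[:i] + y[i + 1:]
        selected = select_features(X_train, y_train, k)
        counts.update(j for j, _, _ in selected)
        scores.append(score(X[i], selected))
    # Features chosen in more than half of the rounds form the stable signature.
    stable = sorted(j for j, n in counts.items() if n / len(X) > 0.5)
    return auc(scores, y), stable
```

Repeating the selection step inside each round, rather than once on all samples, keeps the hold-out sample from influencing which features are used to classify it.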
Results
Multiplex quantification of candidate markers in the DODSR serum samples
Robust targeted proteomics assays were established for multiplex quantification of 146
candidate protein biomarkers across 489 serum samples from 175 OPSCC cases and the 489
serum samples from 175 matched controls. Table S4 provides the quantitation results for all 146
candidates and the 2 exogenous QC proteins across the entire set of 978 DODSR serum samples
and the external QC serum samples. The dispersion of SRM signals for surrogate peptides
derived from the 146 protein markers indicated relatively stable measurements (CV: 25.41%)
across all the external QC samples while the same measurement in the DODSR samples (both
cases and controls) showed much higher variation (CV: 42.81%) (Figure S2A, top panels),
reflecting good reproducibility of the entire workflow. This was further supported by highly
stable SRM signals for surrogate peptides derived from two internal QC proteins across all the
OPSCC (CV: 14.27%) and external QC samples (CV: 16.09%) (Figure S2A, bottom panels);
note that the internal QC proteins were added after the immunoaffinity depletion step (Figure
2), hence having even lower variation compared to that in the target protein measurements. As
expected, there is no statistical difference for the two internal QC proteins between case and control
groups (Figure S2B).
Identification of proteomic classifiers for OPSCC
Relative protein abundances obtained from the L/H ratios of surrogate peptides across
different serum samples were first analyzed by t-test for differentially abundant proteins
(Table S5). Between cases and controls, the numbers of statistically significant (adjusted p-values
≤0.05) proteins for static timepoints were 24 and 19 for the Dx and PostD timepoints,
respectively; none of the proteins in the PreA and PreB timepoints were statistically significant.
For the longitudinal abundance changes, the numbers of statistically significant proteins were 13
and 8 for PreB→Dx and PreA→Dx comparisons, respectively (Table S5). All of the proteins
consistently selected for Dx or PreB→Dx classification models were significant in t-test. We
were unable to detect any statistically significant differences between confirmed HPV-positive
and confirmed HPV-negative cases, in part due to the limited number of patients with confirmed
HPV status. Similarly, we were unable to detect any statistically significant differences
correlating with smoking status.
Attempts to use the slope of protein abundance changes computed using actual sample
draw dates to compare between case and control groups were complicated by the variability in
the timing between the actual clinical diagnosis and the timing of the Dx blood draw. Thus, we
decided to model PreB versus Dx as two discrete states.
A Random Forest machine learning approach was used to classify between cases and
controls based on relative abundances at each timepoint and longitudinal changes over
time. The AUCs for individual timepoints were 0.44 for PreA, 0.46 for PreB, 0.76 for Dx, and
0.73 for PostD (Figure 3A). For the longitudinal changes, the AUC values were 0.48 for
PreA→PreB, 0.80 for PreA→Dx, and 0.90 for PreB→Dx (Figure 3B). Since we were selecting
features independently for each round of LOOCV, we needed to evaluate how consistently
a given protein was selected into the Random Forest model (Figure 4). For the best performing
classification based on PreB→Dx change (AUC 0.90), 8 proteins (SPARC, SERPINE1,
SERPIND1, SELE, LRG1, HPX, GSN, CP) were selected into the model in 100% of rounds.
Two more proteins (CTSH and CKM) were selected in >97% of rounds. Three proteins (AHSG,
SAA4, CD44) had minor contributions and were selected in <20% of rounds (Figure 4). The
ten proteins that were consistently selected into the classification models and passed the
50% threshold were carried over for further signature validation. Of the models based on just one
timepoint, the Dx timepoint (closest to diagnosis) was the most predictive (AUC 0.76). Proteins
consistently (>50% LOOCV iterations) selected into the classification models based on Dx
abundance (SPARC, PKM, LRG1, GSN, CKM, HPX, MDH1, KNG1, SERPINE1) had
substantial overlap with consistent proteins from PreB→Dx models (Figure 4). Six proteins
(SPARC, SERPINE1, LRG1, GSN, CKM, HPX) are common between PreB→Dx and Dx
models; four proteins (CP, CTSH, SELE, SERPIND1) are unique to PreB→Dx models; three
proteins (KNG1, MDH1, PKM) are unique to Dx models. In total we found 13 proteins that are
consistently contributing to classification models based on longitudinal or cross-sectional data.
All of the proteins consistently selected for Dx and PreB→Dx classification models were
significant in t-test (see Table S5 and Figure 4). Out of the common six proteins, four (SPARC,
SERPINE1, GSN, CKM) are lower in cases compared to controls at the Dx timepoint, and two (LRG1
and HPX) are higher. The directionality of abundance change between PreB and Dx timepoints
is also the same. The correlation structure of protein abundances at the Dx timepoint and PreB→Dx
changes is shown in Figure S3. The overlap between proteins significant in t-test and
selected for the Random Forest classifier is substantial, but not 100%.
Table S6 presents the analytical performance of the SRM assays including the lower
limit of quantitation and measurement reproducibility for the 10 proteins consistently detected in
over 97% of OPSCC samples from the longitudinal change PreB→Dx. The overall achievable
reproducibility of the measurements over time for these 10 proteins was mostly <20% in CV.
The details of these experiments are provided in Supplementary Materials.
We evaluated the stability of abundance of the 13 proteins which are consistently
contributing to classification models based on longitudinal or cross-sectional data in the 36
control samples that have all four timepoints available. To test for changes over time we applied
regression analysis of protein abundance vs. actual date of blood draw with the subject being
modeled as a fixed effect. None of the proteins has a statistically significant association of
abundance with time in the control group (Table S7). Additionally, we computed coefficient of
variation as a measure of temporal stability. In the control group the coefficients of variation
over time for the 13 proteins were in the range 6-36% (Figure S4).
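As an illustration of the temporal stability measure, the sketch below computes a per-subject coefficient of variation for one protein across that subject's timepoints. The text does not state the scale on which the CV was computed; this example assumes linear-scale (not log2) relative abundances and uses the sample standard deviation. The function names and data layout are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


def per_subject_cv(timecourses):
    """Map each subject to the CV of one protein across that subject's timepoints."""
    return {subject: coefficient_of_variation(values)
            for subject, values in timecourses.items()}
```

Summarizing these per-subject values for each protein would give a range comparable in form to the one reported above.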
Considerations for patient stratification
Besides protein abundances, we considered age at diagnosis, sex and HPV status as
covariates in the analysis of stratification factors. Only a minority of case subjects had a
confirmed HPV diagnosis. Specifically, 17 cases were HPV-negative, 18 HPV-positive; 140 had
no diagnosis. Thus, most likely due to low power, no statistical difference was detected in
protein abundances between HPV-positive and HPV-negative cases. Given the male:female sex
ratio of the military population, the distribution of sex as a confounding factor was very skewed.
Out of 175 case/control pairs, only 4 pairs are female. Thus, including sex as a confounding
factor into the linear model or Random Forest classifier training had no effect on the results. Age
at diagnosis had no statistically significant effect on the results of differential abundance testing
and was not selected as a predictive feature by the Boruta algorithm in the Random Forest machine
learning approach.
Discussion
Many HNSCC protein biomarkers have been reported in tissues, saliva, and
serum/plasma [6-9], but none of them has been translated into clinical practice. A contributing factor
in this failure to validate and convert candidates into robust, FDA-approved biomarkers is the
relatively small set of clinical samples analyzed. MS-based targeted proteomics is highly
promising to bridge the gap between discovery and validation phases because it allows
quantification of hundreds of proteins simultaneously with high specificity and reproducibility,
without the need to generate affinity reagents. In this study, a stepwise approach was used to
transition from the curated literature to the finalized list based on detectability using LC-SRM
assays in real samples; the final 146 protein markers were measured by LC-SRM across 978
serum samples from OPSCC cases and matched controls. The robustness of the entire LC-SRM
workflow was monitored using two types of QCs to confirm precise quantification of target
proteins in the large cohort of DODSR serum samples. All these results demonstrated that the
entire LC-SRM workflow is robust for reliable quantification of the 146 protein biomarkers
across the 978 serum samples.
An important distinction of this study, compared to most biomarker discovery studies, is
the emphasis on longitudinal comparisons across timepoints, as compared to a focus on
differential protein expression between cases and controls at one single timepoint. Our
longitudinal study using the DODSR samples allowed us to identify biomarkers that differed
within the same individual across multiple timepoints. This approach has significant advantages
over conventional biomarker studies in terms of precise identification of biomarkers for early
detection of OPSCC and effectively addressing issues with human heterogeneity. Statistical
analysis of SRM data from the 978 DODSR serum samples supports this conclusion: classifiers
comparing cases and controls at a single timepoint yielded AUC values of 0.76 and 0.73 for the
Dx and PostD timepoints, respectively, compared to an AUC of 0.90 for the classifier based on
longitudinal SRM data for the PreB→Dx change within the same individual. For the PreA→Dx
comparison, the AUC value is 0.80, suggesting that changes may be discernible as early as 4
years prior to diagnosis, but may not be sufficiently robust for clinical utility. An important
observation in this study is that abundance of these proteins in the classifier was extremely stable
over time in the control samples, indicating that a strategy of comparing biomarker abundance
measurements to the patient’s own serum protein profile over time may have significant clinical
utility. The observation of statistically significant changes across the longitudinal OPSCC serum
samples from the same individual appears to be a reliable indicator of true abundance changes,
which further validated our biomarker discovery.
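The advantage of within-individual comparisons under between-subject heterogeneity can be illustrated with a small simulation. The following is a minimal sketch in Python on synthetic data, not the Random Forest analysis used in this study: the baseline spread, measurement noise, case shift, and the helper names `auc` and `simulate` are all assumptions chosen for illustration. Each simulated subject has a stable personal baseline, cases receive a modest increase at the later draw, and the AUC is computed once from the later draw alone and once from the change between draws.

```python
import random


def auc(pos, neg):
    """AUC as the probability that a random case scores above a random
    control (Mann-Whitney statistic; ties count one half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate(n_per_group=200, shift=0.5, seed=0):
    """Return (cases, controls) as lists of (early, late) abundance pairs.

    All numeric settings are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)

    def subject(is_case):
        baseline = rng.gauss(0.0, 1.0)          # person-specific level
        early = baseline + rng.gauss(0.0, 0.2)  # earlier draw, noise only
        late = baseline + rng.gauss(0.0, 0.2)   # later draw, noise only
        if is_case:
            late += shift                       # disease-related change
        return early, late

    cases = [subject(True) for _ in range(n_per_group)]
    controls = [subject(False) for _ in range(n_per_group)]
    return cases, controls


cases, controls = simulate()

# Single-timepoint score: abundance at the later draw only.
auc_single = auc([late for _, late in cases],
                 [late for _, late in controls])

# Longitudinal score: change within the same individual.
auc_delta = auc([late - early for early, late in cases],
                [late - early for early, late in controls])

print(f"single-timepoint AUC: {auc_single:.2f}")
print(f"within-individual change AUC: {auc_delta:.2f}")
```

Under these assumed parameters the change-based score separates the groups better than the single-draw score, because the subject-specific baseline cancels in the difference. Real serum data may behave differently.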
Among the best classifiers for PreADx and PreBDx, there are seven overlapping
proteins: hemopexin (HPX), ferritin light chain (FTL), leucine-rich alpha-2-glycoprotein
(LRG1), plasminogen activator inhibitor 1 (SERPINE1), creatine kinase M-type (CKM), gelsolin
(GSN), and ceruloplasmin (CP). This suggests that they are more robust than the other protein
markers and highly promising for early detection of OPSCC. For example, increased expression
levels of FTL and ferritin heavy chain (FTH1) have been observed in tissue samples from HNSCC
tumors and from HNSCC tumors with lymph node metastasis using a chemiluminescent
immunoassay (26). Leucine-rich alpha-2-glycoprotein was downregulated in HNSCC
tissues regardless of the grade of the tumor (27), and plasminogen activator inhibitor 1 was detected
in the secretome of head and neck/oral squamous cell carcinoma cell lines (28). The level of
creatine kinases such as creatine kinase M-type is a useful criterion for several diseases related to
HNSCC (29, 30). Gelsolin was one of five proteins on a diagnostic panel derived from comparing 61
oral squamous cell carcinoma (OSCC) and 58 control saliva samples (31). Hemopexin was
demonstrated to be differentially expressed between lymph-node metastatic and non-metastatic
OSCC serum samples (31). Ceruloplasmin was up-regulated in plasma samples of hypopharyngeal
squamous cell carcinoma (32).
Another important aspect of this study is the power of the combination of multiplexed
targeted proteomics and high-quality longitudinal clinical specimens for unbiased precise
prioritization of protein biomarkers starting with a large biomarker list curated from the
literature. The multiplexed biomarker discovery workflow can be easily implemented with
simultaneous quantification of ~500 candidate proteins for any type of cancer by full utilization
of the existing biomarker resource without the need to conduct additional discovery studies.
Furthermore, targeted proteomics is more effective than global proteomics for deep biomarker
discovery in serum due to the low concentration of potential biomarkers and the wide dynamic
concentration range in human serum or plasma. However, targeted proteomic studies may miss
other important OPSCC biomarkers because they rely on an existing list of protein
biomarkers from the literature rather than an unbiased, in-depth global proteomics analysis with
extensive sample fractionation.
One important consideration in the generalization of these results is a comparison of the
characteristics of our study population to the general population. To the best of our knowledge,
there have been no published studies directly comparing the incidence rate or prevalence of
OPSCC in the general US population and the US military populations. However, it is relevant to
point out that the US military population has a higher incidence of high-risk HPV serotypes,
specifically HPV-16, than the general population, and also a higher prevalence of other risk
factors including smoking and alcohol use (4, 33). The study reported here is a retrospective study of
all available OPSCC cases in the DOD serum repository, and thus reflects the incidence of
OPSCC in the military population, both male and female. However, females appear to be
significantly under-represented in this study, compared to the expected incidence in females in
the general population (approximately one-quarter that in males) (34). This appears to reflect the
under-representation of females in the military population (15%) (35), rather than any differences in
incidence within female members of the military. Thus, it is likely that this study will be most
applicable to screening of high-risk males in the general population. It is worth noting that we
saw no statistically significant changes when we eliminated females from the analysis.
In conclusion, access to high-quality serum sample cohorts from the DODSR enabled
targeted proteomics discovery of promising biomarkers for early detection of OPSCC based on
longitudinal data from the same patient. Random Forest analysis of abundance data for 146
candidate proteins simultaneously measured by robust SRM assays across 978 DODSR serum
samples indicated that comparisons across time in the same patient are superior to single
timepoint comparisons across patients. While direct comparison of protein abundances at the Dx
timepoint for cases and controls showed statistically significant differences in candidate
biomarker abundance with AUC >0.70, analysis of longitudinal samples provided substantially
improved AUC values of 0.90 for the PreBDx classifier and 0.80 for the PreADx classifier.
This improvement reflects the observation that OPSCC patients displayed statistically significant
changes in protein abundance as early as two years prior to diagnosis, effectively captured in the
13-protein classifier, whereas the abundance of these proteins was constant over time in the
matched controls.
While the use of the longitudinal DODSR serum samples enabled the identification of
these highly promising markers for early detection of OPSCC, there may be some limitations on
the generalizability of this study. First, the military population is both younger and more
predominantly male than the general population at risk of OPSCC. Conversely, the military
population has an increased incidence of known risk factors for OPSCC, including increased
tobacco and alcohol use and an increased incidence of HPV infection (33). p16 has
become a universally used surrogate marker for HPV-driven OPSCC; however, because the
serum samples date back to 2003, the majority of the cohort was not tested for HPV.
The 34 reported cases were tested with in-situ hybridization for low- and high-risk strains
of HPV; nevertheless, based on demographics alone this cohort would be considered to have a
high prevalence of HPV-driven OPSCC, perhaps with intermediate risk factors due to
concomitant tobacco use (36). Absence of HPV-specific markers such as E6 and E7 has been
documented elsewhere and postulated to be a possible marker of more advanced disease (12).
Another possibility is that presence of antibodies to E6 and E7 viral proteins may not correlate
with the actual proteins, if they are expressed transiently and subsequently degraded. Specific
data on smoking and alcohol use within this cohort is also incomplete, so that there is insufficient
power to analyze these confounding variables separately. Variations in the time interval between
clinical diagnosis and obtaining the Dx serum sample may have introduced variability related to
treatment effects, but this variation appears to be captured in the 95% confidence interval.
Validation in a large independent clinical cohort is needed to determine the general utility of these markers in
clinical practice, and thus justify the use of longitudinal serum analyses as a screening strategy
for the general population.
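As an illustration of how a 95% confidence interval around an AUC can be obtained, the sketch below applies a percentile bootstrap to hypothetical classifier scores. The resampling scheme, the synthetic scores, and the function names are assumptions made for illustration; this section does not state that the intervals reported in Figure 3 were derived this way.

```python
import random


def auc(pos, neg):
    """AUC as the probability that a random case scores above a random
    control (ties count one half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(pos, neg, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC.

    Cases and controls are resampled with replacement, separately,
    so that each bootstrap replicate keeps the original group sizes.
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        p = [rng.choice(pos) for _ in pos]
        n = [rng.choice(neg) for _ in neg]
        replicates.append(auc(p, n))
    replicates.sort()
    lo = replicates[round((alpha / 2) * (n_boot - 1))]
    hi = replicates[round((1 - alpha / 2) * (n_boot - 1))]
    return auc(pos, neg), lo, hi


# Hypothetical classifier scores (synthetic, not study data).
rng = random.Random(1)
case_scores = [rng.gauss(1.0, 1.0) for _ in range(60)]
control_scores = [rng.gauss(0.0, 1.0) for _ in range(60)]

point, lo, hi = bootstrap_auc_ci(case_scores, control_scores)
print(f"AUC = {point:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

A wider interval indicates that the AUC estimate is more sensitive to which samples happen to be included, which is one way sampling-related variability would show up in the reported performance.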
Disclosure of Potential Conflicts of Interest
There are no potential conflicts of interest to disclose.
Authors' Contributions
Conception and design: J. Goodman, C.D. Shriver, T. Liu, K.D. Rodland.
Development of methodology: J.Y. Lee, V.A. Petyuk.
Acquisition of data: J.Y. Lee, T. Shi, A.A. Schepmoes, T.L. Fillmore.
Analysis and interpretation of data: J.Y. Lee, T. Shi, V.A. Petyuk, Y.-T. Wang, W. Cardoni, G.
Coppit, J. Goodman, C.D. Shriver, T. Liu, K.D. Rodland.
Writing, review, and/or revision of the manuscript: J.Y. Lee, T. Shi, V.A. Petyuk, J. Goodman,
C.D. Shriver, T. Liu, K.D. Rodland.
Administrative, technical, or material support: T.L. Fillmore, S. Srivastava.
Study supervision: C.D. Shriver, T. Liu, K.D. Rodland.
Acknowledgments
The authors would like to thank the clinical and laboratory staff at the Uniformed Services
University of the Health Sciences and Pacific Northwest National Laboratory (PNNL). Portions
of the research were performed in the Environmental Molecular Sciences Laboratory
(grid.436923.9), a U.S. Department of Energy (DOE) Office of Biological and Environmental
Research national scientific user facility on the PNNL campus. PNNL is a multiprogram national
laboratory operated by Battelle for the DOE under contract DE-AC05-76RL01830. The contents
of this publication are the sole responsibility of the author(s) and do not necessarily reflect the
views, opinions or policies of Uniformed Services University of the Health Sciences, The Henry
M. Jackson Foundation for the Advancement of Military Medicine, Inc., the Department of
Defense or the Departments of the Army, Navy, or Air Force. Mention of trade names,
commercial products, or organizations does not imply endorsement by the U.S. Government.
Grant Support
This work was supported by Federal Award No. HU0001-16-2-0014 (Subaward No. 3879 to
K.D. Rodland and T. Liu).
References
1. Marcu LG, Yeoh E. A review of risk factors and genetic alterations in head and neck
carcinogenesis and implications for current and future approaches to treatment. J Cancer Res
Clin Oncol. 2009;135: 1303-1314.
2. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics
2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185
countries. CA Cancer J Clin. 2018;68: 394-424.
3. Leemans CR, Braakhuis BJ, Brakenhoff RH. The molecular biology of head and neck cancer.
Nat Rev Cancer. 2011;11: 9-22.
4. Feinstein AJ, Shay SG, Chang E, Lewis MS, Wang MB. Treatment outcomes in veterans with
HPV-positive head and neck cancer. Am J Otolaryngol. 2017;38: 188-192.
5. Schaaij-Visser TB, Brakenhoff RH, Leemans CR, Heck AJ, Slijper M. Protein biomarker
discovery for head and neck cancer. J Proteomics. 2010;73: 1790-1803.
6. Li SX, Yang YQ, Jin LJ, Cai ZG, Sun Z. Detection of survivin, carcinoembryonic antigen and
ErbB2 level in oral squamous cell carcinoma patients. Cancer Biomark. 2016;17: 377-382.
7. Allegra E, Trapasso S, La Boria A, et al. Prognostic role of salivary CD44sol levels in the
follow-up of laryngeal carcinomas. J Oral Pathol Med. 2014;43: 276-281.
8. Pereira LH, Reis IM, Reategui EP, et al. Risk Stratification System for Oral Cancer Screening.
Cancer Prev Res (Phila). 2016;9: 445-455.
9. Hsiao YC, Chi LM, Chien KY, et al. Development of a Multiplexed Assay for Oral Cancer
Candidate Biomarkers Using Peptide Immunoaffinity Enrichment and Targeted Mass
Spectrometry. Mol Cell Proteomics. 2017;16: 1829-1849.
10. Perdue CL, Cost AA, Rubertone MV, Lindler LE, Ludwig SL. Description and utilization of
the United States department of defense serum repository: a review of published studies,
1985-2012. PLoS One. 2015;10: e0114857.
11. Hayes DN, Van Waes C, Seiwert TY. Genetic Landscape of Human Papillomavirus-
Associated Head and Neck Cancer and Comparison to Tobacco-Related Tumors. J Clin
Oncol. 2015;33: 3227-3234.
12. Tuhkuri A, Saraswat M, Makitie A, et al. Patients with early-stage oropharyngeal cancer can
be identified with label-free serum proteomics. Br J Cancer. 2018;119: 200-212.
13. Perdue CL, Eick-Cost AA, Rubertone MV. A Brief Description of the Operation of the DoD
Serum Repository. Mil Med. 2015;180: 10-12.
14. https://github.com/evanamartin/genescraper.
15. Keshishian H, Burgess MW, Specht H, et al. Quantitative, multiplexed workflow for deep
analysis of human blood plasma and biomarker discovery by mass spectrometry. Nat Protoc.
2017;12: 1683-1701.
16. Song E, Gao Y, Wu C, et al. Targeted proteomic assays for quantitation of proteins identified
by proteogenomic analysis of ovarian cancer. Sci Data. 2017;4: 170091.
17. MacLean B, Tomazela DM, Shulman N, et al. Skyline: an open source document editor for
creating and analyzing targeted proteomics experiments. Bioinformatics. 2010;26: 966-968.
18. Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using
empirical Bayes methods. Biostatistics. 2007;8: 118-127.
19. Polpitiya AD, Qian WJ, Jaitly N, et al. DAnTE: a statistical tool for quantitative analysis of
-omics data. Bioinformatics. 2008;24: 1556-1558.
20. Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for
RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43: e47.
21. Benjamini Y, Hochberg Y. Controlling the false discovery rate - a practical and powerful
approach to multiple testing. J. Royal Statist. Soc., Series B. 1995;57: 289-300.
22. Stevens JR, Al Masud A, Suyundikov A. A comparison of multiple testing adjustment
methods with block-correlation positively-dependent tests. PLoS One. 2017;12: e0176124.
23. Kursa MB, Rudnicki WR. Feature Selection with the Boruta Package. J Stat Softw. 2010;36: 1-13.
24. https://cran.r-project.org/web/packages/randomForest/index.html.
25. https://cran.r-project.org/web/packages/ROCR/index.html.
26. Hu Z, Wang L, Han Y, et al. Ferritin: A potential serum marker for lymph node metastasis in
head and neck squamous cell carcinoma. Oncol Lett. 2019;17: 314-322.
27. Wang Y, Chen C, Hua Q, et al. Downregulation of leucine-rich-alpha-2-glycoprotein 1
expression is associated with the tumorigenesis of head and neck squamous cell carcinoma.
Oncol Rep. 2017;37: 1503-1510.
28. Ralhan R, Masui O, Desouza LV, Matta A, Macha M, Siu KW. Identification of proteins
secreted by head and neck cancer cell lines using LC-MS/MS: Strategy for discovery of
candidate serological biomarkers. Proteomics. 2011;11: 2363-2376.
29. Yamauchi K, Kogashiwa Y, Nagafuji H, Kohno N. Head and neck cancer with
dermatomyositis: a report of two clinical cases. Int J Otolaryngol. 2010;2010: 401825.
30. Cinamon U. Exceptionally elevated creatine kinase levels in a laryngectomized patient:
hypothyroid myopathy. J Laryngol Otol. 2004;118: 651-652.
31. Chen YT, Chen HW, Wu CF, et al. Development of a Multiplexed Liquid Chromatography
Multiple-Reaction-Monitoring Mass Spectrometry (LC-MRM/MS) Method for Evaluation of
Salivary Proteins as Oral Cancer Biomarkers. Mol Cell Proteomics. 2017;16: 799-811.
32. Tian WD, Li JZ, Hu SW, et al. Proteomic identification of alpha-2-HS-glycoprotein as a
plasma biomarker of hypopharyngeal squamous cell carcinoma. Int J Clin Exp Pathol.
2015;8: 9021-9031.
33. Agan BK, Macalino GE, Nsouli-Maktabi H, et al. Human papillomavirus seroprevalence
among men entering military service and seroincidence after ten years of service. MSMR.
2013;20: 21-24.
34. https://www.cdc.gov/cancer/hpv/statistics/cases.htm.
35. https://www.cfr.org/article/demographics-us-military.
36. Gillison ML, D'Souza G, Westra W, et al. Distinct risk factor profiles for human
papillomavirus type 16-positive and human papillomavirus type 16-negative head and neck
cancers. J Natl Cancer Inst. 2008;100: 407-420.
Table 1. Summary of the overall distribution of serum collection timepoints per case and control
groups.

Samples                  PreA          PreB          Dx              PostD
                         -4 years      -2 years      within 1 year   +2 years
                         (±1 year)     (±1 year)     (+1 year)       (±1 year)
Cases
  # of samples           158           157           77              97
  Years   min            -3.0          -1.1          0               1.1
          avg ± S.D.     -4.0 ± 0.5    -2.0 ± 0.4    0.4 ± 0.3       1.9 ± 0.5
          max            -4.9          -3.0          1.0             2.9
Controls (±1 year)
  # of samples           158           157           77              97
  Years   min            -2.9          -0.4          -0.9            0.4
          avg ± S.D.     -4.1 ± 0.5    -2.1 ± 0.6    0.4 ± 0.6       1.9 ± 0.7
          max            -5.7          -3.5          1.8             3.2
Figure Legends:
Figure 1. The experimental design of the OPSCC study using serum samples from the DODSR.
Histograms representing number of subjects (Y axis) over time of sample collection (X axis).
Cases were binned into four timepoints for the comparative analysis. Controls were matched to
cases based on age at the time of diagnosis and the availability of serum samples matched on
time of blood draw. Each bin represents a 2-month period.
Figure 2. Experimental workflow for identification of serum protein biomarkers for early
detection of OPSCC using multiplexed targeted proteomics. Both internal and external QCs were
used throughout the LC-SRM analysis to ensure robust, high-quality proteomics analysis.
Figure 3. Receiver operating characteristic (ROC) curves of the Random Forest classifiers for the single
timepoints (A) and for longitudinal changes between different timepoints (B). Both the AUC and
the ROC curve are provided with 95% confidence intervals for each comparison between the
case and control groups.
Figure 4. The frequency with which the differential proteins were selected into the Random
Forest classifiers for the single timepoints (top row) and for longitudinal changes between
different timepoints (bottom row).
Lee JY, Shi T, Petyuk VA, et al. Detection of head and neck cancer based on longitudinal changes in serum protein abundance. Cancer Epidemiol Biomarkers Prev. Published OnlineFirst June 12, 2020. doi: 10.1158/1055-9965.EPI-20-0192.
Supplementary material: http://cebp.aacrjournals.org/content/suppl/2020/06/12/1055-9965.EPI-20-0192.DC1