Date post: 18-Aug-2020
DESNT: a Poor Prognosis Category of Human Prostate Cancer

Bogdan-Alexandru Luca a,b,§, Daniel S Brewer b,c,§,¶,*, Dylan R Edwards 2, Sandra Edward d, Hayley C Whitaker e, Sue Merson d, Nening Dennis d, Rosalin A Cooper f, Steven Hazell g, Anne Y Warren h, The CancerMap Group i, Rosalind Eeles d,g, Andy G Lynch e, Helen Ross-Adams e, Alastair D Lamb e,j, David E Neal e,j, Krishna Sethia k, Robert D Mills k, Richard Y Ball l, Helen Curley b, Jeremy Clark b, Vincent Moulton a,¶,*, Colin S Cooper b,¶,*

a School of Computing Sciences, University of East Anglia, Norwich Research Park, Norwich, Norfolk, UK
b Norwich Medical School, University of East Anglia, Norwich Research Park, Norwich, UK
c The Earlham Institute, Norwich Research Park, Norwich, Norfolk, UK
d Division of Genetics and Epidemiology, The Institute Of Cancer Research, Sutton, UK
e Urological Research Laboratory, Cancer Research UK Cambridge Research Institute, University of Cambridge, Cambridge, UK
f Department of Pathology, University Hospital Southampton NHS Foundation Trust, Southampton, UK
g Royal Marsden NHS Foundation Trust, London and Sutton, UK
h Department of Histopathology, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
i A list of participants and their affiliations appears in the Supplemental Information
j Department of Surgical Oncology, University of Cambridge, Addenbrooke's Hospital, Cambridge, UK
k Department of Urology, Norfolk and Norwich University Hospitals NHS Foundation Trust, Norwich, UK
l Department of Histopathology, Norfolk and Norwich University Hospitals NHS Foundation Trust, Norwich, UK

§ These authors contributed equally to this work. ¶ These authors jointly supervised this work.

* Corresponding Authors: Professor Vincent Moulton ([email protected]), Dr Daniel Brewer ([email protected]) and Professor Colin Cooper ([email protected]), University of East Anglia, Norwich Research Park, Norwich, NR4 7UG, UK

Abstract

Background: A critical problem in the clinical management of prostate cancer is that it is highly heterogeneous. Accurate prediction of individual cancer behaviour is therefore not achievable at the time of diagnosis, leading to substantial overtreatment. It remains an enigma that, in contrast to breast cancer, unsupervised analyses of global expression profiles have not so far defined robust categories of prostate cancer with distinct clinical outcomes.

Objective: To devise a novel classification framework for human prostate cancer based on unsupervised mathematical approaches.

Design, Setting, and Participants: Our analyses are based on the hypothesis that previous attempts to classify prostate cancer have been unsuccessful because individual samples of prostate cancer frequently have heterogeneous compositions. To address this issue we applied an unsupervised Bayesian procedure called Latent Process Decomposition to four independent prostate cancer transcriptome datasets obtained using samples from prostatectomy patients and containing between 78 and 182 participants.

Outcome Measurements and Statistical Analysis: Biochemical failure was assessed using log-rank analysis and Cox regression analysis.

Results and Limitations: Application of LPD identified a common process in all four independent datasets examined. Cancers assigned to this process (designated DESNT cancers) are characterized by low expression of a core set of 45 genes, many encoding proteins involved in the cytoskeleton machinery, ion transport and cell adhesion. For the three datasets with linked PSA failure data following prostatectomy, patients with DESNT cancer exhibited poor outcome relative to other patients (P = 2.65 × 10⁻⁵, P = 4.28 × 10⁻⁵, and P = 2.98 × 10⁻⁸). When these three datasets were combined, the independent predictive value of DESNT membership was P = 1.61 × 10⁻⁷, compared to P = 1.00 × 10⁻⁵ for Gleason sum. A limitation of the study is that only prediction of PSA failure was examined.

Conclusions: Our results demonstrate the existence of a novel poor prognosis category of human prostate cancer and will assist in the targeting of therapy, helping avoid treatment-associated morbidity in men with indolent disease.

Patient Summary: Prostate cancer, unlike breast cancer, does not have a robust classification framework. We propose that this failure has occurred because prostate cancer samples selected for analysis frequently have heterogeneous compositions (individual samples are made up of many different parts that each have different characteristics). Applying a mathematical approach that can overcome this problem, we identify a novel poor prognosis category of human prostate cancer called DESNT.

Keywords: poor prognosis category; novel prostate cancer classification; DESNT prostate cancer; Latent Process Decomposition

Words: 3383


1. Introduction

Risk categories based on PSA, Gleason score and Clinical Stage that predict PSA failure[1] underpin the treatment of localized prostate cancer, as illustrated, for example, by the UK National Institute for Health and Care Excellence guidelines[2]. Attempts to improve risk stratification have been made through the development of prognostic tests, such as Prolaris[3], Oncotype DX[4] and Decipher[5]. Most such expression-based prognostic signatures for prostate cancer have in common that they were derived using supervised steps, involving either comparisons of aggressive and non-aggressive disease[5,6] or the selection of genes representing specific biological functions[3,7,8]. Alternatively, expression biomarkers may be linked to the presence of somatic copy number variations[9]. In contrast, for breast cancer, unsupervised analysis of transcriptome profiles, using approaches such as hierarchical clustering, has identified robust disease categories that have distinct clinical outcomes and that require different treatment strategies[10].

Our hypothesis is that completely unsupervised classification of prostate cancer based on transcriptome data has not been successful previously[9,11] because individual samples of prostate cancer can contain more than one contributing lineage[12,13] and frequently have heterogeneous compositions[14-16]. To test this idea, in the current study, we applied Latent Process Decomposition[17,18] (LPD). Based on the latent Dirichlet allocation method[19], LPD assesses the structure of a dataset in the absence of knowledge of clinical outcome or biological role[17]. In contrast to standard unsupervised clustering models (e.g. k-means and hierarchical clustering), individual cancers are not assigned to a single cluster: instead, gene expression levels in each cancer are modeled via combinations of latent processes. We previously used LPD to confirm the presence of basal and ERBB2 overexpressing categories in breast cancer datasets[17], and to show that, based on blood expression profiles, patients with advanced prostate cancer can be stratified into two clinically distinct groups[20].
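As a minimal illustration of the contrast drawn above between hard clustering and a mixed-membership description, the Python sketch below represents each cancer by a vector of process weights that sum to 1 and, when a single label is needed, picks the process with the largest weight (the rule the Results section applies). The weights and sample names are invented for illustration and are not LPD output; the function name `assign_to_process` is hypothetical, and the sketch does not implement LPD itself.

```python
from typing import Dict, List


def assign_to_process(weights: Dict[str, List[float]], tol: float = 1e-9) -> Dict[str, int]:
    """Map each sample to the index of its largest process weight."""
    labels = {}
    for sample, w in weights.items():
        # The weights of one sample must form a probability vector.
        if any(x < 0 for x in w) or abs(sum(w) - 1.0) > tol:
            raise ValueError(f"weights for {sample} are not a probability vector")
        labels[sample] = max(range(len(w)), key=lambda i: w[i])
    return labels


# Hypothetical weights for three cancers over three latent processes.
example_weights = {
    "cancer_A": [0.70, 0.20, 0.10],  # dominated by process 0
    "cancer_B": [0.05, 0.15, 0.80],  # dominated by process 2
    "cancer_C": [0.40, 0.45, 0.15],  # genuinely mixed; process 1 only narrowly largest
}
labels = assign_to_process(example_weights)
print(labels)
```

A sample such as `cancer_C` shows why the weights carry more information than the label: a hard clustering would report only the label, whereas the weight vector records that two processes contribute almost equally.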

2. Materials and Methods

2.1 The CancerMap dataset

Fresh prostate cancer specimens were obtained and processed from a systematic series of patients who had undergone a prostatectomy at the Royal Marsden NHS Foundation Trust and Addenbrooke's Hospital, Cambridge, as previously described[9,21,22]. Approval was obtained from the relevant local Research Ethics Committee. Expression profiles were determined and data were processed as previously described[22] using 1.0 Human Exon ST arrays (Affymetrix, Santa Clara, CA, USA) according to the manufacturer's instructions. Data are available from the Gene Expression Omnibus: GSE (data to be released on publication). CancerMap patients did not receive neo-adjuvant treatment.

2.2 Additional Transcriptome Datasets


We analysed five prostate cancer microarray datasets, referred to as MSKCC, CancerMap, CamCap, Stephenson and Klein. The data used, platforms and location of clinical data are presented in Fig. 1b. Each dataset was obtained using samples from prostatectomy patients. The CamCap dataset used in our study was produced by combining Illumina HumanHT-12 V4.0 expression beadchip (bead microarray) datasets (GEO: GSE70768 and GSE70769) obtained from two prostatectomy series (Cambridge and Stockholm) and consisted of 147 cancer and 73 normal samples[9]. The CamCap and CancerMap datasets have 40 patients in common and thus are not independent. One RNAseq dataset consisting of 333 prostate cancers from The Cancer Genome Atlas was also analysed, which we refer to as TCGA[13]. The counts per gene supplied by TCGA were used.

2.3 Latent Process Decomposition

Latent process decomposition (LPD)[17,18], an unsupervised Bayesian approach, was used to classify samples into subgroups called processes. We selected the 500 probesets with greatest variance across the MSKCC dataset for use in LPD. These probesets map to 492 genes. For each dataset all probesets that map to these genes were used in LPD analyses (CancerMap: 507 probesets, CamCap: 483, Stephenson: 609).

LPD can objectively assess the most likely number of processes. We assessed the hold-out validation log-likelihood of the data computed at various numbers of processes and used a combination of both the uniform (equivalent to a maximum likelihood approach) and non-uniform (MAP approach) priors to choose the number of processes. For robustness, we restarted LPD 100 times with different seeds for each dataset. Out of the 100 runs we selected a representative run that was used for subsequent analysis. The representative run was the run with the survival log-rank p-value closest to the mode. For the Klein dataset, for which we do not have clinical data, we used the hold-out log-likelihood from LPD instead.

2.4 Statistical Tests

All statistical tests were performed in R version 3.2.2 (https://www.r-project.org/). Correlations between the expression profiles of two datasets for a particular gene set and sample subgroup were calculated as follows:

1. for each gene, we selected one probeset at random;
2. for each probeset, we transformed its distribution across all samples to a standard normal distribution;
3. the average expression for each probeset across the samples in the subgroup was determined, to obtain an expression profile for the subgroup;
4. the Pearson's correlation between the expression profiles of the subgroups in the two datasets was determined.

Differentially expressed probesets were identified using a moderated t-test implemented in the limma R package[23]. Genes were considered significantly differentially expressed if the adjusted p-value was below 0.01 (p-values adjusted using the False Discovery Rate).

Survival analyses were performed using Cox proportional hazards models, the log-rank test, and the Kaplan-Meier estimator, with biochemical recurrence after prostatectomy as the end point. When several samples per patient were available, only the sample with the highest proportion of tumour tissue was used. Multivariate survival analyses were performed with the clinical covariates Gleason grade (≤7 and >7), pathological stage (T1/T2 and T3/T4) and PSA levels (≤10 and >10). We modelled the variables that did not satisfy the proportional hazards assumption (T-stage in MSKCC) as a product of the variable with the Heaviside function:


$$
g(t) =
\begin{cases}
1, & \text{if } t \ge t_0 \\
0, & \text{otherwise}
\end{cases}
$$

where t_0 is a time threshold. The multiplication of a predictor with the Heaviside function divides the predictor into time intervals for which the extended Cox model computes different hazard ratios.

Before carrying out multivariate analyses we assessed collinearity between the DESNT predictor and the other traditional indicators. To do this we calculated the variance inflation factor (VIF) for each covariate in each model. VIF varied between 1.005241 and 1.461661, suggesting a very weak correlation between the predictors.

2.5 Deriving an optimal predictor of DESNT membership

To derive an optimal predictor of DESNT membership the datasets were prepared so that they were comparable: probes were only retained if the associated gene was found in every microarray platform, only one randomly chosen probe was retained per gene, and batch effects were adjusted using the ComBat algorithm[24]. The MSKCC dataset was used as the training set and the other datasets as test sets. Gene selection was performed using a regularized generalized linear model approach (LASSO) implemented in the glmnet R package[25], starting with all genes that were significantly up- or down-regulated in DESNT in at least two of the total of five microarray datasets (1669 genes). LASSO was run 100 times and only genes that were selected in at least 25% of runs were retained. The optimal predictor was then derived using the random forest model[26] implemented in the randomForest R package[27]. Default parameters were used, except that the number of trees was set to 10001 and class size imbalance was adjusted for by down-sampling the majority class to the frequency of the minority class.

3. Results

3.1 Identification of the DESNT cancer category

Four independent transcriptome datasets (designated MSKCC[11], CancerMap, Klein[28], and Stephenson[29], Fig. 1b) obtained from prostatectomy specimens were analyzed. LPD was performed using between 3 and 8 underlying latent processes contributing to the overall expression profile, as indicated from log-likelihood plots (Fig. 1b, Supplemental Fig. 1). Following the independent decomposition of each dataset, cancers were assigned to individual processes based on their highest p_i value, yielding the results shown in Fig. 1a and Supplemental Fig. 2. Here p_i is the contribution of process i to the expression profile of an individual cancer, and the sum of p_i over all processes is 1.

Searching for relationships between the decompositions, a single process was identified that, based on correlations of gene expression levels, appeared to be common across all four datasets (Fig. 1c). To further investigate this association, for each dataset, we identified genes that were expressed at significantly lower or higher levels (P < 0.01 after correction for False Discovery Rate) in the cancers assigned to this process compared to all other cancers from the same dataset. This unveiled a shared set of 45 genes, all with lower expression (Fig. 2a, Supplemental Table 1). Many of the proteins encoded by these 45 core genes are components of the cytoskeleton or regulate its


dynamics, while others are involved in cell adhesion and ion transport (Fig. 2b). Eleven of the 45 genes were members of published prognostic signatures for prostate cancer (Fig. 2c, Supplemental Data File 1). For example, MYLK, ACTG2, and CNN1 are down-regulated in a signature for cancer metastasis[30], while lower expression of TPM2 is associated with poorer outcome as part of the Oncotype DX signature[4]. The cancers assigned to this common process are referred to as "DESNT" (Latin DEScenduNT, they descend).

3.2 Patients with DESNT cancers exhibit poor prognosis

Using linked clinical data available for the MSKCC expression dataset we found that patients with DESNT cancer exhibited poor outcome when compared to patients assigned to other processes (P = 2.65 × 10⁻⁵, log-rank test, Fig. 1d). Validation was provided in two further datasets where PSA failure data following prostatectomy were available (Fig. 1d): for both the Stephenson and CancerMap datasets, patients with DESNT cancer exhibited poor outcome (P = 4.28 × 10⁻⁵ and P = 2.98 × 10⁻⁸ respectively). The number of cancers in each group is indicated in the bottom right corner of each Kaplan-Meier plot. The number of patients with PSA failure is indicated in parentheses. In multivariate analysis, including Gleason sum, Stage and PSA, assignment as a DESNT cancer was an independent predictor of poor outcome in the Stephenson and CancerMap datasets (P = 1.83 × 10⁻⁴ and P = 3.66 × 10⁻³, Cox regression model) but not in the MSKCC dataset (P = 0.327) (Table 1, Supplemental Fig. 3). When the three datasets were combined, the independent predictive value of DESNT membership was P = 1.61 × 10⁻⁷ (Supplemental Fig. 3), compared to P = 1.00 × 10⁻⁵ for Gleason sum. Including surgical margin status in the multivariate analysis had little influence on these values, giving P = 3.63 × 10⁻⁷ for DESNT compared to P = 1.80 × 10⁻⁵ for Gleason sum. The combined multivariate model is a significant improvement over a baseline Cox proportional hazards model containing Gleason, PSA and Clinical Stage (P = 9.528 × 10⁻⁷; likelihood ratio test).
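The univariate comparisons above rest on the log-rank test. The sketch below is a minimal, standard-library Python version of the two-group test, included only to make the calculation concrete: at each failure time it compares the failures observed in one group with the number expected if both groups shared the same hazard. It is not the authors' R code, the function name `logrank_test` is hypothetical, and the follow-up times are invented rather than taken from any of the datasets.

```python
import math
from typing import Sequence, Tuple


def logrank_test(
    times_a: Sequence[float], events_a: Sequence[int],
    times_b: Sequence[float], events_b: Sequence[int],
) -> Tuple[float, float]:
    """Two-group log-rank test. Returns (chi-square statistic, p-value), 1 d.f.

    An event value of 1 marks an observed failure, 0 a censored follow-up time.
    """
    # Pool both groups as (time, event, group) records; group 0 = A, 1 = B.
    records = [(t, e, 0) for t, e in zip(times_a, events_a)]
    records += [(t, e, 1) for t, e in zip(times_b, events_b)]

    observed_a = expected_a = variance = 0.0
    for t in sorted({t for t, e, _ in records if e}):
        at_risk_a = sum(1 for u, _, g in records if u >= t and g == 0)
        at_risk = sum(1 for u, _, _ in records if u >= t)
        failures_a = sum(1 for u, e, g in records if u == t and e and g == 0)
        failures = sum(1 for u, e, _ in records if u == t and e)

        observed_a += failures_a
        expected_a += failures * at_risk_a / at_risk
        if at_risk > 1:
            frac_a = at_risk_a / at_risk
            variance += (failures * frac_a * (1.0 - frac_a)
                         * (at_risk - failures) / (at_risk - 1))

    if variance == 0.0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented follow-up times (months) and failure indicators, for illustration only.
group_a = ([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
group_b = ([6, 7, 8, 9, 10], [1, 1, 1, 1, 1])
chi2, p_value = logrank_test(*group_a, *group_b)
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

In this toy example every failure in group A precedes every failure in group B, so the test reports a small p-value; two groups with identical follow-up would give a statistic of 0.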
The poor prognosis DESNT process was also identified in the CamCap dataset[9] (Table 1, Supplemental Fig. 3 and 4), which was excluded from the above analysis because it was not independent: there is a substantial overlap with cancers included in CancerMap (Fig. 1b).

3.3 A random forest classifier for identifying DESNT cancer

We wished to develop a classifier that, unlike LPD, was not computer processing intensive and that could be applied both to a wider range of datasets and to individual cancers. 1669 genes with significantly altered expression between DESNT and non-DESNT cancers in at least two datasets were selected for analysis. A LASSO logistic regression model was used to identify genes that were the best predictors of DESNT membership in the MSKCC dataset, leading to the selection of a set of 20 genes (Supplemental Table 2), which had a one gene overlap (ACTG2) with the 45 genes with significantly lower expression in DESNT cancers. Using random forest (RF) classification, these 20 genes provided high specificity and sensitivity for predicting that individual cancers were DESNT in both the MSKCC training dataset and in three validation datasets (Supplemental Fig. 5). For the two validation datasets (Stephenson and CancerMap)


with linked PSA failure data the predicted cancer subgroup exhibited poorer clinical outcome in both univariate and multivariate analyses, in agreement with the results observed using LPD (Table 1, Fig. 3).

3.4 DESNT cancers in The Cancer Genome Atlas dataset

When RF classification was applied to RNAseq data from 333 prostate cancers described by The Cancer Genome Atlas (TCGA)[13], a patient subgroup was identified that was confirmed as DESNT based on: (i) correlations of gene expression levels with DESNT cancer groups in other datasets (Supplemental Fig. 6); (ii) demonstration of overlaps of differentially expressed genes between DESNT and non-DESNT cancers with the core down-regulated gene set (45/45 genes); and (iii) its poorer clinical outcome based on PSA failure (P = 5.4 × 10⁻⁴) compared to non-DESNT patients (Table 1, Fig. 3e).

For the TCGA dataset, we failed to find correlations between assignment as a DESNT cancer and the presence of any specific genetic alteration (P > 0.05 after correction for False Discovery Rate, χ² test, Fig. 4). Of particular note, there was no correlation with ETS-gene status (P = 0.136, χ² test, Fig. 4). A lack of correlation between DESNT cancers and ERG-gene rearrangement, determined using the fluorescence in situ hybridization break-apart assay[31], was confirmed using CancerMap samples (LPD-DESNT, P = 0.549; RF-DESNT, P = 0.2623, χ² test: DESNT cancers identified by LPD and by RF approaches are referred to respectively as LPD-DESNT and RF-DESNT). These observations are consistent with the lack of correlation between ERG status and clinical outcome[32], although different views on the relationship between ERG-gene status and clinical outcome have been expressed[33].
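The association tests above combine a χ² test per genetic alteration with a False Discovery Rate correction across alterations. The following Python sketch shows one way those two steps fit together for 2 × 2 tables (DESNT or not, alteration present or not). It uses Pearson's χ² without a continuity correction and the Benjamini-Hochberg adjustment; the source does not state which χ² variant or FDR procedure was used, so both choices are assumptions, as are the function names and the counts.

```python
import math
from typing import List, Sequence


def chi2_2x2(table: Sequence[Sequence[int]]) -> float:
    """P-value of Pearson's chi-square test on a 2x2 table (1 d.f., uncorrected)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    if 0 in rows or 0 in cols:
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, enforcing monotonicity as we go.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top  # 1-based rank among ascending p-values
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented counts: rows are DESNT / non-DESNT, columns are altered / not altered.
tables = {
    "alteration_1": [[10, 10], [10, 10]],  # no association at all
    "alteration_2": [[30, 10], [10, 30]],  # strong association
    "alteration_3": [[12, 8], [9, 11]],    # weak association
}
raw = [chi2_2x2(t) for t in tables.values()]
for name, p_adj in zip(tables, benjamini_hochberg(raw)):
    print(f"{name}: adjusted p = {p_adj:.4g}")
```

Adjusting across all alterations at once matters because testing many alterations separately would otherwise produce some small p-values by chance alone.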
Since ETS-gene alteration, found in around half of prostate cancers[13,31], is considered to be an early step in prostate cancer development[15,34], it is likely that changes involved in the generation of DESNT cancer represent a later event that is common to both ETS-positive and ETS-negative cancers.

For RF-DESNT cancers in the TCGA series many of the 45 core genes exhibited altered levels of CpG gene methylation compared to non-RF-DESNT cancers (Supplemental Table 3), suggesting a possible role in controlling gene expression. Supporting this idea, for sixteen of the 45 core genes epigenetic down-regulation in human cancer has been previously reported, including six genes in prostate cancer (CLU, DPYSL3, GSTP1, KCNMA1, SNAI2, and SVIL) (Fig. 2b, Supplemental Table 1). CpG methylation of five of the genes (FBLN1, GPX3, GSTP1, KCNMA1, TIMP3) has previously been linked to cancer aggression.

4. Discussion

Evidence from The European Randomized study of Screening for Prostate Cancer demonstrates that PSA screening can reduce mortality from prostate cancer by 21%[35]. However, a critical problem is that the progression of prostate cancer is highly heterogeneous[36,37] and PSA screening leads to the detection of up to 50% of cancers that are clinically irrelevant[38,39]: that is, cancers that would never have caused symptoms in a man's lifetime in the absence of screening. Unsupervised analyses of breast cancer datasets using hierarchical clustering previously revealed the existence of


basal, ERBB2-overexpressing and luminal cancer categories[10]. This mathematical approach has not proven successful when applied to prostate cancer microarray datasets[9,11]. However, in our study the use of LPD, an unsupervised method that takes into account the issue of cancer heterogeneity, has revealed the existence of a novel category of prostate cancer, designated DESNT, common across all datasets. The subsequent linking to clinical data revealed that DESNT cancers exhibit poor prognosis.

It was notable that membership of the DESNT cancer group was not an independent predictor of clinical outcome in the MSKCC dataset. It is possible that the difference may simply reflect statistical variation, since the size of the DESNT group in several datasets was small (MSKCC, 13%; CancerMap, 8%; Stephenson, 31%; Klein, 23%). Critically, however, when the datasets with linked clinical data were combined, DESNT membership remained an independent predictor of clinical outcome. We failed to detect systematic differences between MSKCC and the other datasets used in multivariate analyses (Supplemental Fig. 3h).

We have not, in this study, investigated the biological function and mechanisms of alterations of expression of the 45 core genes. However, gene down-regulation mediated by CpG methylation is well documented in human cancer, as is the association of CpG methylation of single genes with aggressive cancer behavior (Supplemental Table 1). The results found for DESNT cancers are consistent with these observations, but would suggest that it is the combined underexpression of multiple genes that represents a critical determinant of cancer progression and aggression. Several of the genes found to have lower expression in DESNT cancer (ACTA2, CNN1, LMOD1) encode proteins primarily expressed in smooth muscle cells or myofibroblasts, indicative of an altered tumour-stromal environment. We failed to find a correlation between stromal content and clinical outcome in the CamCap and CancerMap datasets (Fig. 2). However, this does not exclude the possibility that DESNT cancers themselves may have lower stromal content, in part explaining the lower expression of these genes.

Other under-expressed genes encode components of the actin cytoskeleton or regulate its dynamics (e.g. MLCK, MYL9, ACTN1, and TNS1). Increased malignancy may correlate with increased cell migratory behaviour, which in turn can involve deployment of particular types of cell adhesion and cytoskeletal machinery[40]. A high dependency on actomyosin contractility is recognised as a hallmark of amoeboid movement. Down-regulation of these genes in DESNT cancers would argue against its involvement. The lower expression of focal adhesion components such as integrin α5 (ITGA5), vinculin (VCL) and integrin-linked kinase (ILK) would also argue against involvement of "mesenchymal" type migration, which is dependent on these classes of genes[40]. It is thus possible that the observed alterations may support involvement of collective migration or expansive growth phenotypes[40].

Notably, we failed to find any relationship between DESNT cancers and either CNV (copy number variant) signatures (Lalonde et al. and Ross-Adams et al. in Fig. 2c) or DNA repair gene alterations (Fig. 4). Assignment of cancers within the DESNT classification


framework together with the use of standard clinical indicators (Stage, Gleason sum, PSA), CNV signatures[11], expression biomarkers such as Prolaris[3], Decipher[5], and Oncotype DX[4] identified in supervised analyses, and urine biomarkers[41], should significantly enhance the ability to identify patients whose cancers should be targeted by radical therapies, avoiding the side effects of treatment, including impotence, in men with non-aggressive disease. In future studies we are focusing on the development of both LPD and RF based tests that can be used to detect DESNT cancer in biopsy tissue in a clinical setting.

References

[1] D'Amico AV. Biochemical Outcome After Radical Prostatectomy, External Beam Radiation Therapy, or Interstitial Radiation Therapy for Clinically Localized Prostate Cancer. JAMA 1998;280:969–74. doi:10.1001/jama.280.11.969.

[2] Graham J, Kirkbride P, Cann K, Hasler E, Prettyjohns M. Prostate cancer: summary of updated NICE guidance. BMJ 2014;348:f7524–4. doi:10.1136/bmj.f7524.

[3] Cuzick J, Swanson GP, Fisher G, Brothman AR, Berney DM, Reid JE, et al. Prognostic value of an RNA expression signature derived from cell cycle proliferation genes in patients with prostate cancer: a retrospective study. Lancet Oncol 2011;12:245–55. doi:10.1016/S1470-2045(10)70295-3.

[4] Klein EA, Cooperberg MR, Magi-Galluzzi C, Simko JP, Falzarano SM, Maddala T, et al. A 17-gene assay to predict prostate cancer aggressiveness in the context of Gleason grade heterogeneity, tumor multifocality, and biopsy undersampling. Eur Urol 2014;66:550–60. doi:10.1016/j.eururo.2014.05.004.

[5] Erho N, Crisan A, Vergara IA, Mitra AP, Ghadessi M, Buerki C, et al. Discovery and validation of a prostate cancer genomic classifier that predicts early metastasis following radical prostatectomy. PLoS ONE 2013;8:e66855. doi:10.1371/journal.pone.0066855.

[6] Glinsky GV, Glinskii AB, Stephenson AJ, Hoffman RM, Gerald WL. Gene expression profiling predicts clinical outcome of prostate cancer. J Clin Invest 2004;113:913–23. doi:10.1172/JCI20032.

[7] Tomlins SA, Alshalalfa M, Davicioni E, Erho N, Yousefi K, Zhao S, et al. Characterization of 1577 primary prostate cancers reveals novel biological and clinicopathologic insights into molecular subtypes. Eur Urol 2015;68:555–67. doi:10.1016/j.eururo.2015.04.033.

[8] You S, Knudsen BS, Erho N, Alshalalfa M, Takhar M, Al-Deen Ashab H, et al. Integrated Classification of Prostate Cancer Reveals a Novel Luminal Subtype with Poor Outcome. Cancer Res 2016;76:4948–58. doi:10.1158/0008-5472.CAN-16-0902.

[9] Ross-Adams H, Lamb AD, Dunning MJ, Halim S, Lindberg J, Massie CM, et al. Integration of copy number and transcriptomics provides risk stratification in prostate cancer: A discovery and validation cohort study. EBioMedicine 2015;2:1133–44. doi:10.1016/j.ebiom.2015.07.017.


[10] Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci USA 2003;100:8418–23. doi:10.1073/pnas.0932692100.

[11] Taylor BS, Schultz N, Hieronymus H, Gopalan A, Xiao Y, Carver BS, et al. Integrative genomic profiling of human prostate cancer. Cancer Cell 2010;18:11–22. doi:10.1016/j.ccr.2010.05.026.

[12] Cooper CS, Eeles R, Wedge DC, Van Loo P, Gundem G, Alexandrov LB, et al. Analysis of the genetic phylogeny of multifocal prostate cancer identifies multiple independent clonal expansions in neoplastic and morphologically normal prostate tissue. Nat Genet 2015;47:367–72. doi:10.1038/ng.3221.

[13] Cancer Genome Atlas Research Network. The Molecular Taxonomy of Primary Prostate Cancer. Cell 2015;163:1011–25. doi:10.1016/j.cell.2015.10.025.

[14] Boutros PC, Fraser M, Harding NJ, de Borja R, Trudel D, Lalonde E, et al. Spatial genomic heterogeneity within localized, multifocal prostate cancer. Nat Genet 2015;47:736–45. doi:10.1038/ng.3315.

[15] Clark J, Attard G, Jhavar S, Flohr P, Reid A, De-Bono J, et al. Complex patterns of ETS gene alteration arise during cancer development in the human prostate. Oncogene 2008;27:1993–2003. doi:10.1038/sj.onc.1210843.

[16] Tsourlakis M-C, Stender A, Quaas A, Kluth M, Wittmer C, Haese A, et al. Heterogeneity of ERG expression in prostate cancer: a large section mapping study of entire prostatectomy specimens from 125 patients. BMC Cancer 2016;16:641. doi:10.1186/s12885-016-2674-6.

[17] Carrivick L, Rogers S, Clark J, Campbell C, Girolami M, Cooper C. Identification of prognostic signatures in breast cancer microarray data using Bayesian techniques. J R Soc Interface 2006;3:367–81. doi:10.1098/rsif.2005.0093.

[18] Rogers S, Girolami M, Campbell C, Breitling R. The latent process decomposition of cDNA microarray data sets. IEEE/ACM Trans Comput Biol Bioinform 2005;2:143–56. doi:10.1109/TCBB.2005.29.

[19] Blei DM, Ng AY, Jordan MI. Latent Dirichlet Allocation. Journal of Machine Learning Research 2003;3:993–1022.

[20] Olmos D, Brewer D, Clark J, Danila DC, Parker C, Attard G, et al. Prognostic value of blood mRNA expression signatures in castration-resistant prostate cancer: a prospective, two-stage study. Lancet Oncol 2012;13:1114–24. doi:10.1016/S1470-2045(12)70372-8.

[21] Warren AY, Whitaker HC, Haynes B, Sangan T, McDuffus L-A, Kay JD, et al. Method for sampling tissue for research which preserves pathological data in radical prostatectomy. Prostate 2013;73:194–202. doi:10.1002/pros.22556.

[22] Jhavar S, Reid A, Clark J, Kote-Jarai Z, Christmas T, Thompson A, et al. Detection of TMPRSS2-ERG translocations in human prostate cancer by expression profiling using GeneChip Human Exon 1.0 ST arrays. J Mol Diagn 2008;10:50–7. doi:10.2353/jmoldx.2008.070085.

[23] Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015;43:e47–7. doi:10.1093/nar/gkv007.


[24] Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 2007;8:118–27. doi:10.1093/biostatistics/kxj037.

[25] Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw 2010;33:1–22. doi:10.1109/TPAMI.2005.127.

[26] Breiman L. Random Forests. Machine Learning 2001;45:5–32. doi:10.1023/A:1010933404324.

[27] Liaw A, Wiener M. Classification and regression by randomForest. R News 2002.

[28] Klein EA, Yousefi K, Haddad Z, Choeurng V, Buerki C, Stephenson AJ, et al. A genomic classifier improves prediction of metastatic disease within 5 years after surgery in node-negative high-risk prostate cancer patients managed by radical prostatectomy without adjuvant therapy. Eur Urol 2015;67:778–86. doi:10.1016/j.eururo.2014.10.036.

[29] Stephenson AJ, Smith A, Kattan MW, Satagopan J, Reuter VE, Scardino PT, et al. Integration of gene expression profiling and clinical variables to predict prostate carcinoma recurrence after radical prostatectomy. Cancer 2005;104:290–8. doi:10.1002/cncr.21157.

[30] Ramaswamy S, Ross KN, Lander ES, Golub TR. A molecular signature of metastasis in primary solid tumors. Nat Genet 2003;33:49–54. doi:10.1038/ng1060.

[31] Tomlins SA, Rhodes DR, Perner S, Dhanasekaran SM, Mehra R, Sun X-W, et al. Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science 2005;310:644–8. doi:10.1126/science.1117679.

[32] Weischenfeldt J, Simon R, Feuerbach L, Schlangen K, Weichenhan D, Minner S, et al. Integrative genomic analyses reveal an androgen-driven somatic alteration landscape in early-onset prostate cancer. Cancer Cell 2013;23:159–70. doi:10.1016/j.ccr.2013.01.002.

[33] Clark JP, Cooper CS. ETS gene fusions in prostate cancer. Nat Rev Urol 2009;6:429–39. doi:10.1038/nrurol.2009.127.

[34] Park K, Dalton JT, Narayanan R, Barbieri CE, Hancock ML, Bostwick DG, et al. TMPRSS2:ERG gene fusion predicts subsequent detection of prostate cancer in patients with high-grade prostatic intraepithelial neoplasia. J Clin Oncol 2014;32:206–11. doi:10.1200/JCO.2013.49.8386.

[35] Schröder FH, Hugosson J, Roobol MJ, Tammela TLJ, Zappa M, Nelen V, et al. Screening and prostate cancer mortality: results of the European Randomised Study of Screening for Prostate Cancer (ERSPC) at 13 years of follow-up. Lancet 2014;384:2027–35. doi:10.1016/S0140-6736(14)60525-0.

[36] D'Amico AV. Cancer-Specific Mortality After Surgery or Radiation for Patients With Clinically Localized Prostate Cancer Managed During the Prostate-Specific Antigen Era. Journal of Clinical Oncology 2003;21:2163–72. doi:10.1200/JCO.2003.01.075.

[37] Buyyounouski MK, Pickles T, Kestin LL, Allison R, Williams SG. Validating the interval to biochemical failure for the identification of potentially lethal prostate cancer. J Clin Oncol 2012;30:1857–63. doi:10.1200/JCO.2011.35.1924.

[38] Draisma G, Etzioni R, Tsodikov A, Mariotto A, Wever E, Gulati R, et al. Lead time and overdiagnosis in prostate-specific antigen screening: importance of methods and context. J Natl Cancer Inst 2009;101:374–83. doi:10.1093/jnci/djp001.

[39] Etzioni R, Gulati R, Mallinger L, Mandelblatt J. Influence of Study Features and Methods on Overdiagnosis Estimates in Breast and Prostate Cancer Screening. Annals of Internal Medicine 2013;158:831–8. doi:10.7326/0003-4819-158-11-201306040-00008.

[40] Friedl P, Locker J, Sahai E, Segall JE. Classifying collective cancer cell invasion. Nat Cell Biol 2012;14:777–83. doi:10.1038/ncb2548.

[41] Van Neste L, Hendriks RJ, Dijkstra S, Trooskens G, Cornel EB, Jannink SA, et al. Detection of High-grade Prostate Cancer Using a Urinary Molecular Biomarker-Based Risk Score. Eur Urol 2016;70:740–8. doi:10.1016/j.eururo.2016.04.012.

Acknowledgement of Support: This work was funded by the Bob Champion Cancer Trust, The Masonic Charitable Foundation successor to The Grand Charity, The King Family, and The University of East Anglia. We acknowledge support from Movember, from Prostate Cancer UK, Callum Barton, The Big C Cancer Charity, and from The Andy Ripley Memorial Fund. The research presented in this paper was carried out on the High Performance Computing Cluster supported by the Research and Specialist Computing Support service at the University of East Anglia. Cancer Research UK Grant 10047 funded the generation of the prostate CancerMap expression microarray dataset. We would like to acknowledge the support of the National Institute for Health Research (NIHR) which funds the Cambridge Bio-medical Research Centre, Cambridge UK. The sponsors did not participate in the design and conduct of the study; data collection, management, analysis, and interpretation; and manuscript preparation, review, and approval.

FIGURES AND TABLES

Figure 1. Latent Process Decomposition (LPD), gene correlations and clinical outcome. a, LPD analysis of Affymetrix expression data from the MSKCC datasets divided the samples into eight processes, each represented here by a bar chart. Samples are represented in all eight processes and height of each bar corresponds to the proportion (pi) of the signature that can be assigned to each LPD process. Samples are assigned to the LPD group in which they exhibit the highest value of pi. LPD was performed using the 500 gene probes with the greatest variation in expression between samples in the MSKCC dataset. The process containing DESNT cancers is indicated. b, List of datasets used in LPD analysis. The unique number of primary cancer and normal specimens used in LPD are indicated. FF, fresh frozen specimen; FFPE, formalin-fixed paraffin embedded specimen. The CancerMap and CamCap were not independent having 40 cancers in common. Clinical and molecular details for the CancerMap dataset are given in Supplemental Table 4 and Supplemental Data File 2. Clinical details for samples from other datasets used in this study can be found in Supplemental Data File 3. c, Correlations of average levels of gene expression between cancers designated as DESNT. All six comparisons for the MSKCC, CancerMap, Stephenson and Klein datasets are shown. The expression levels of each gene have been normalised across all samples to mean 0 and standard deviation 1. d, Kaplan-Meier PSA failure plots for the MSKCC, CancerMap and Stephenson datasets.

[Figure 1c. Scatter plots of normalised average gene expression in DESNT cancers, one panel per pair of datasets, both axes running from −2 to 2. Pearson's correlations: MSKCC vs CancerMap, 0.658; MSKCC vs Stephenson, 0.799; MSKCC vs Klein, 0.753; Stephenson vs CancerMap, 0.782; Klein vs CancerMap, 0.595; Klein vs Stephenson, 0.726.]

Figure 1b. Datasets used in LPD analysis:

| Dataset | Primary | Normal | Type | LPD proc. | Platform | Citation |
|---|---|---|---|---|---|---|
| MSKCC | 131 | 29 | FF | 8 | Affymetrix Exon 1.0 ST | Taylor et al. 2010 |
| CancerMap | 137 | 17 | FF | 8 | Affymetrix Exon 1.0 ST | - |
| Stephenson | 78 | 11 | FF | 3 | Affymetrix U133A | Stephenson et al. 2005 |
| Klein | 182 | 0 | FFPE | 5 | Affymetrix Exon 1.0 ST | Klein et al. 2015 |
| CamCap | 147 | 73 | FF | 6 | Illumina HT12 v4 BeadChip | Ross-Adams et al. 2015 |

[Figure 1a. Bar charts for LPD processes 1 to 8, each with a vertical axis from 0.0 to 1.0; one process is labelled DESNT. Legend: Gleason 7, ≤6, ≥8; normal tissue; NA.]

[Figure 1d. Kaplan-Meier plots of BCR-free survival against number of months for DESNT and non-DESNT cancers. Group sizes are given as printed on each plot.
MSKCC: log-rank P-value 2.65×10⁻⁵; DESNT 17 (9), non-DESNT 114 (18). Numbers at risk at 0, 25, 50, 75, 100 and 125 months: DESNT 17, 11, 4, 1, 0, 0; non-DESNT 114, 92, 53, 22, 9, 3.
Stephenson: log-rank P-value 4.28×10⁻⁵; DESNT 24 (19), non-DESNT 54 (19). Numbers at risk at 0, 25, 50, 75 and 100 months: DESNT 24, 12, 7, 0, 0; non-DESNT 54, 43, 39, 16, 2.
CancerMap: log-rank P-value 2.98×10⁻⁸; DESNT 10 (8), non-DESNT 125 (25). Numbers at risk at 0, 25, 50, 75, 100 and 125 months: DESNT 10, 4, 1, 0, 0, 0; non-DESNT 125, 107, 84, 17, 4, 1.]
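The assignment and normalisation rules stated in the Figure 1 legend can be sketched in a few lines of Python. This is a minimal illustration on invented numbers, not the authors' code: the function names, the toy data and the choice of the population standard deviation are all assumptions, since the legend does not specify an implementation.

```python
from statistics import mean, pstdev


def assign_process(proportions):
    """Return the 1-based LPD process with the largest proportion (pi in the legend)."""
    best = max(range(len(proportions)), key=lambda i: proportions[i])
    return best + 1


def zscore(values):
    """Normalise one gene across all samples to mean 0 and SD 1.

    Assumption: population SD; the legend does not say which estimator was used.
    """
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def desnt_gene_means(expression, is_desnt):
    """Average z-scored expression of each gene over the DESNT samples only."""
    means = {}
    for gene, values in expression.items():
        z = zscore(values)
        means[gene] = mean(v for v, flag in zip(z, is_desnt) if flag)
    return means


# Hypothetical proportions for one sample across eight LPD processes.
sample_pi = [0.02, 0.05, 0.61, 0.10, 0.07, 0.05, 0.06, 0.04]
process = assign_process(sample_pi)

# Hypothetical expression values: three placeholder genes, four samples per dataset.
expr_a = {"g1": [1.0, 2.0, 3.0, 6.0], "g2": [4.0, 4.0, 5.0, 1.0], "g3": [0.5, 0.1, 0.3, 0.9]}
expr_b = {"g1": [2.0, 1.0, 7.0, 2.5], "g2": [3.0, 0.5, 3.5, 3.2], "g3": [0.2, 0.8, 0.4, 0.3]}
means_a = desnt_gene_means(expr_a, [False, False, False, True])
means_b = desnt_gene_means(expr_b, [False, True, True, False])

# Correlate the per-gene DESNT averages of the two datasets over shared genes.
genes = sorted(set(means_a) & set(means_b))
r = pearson([means_a[g] for g in genes], [means_b[g] for g in genes])
```

Each gene is normalised within its own dataset before averaging, which matches the legend's statement that expression levels were normalised across all samples; correlating the per-gene averages then compares the DESNT profile between datasets rather than individual samples.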

Figure 2. Genes commonly down-regulated in DESNT poor prognosis prostate cancer. a, Number of genes with significantly altered expression in DESNT cancers compared to non-DESNT cancers (P<0.01 after correction for False Discovery Rate). 45 genes had lower expression in DESNT cancers in all four expression microarray datasets, based on a stringency requirement of being down-regulated in at least 80 of 100 independent LPD runs. b, List of the 45 genes according to biological grouping. Previous published evidence is represented as superscripts and the supporting references are provided in Supplemental Table 1. Encoded protein functions are shown in Supplemental Table 5. Although some of the 45 genes are preferentially expressed in stromal tissue we found no correlation between stromal content and clinical outcome in both the CancerMap and CamCap patient series, where data on cellular composition were available. When patients were stratified into two groups (above and below median stromal content) Kaplan-Meier plots failed to show outcome difference for both the CancerMap (Log-rank test, p=0.159) and CamCap (p=0.261) patient series. c, Relationship between the genes in published poor prognosis signatures for prostate cancer and the DESNT classification for human prostate cancer, represented as a circos plot. Links to the 45 commonly down-regulated genes are shown in brown. References quoted in the circos plot are listed in the Supplemental Information and detailed gene relationships are shown in Supplemental Data File 1.
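The stringency requirement in panel a (down-regulated in at least 80 of 100 independent LPD runs, in every dataset) amounts to a threshold followed by a set intersection. The sketch below illustrates that logic only; the function name, the gene labels, the dataset labels and the counts are hypothetical, and the legend does not name the four datasets.

```python
def consistently_down(run_counts, min_runs=80):
    """Genes down-regulated in at least `min_runs` LPD runs in every dataset.

    run_counts maps a dataset name to {gene: number of runs (out of 100) in
    which the gene was significantly down-regulated in DESNT cancers}.
    """
    per_dataset = [
        {gene for gene, n in counts.items() if n >= min_runs}
        for counts in run_counts.values()
    ]
    return set.intersection(*per_dataset) if per_dataset else set()


# Hypothetical counts for three placeholder genes in four placeholder datasets.
toy_counts = {
    "dataset_1": {"GENE_A": 97, "GENE_B": 85, "GENE_C": 40},
    "dataset_2": {"GENE_A": 91, "GENE_B": 62, "GENE_C": 88},
    "dataset_3": {"GENE_A": 80, "GENE_B": 99, "GENE_C": 95},
    "dataset_4": {"GENE_A": 100, "GENE_B": 90, "GENE_C": 81},
}
selected = consistently_down(toy_counts)
```

In this toy input only GENE_A passes, because GENE_B falls below the threshold in one dataset and GENE_C in another; a gene has to clear the threshold everywhere to be retained.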

Figure 3. Analysis of outcome for DESNT cancers identified by RF classification. (a-e) Kaplan-Meier PSA failure plots for the MSKCC (a), CancerMap (b), Stephenson (c), CamCap (d) and TCGA (e) datasets. For each dataset the cancers assigned to DESNT using the 20 gene RF classifier are compared to the remaining cancers. The number of cancers in each group is indicated in the bottom right corner of each plot. The number of cancers with PSA failure is indicated in parentheses. Multivariate analyses were performed as described in the Methods for the MSKCC (f), CancerMap (g), Stephenson (h), CamCap (i) and TCGA (j) datasets. Pathological Stage covariates for MSKCC and Stephenson datasets did not meet the proportional hazards assumptions of the Cox model and have been modelled as time-dependent variables, as described in the Methods.

[Figure 3a-e. Kaplan-Meier plots of BCR-free survival for RF-assigned DESNT and non-DESNT cancers; group sizes are followed by the number of cancers with PSA failure in parentheses.
a, MSKCC: log-rank P-value 1.86×10⁻³; non-DESNT 108 (18), DESNT 23 (9). Numbers at risk at 0, 25, 50, 75, 100 and 125 months: non-DESNT 108, 88, 51, 22, 9, 3; DESNT 23, 15, 6, 1, 0, 0.
b, CancerMap: log-rank P-value 4.8×10⁻⁴; non-DESNT 105 (19), DESNT 30 (14). Numbers at risk at 0, 25, 50, 75, 100 and 125 months: non-DESNT 105, 92, 71, 13, 2, 1; DESNT 30, 19, 14, 4, 2, 0.
c, Stephenson: log-rank P-value 1.73×10⁻⁴; non-DESNT 53 (19), DESNT 25 (19). Numbers at risk at 0, 25, 50, 75 and 100 months: non-DESNT 53, 42, 38, 15, 2; DESNT 25, 13, 8, 1, 0.
d, CamCap: log-rank P-value 1.61×10⁻⁵; non-DESNT 155 (41), DESNT 50 (23). Numbers at risk at 0, 25, 50, 75 and 100 months: non-DESNT 155, 110, 67, 29, 2; DESNT 50, 17, 8, 1, 0.
e, TCGA: log-rank P-value 5.4×10⁻⁴; non-DESNT 252 (21), DESNT 81 (19). Numbers at risk at 0, 1000, 2000, 3000 and 4000 days: non-DESNT 252, 54, 9, 4, 0; DESNT 81, 21, 4, 1, 1.]

[Figure 3f-j. Forest plots of hazard ratios (axis from 0.1 to 20) with P-values for each covariate in the multivariate analyses.
f, MSKCC: Non-DESNT/DESNT, 6.05×10⁻¹; Gleason ≤/> 7, 2.42×10⁻⁴; PSA ≤/> 10, 6.32×10⁻¹; Stage < 35 weeks (T2/T3-T4), 3.30×10⁻³; Stage ≥ 35 weeks (T2/T3-T4), 5.48×10⁻¹.
g, CancerMap: Non-DESNT/DESNT, 1.45×10⁻²; Gleason ≤/> 7, 5.63×10⁻⁴; Stage T1-T2/T3-T4, 1.18×10⁻¹; PSA ≤/> 10, 6.60×10⁻¹.
h, Stephenson: Non-DESNT/DESNT, 4.56×10⁻⁴; Stage T1-T2/T3-T4, 2.15×10⁻¹; Gleason ≤/> 7, 2.34×10⁻²; PSA ≤/> 10, 7.94×10⁻².
i, CamCap: Non-DESNT/DESNT, 1.31×10⁻⁴; Stage T1-T2/T3-T4, 1.34×10⁻²; Gleason ≤/> 7, 6.03×10⁻⁷; PSA ≤/> 10, 1.17×10⁻².
j, TCGA: Non-DESNT/DESNT, 2.59×10⁻²; Gleason ≤/> 3+4, 4.04×10⁻¹; Path Stage T1-T2/T3-T4, 1.43×10⁻².]
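The log-rank P-values reported for the Kaplan-Meier panels compare the two groups' event histories over time. As a rough illustration of how such a test works, the following self-contained sketch implements the standard two-group log-rank statistic on invented follow-up data. It is not the authors' analysis code; the function name and the toy times are assumptions.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value).

    times_*  : follow-up times; events_* : 1 if PSA failure observed, 0 if censored.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_a = expected_a = variance = 0.0
    # Walk through the distinct times at which at least one event occurred.
    for t in sorted({u for u, e, _ in data if e}):
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)    # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)           # events at t, both groups
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A event count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Invented example: group A fails early, group B fails late, no censoring.
chi2, p_value = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
```

The test accumulates, at every event time, the difference between the failures observed in one group and the number expected if both groups shared the same hazard, which is why the numbers at risk printed under each plot matter for interpreting the P-value.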

Figure 4. Comparison of RF-DESNT and non-RF-DESNT cancers in The Cancer Genome Atlas dataset. A 20 gene random forest (RF) classifier was used to identify DESNT cancers (designated RF-DESNT cancers). The types of genetic alteration are shown for each gene (mutations, fusions, deletions, and overexpression). Clinical parameters including biochemical recurrence (BCR) are represented at the bottom together with groups for iCluster, methylation, somatic copy number alteration (SCNA) and mRNA [13]. When mutations and homozygous deletions for each gene were combined RF-DESNT cancers contained an excess of genetic alterations in BRCA2 (P=0.021, χ² test) and TP53 (P=0.0038), but after correcting for multiple testing these differences were not significant (P>0.05).
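The last sentence of the legend combines a χ² test per gene with a multiple-testing correction. The sketch below shows one way such a calculation could look. It rests on two assumptions: the χ² test is applied to a 2×2 table without continuity correction, and the correction is Benjamini-Hochberg. The legend names neither the correction method nor the number of genes tested, and all counts below are invented.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square test on the 2x2 table [[a, b], [c, d]].

    Rows: RF-DESNT / non-RF-DESNT; columns: altered / not altered.
    Assumption: no continuity correction. Returns (statistic, P-value), 1 df.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, erfc(sqrt(stat / 2))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest raw P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented counts for one gene: 20 of 30 altered versus 10 of 30 altered.
stat, p_raw = chi2_2x2(20, 10, 10, 20)

# Invented raw P-values for three genes, adjusted together.
p_adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

Because the adjusted value of each gene depends on how many genes were tested together, the corrected P-values in the legend cannot be reproduced from the two raw values alone.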

Table 1: Poor clinical outcome of patients with DESNT cancer

For each dataset comparisons were made between PSA failures reported for DESNT and non-DESNT cancers. LPD, Latent Process Decomposition; RF, Random Forest. For LPD the log-rank P-values represent the modal LPD run selected from the 100 independent LPD runs as described in the Methods. For multivariate analyses Gleason sum, PSA at diagnosis and Pathological Stage are included for all datasets with the exception of the TCGA dataset where only Gleason sum and Clinical Stage data were available. The full analyses are presented in Fig. 3 and Supplemental Fig. 3.

Latent Process Decomposition

| Dataset | Univariate p-value | Multivariate p-value |
|---|---|---|
| MSKCC | 2.65×10⁻⁵ | 3.27×10⁻¹ |
| CancerMap | 2.98×10⁻⁸ | 3.66×10⁻³ |
| Stephenson | 4.28×10⁻⁵ | 1.83×10⁻⁴ |
| CamCap | 1.22×10⁻³ | 2.90×10⁻² |

Random Forest

| Dataset | Univariate p-value | Multivariate p-value |
|---|---|---|
| MSKCC | 1.85×10⁻³ | 6.05×10⁻¹ |
| CancerMap | 4.80×10⁻⁴ | 1.45×10⁻² |
| Stephenson | 1.75×10⁻⁴ | 4.56×10⁻⁴ |
| CamCap | 1.61×10⁻⁵ | 1.31×10⁻⁴ |
| TCGA | 5.41×10⁻⁴ | 2.59×10⁻² |

