Progress with Rare Diseases in the Genomics England 100,000 Genomes Project
Dr Anna C Need, GeCIP Team Lead, Genomics England
The 100,000 Genomes Project
• Announced by the former Prime Minister in December 2012: an Olympic legacy
• Genomics England announced by the Secretary of State for Health in a speech during the NHS 65th Anniversary Celebrations, July 2013
• Opening of new Sequencing Centre in 2016
• CMO’s Generation Genome and the Life Sciences report in 2017
24 May 2018
The infrastructure for delivery
• Nationwide network of 13 NHS Genomic Medicine Centres, each serving a population of ~3-5 million
• Includes over 85 hospitals across England
• Integrated with genetic laboratories, genetic services and local pathology laboratories
• Scotland, Northern Ireland and Wales are now also part of the Project
How the 100,000 Genomes Project works
Diagram labels: Discovery Forum, Industry Users
What information can be fed back?
• Information about a patient’s main condition
• Information about additional ‘serious and actionable’ conditions (optional)
• Carrier status for non-affected parents of children with rare disease (optional)
Image courtesy of Health Education England
Primary clinical data collection
• Core clinical dataset:
– Disease status (bespoke)
– Pedigree data (Panogram)
– Human Phenotype Ontology (HPO)
– ICD-10, SNOMED CT, OMIM
• Clinical test data where relevant and not captured by HPO
• Using established standards wherever possible
Additional data emulating test request in clinical practice
• Gene panels
• Penetrance settings
• Other bespoke elements, e.g. >1 analysis for family
PanelApp: https://panelapp.genomicsengland.co.uk/
[Diagram: semi-automated interpretation pipeline. Labels: Patient/family; Phenotypes & Pedigree; DNA; Genome sequence; Annotated VCFs; Tiered variants; Gene panel variant filtering; Annotation companies; Review; Gene panels; Clinical reporting; GeCIP(s); Validation; Outcomes; Decision support tool; Report QA; Panels + settings; Workflow Manager; Data Distribution Framework; Panel Assigner; PanelApp]
Interpretation at the hospitals
1. ‘Tiering’
– Automated, focused initial analysis set up prior to interpretation
– Aims to mirror standard diagnostic analysis
– ‘Tier 1 or 2’: variants in 0-5 genes within prescribed panel(s) (median = 1.2)
– Non-penetrance pipeline can be run
2. Broader analysis
– Decision support software allows hospitals further, bespoke analysis
– ‘Tier 3’: standard GeL pipeline but not restricted to gene panels; 20 to 100s of variants (median = 285)
– Other tools, dependent on which CIP system
3. Lab-clinical team curate variants and record outcomes
– Curations (including ACMG classifications) saved to central knowledgebase
– Record clinical impacts of diagnostic result
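The panel-based split between tiers described above can be sketched in code. This is a minimal illustration only, assuming each variant has already been through the standard pipeline filters. The `Variant` class, the `passes_filters` flag and `assign_tier` are hypothetical names, not part of the GeL pipeline, and the distinction between Tier 1 and Tier 2 is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str             # gene symbol the variant falls in
    passes_filters: bool  # hypothetical flag: survived the standard pipeline filters


def assign_tier(variant, panel_genes):
    """Return 'Tier 1/2' for filtered variants in a panel gene, 'Tier 3' otherwise.

    Variants that fail the standard filters are untiered (None).
    """
    if not variant.passes_filters:
        return None
    return "Tier 1/2" if variant.gene in panel_genes else "Tier 3"


# Illustrative panel and variants (made up for this example)
panel = {"HTT", "ATXN7"}
variants = [
    Variant("HTT", True),     # in panel, passes filters
    Variant("BRCA2", True),   # outside panel, passes filters
    Variant("ATXN7", False),  # in panel, fails filters
]
tiers = [assign_tier(v, panel) for v in variants]
```

Restricting Tier 1/2 to panel genes is what keeps the initial review list short, while Tier 3 keeps the remaining filtered variants available for the broader analysis.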
Progress to date (figures as at 09/05/2018)
Samples
• 85,898 samples collected from NHS GMCs (component figures on slide: 66,447 and 19,451)
Genomes
• 60,679 genomes sequenced (component figures on slide: 49,272 and 11,407)
• 5,677 genomes since last month
Analysis and Reports
• Reports for 11,883 families sent to NHS GMCs, equivalent to 24,596 genomes
• 20-25% actionable findings
Recruitment
• Average ~500 participants/week
• Rare disease recruitment expected to end 30th September 2018
• Mean family size = 2.3
[Figure: Recruitment patterns (axis: number of families, 0 to 5,000)]
[Figures: Recruitment, West Midlands (two slides)]
Diagnostic yield in different disease areas
                            Singletons           Trios                All
Disease                     Families  Yield (%)  Families  Yield (%)  Families  Yield (%)
Intellectual disability     8         25         38        39.5       63        36.5
Rod-cone dystrophy          9         11.1       32        53.1       52        42.3
Renal tract calcification   33        6.1        1         0          38        5.3
Non-CF bronchiectasis       21        4.8        7         0          32        3.1
CAKUT                       25        8          2         0          29        6.9
Multiple endocrine tumours  21        0          0         NA         22        0
Severe multi-system atopy   4         0          12        0          21        0
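As a worked check on the table, a yield percentage and a family count together imply a number of diagnosed families. The sketch below uses two rows copied from the table; the `implied_diagnoses` helper is a hypothetical name, and rounding to the nearest whole family is an assumption about how the percentages were derived.

```python
# (number of families, diagnostic yield in %) copied from the table above
yields = {
    ("Intellectual disability", "singletons"): (8, 25.0),
    ("Intellectual disability", "trios"): (38, 39.5),
    ("Intellectual disability", "all"): (63, 36.5),
    ("Rod-cone dystrophy", "singletons"): (9, 11.1),
    ("Rod-cone dystrophy", "trios"): (32, 53.1),
    ("Rod-cone dystrophy", "all"): (52, 42.3),
}


def implied_diagnoses(families, yield_pct):
    """Number of diagnosed families implied by a yield, rounded to a whole family."""
    return round(families * yield_pct / 100)


implied = {key: implied_diagnoses(n, pct) for key, (n, pct) in yields.items()}

for (disease, structure), count in implied.items():
    print(f"{disease} ({structure}): ~{count} diagnosed families")
```

Note that the "All" column covers more families than singletons and trios combined, so it cannot be reproduced by adding those two columns.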
Impact of family structure on numbers of tiered variants
[Figure: numbers of tiered variants by family structure (1, 2, 3 or 4 members), shown separately for autosomal recessive and autosomal dominant]
Improving annotation of candidates
Exomiser: Prioritisation
• Recovers 65% of the tier 3 diagnoses
• Recovers 57% of the untiered diagnoses
Tiering: Classification (filtering)
• >90% precision for tier 1 variants (some ascertainment bias)
• >80% recall for tiered variants (1+2+3)
[Figure: recall/sensitivity and precision (scale 0 to 1) for three variant sets: Tier 1+2; Exomiser top 5; Tier 1+2 AND Exomiser top 5]
Variants highlighted by tiering and Exomiser have >90% sensitivity and various levels of specificity
Chart annotations: ~0.2 variants/case; ~1.2 variants/case; ~285 variants/case; ~7 variants per case
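For readers less familiar with the metrics quoted above, precision and recall can be computed from a set of flagged variants and a set of confirmed diagnostic variants. The sketch below is illustrative only: the variant identifiers and set contents are made up, and `precision_recall` is a hypothetical helper, not a GeL or Exomiser function.

```python
def precision_recall(flagged, causal):
    """Precision = true positives / flagged; recall = true positives / causal."""
    flagged, causal = set(flagged), set(causal)
    true_positives = len(flagged & causal)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(causal) if causal else 0.0
    return precision, recall


# Made-up example: variants flagged by each method, and the confirmed diagnoses
tier12 = {"v1", "v2", "v3"}
exomiser_top5 = {"v2", "v3", "v4", "v5"}
causal = {"v2", "v6"}

p_tier, r_tier = precision_recall(tier12, causal)
p_exo, r_exo = precision_recall(exomiser_top5, causal)
# Requiring both methods to agree corresponds to a set intersection
p_both, r_both = precision_recall(tier12 & exomiser_top5, causal)
```

In this toy example the intersection shortens the list to review, raising precision, while recall can never exceed that of either method alone.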
A pipeline for diagnostics
Validation, QA and accreditation
• Major focus is the validation, QA and accreditation to ready our platforms for use as part of the NHS England Genomic Medicine Service from October 2018
• This involves analysis of samples already held as part of the programme
• Plus sequencing of additional positive control samples
• Existing pipeline components
• New components
– STRs
– CNVs and SVs
– Low-level mtDNA variants
[Figure: ATXN7 (SCA7) comparison of EH estimates and experimental estimates of repeat size (repeat units; axis 10 to 30) for 13 samples from NHNN: LP2000254-DNA_D06, LP2000261-DNA_B10, LP2000266-DNA_G04, LP2000268-DNA_E01, LP2000269-DNA_A04, LP2000269-DNA_F06, LP2000711-DNA_G11, LP2000860-DNA_F08, LP2000860-DNA_F11, LP2000860-DNA_G03, LP2000860-DNA_G06, LP2000861-DNA_A03, LP2000861-DNA_H06]
Diagnostic STR results
Disease             Gene     Expansions consistent with phenotype  Status
Fragile-X syndrome  FMR1     2                                     1 validated; 1 in progress
ALS/MND             C9orf72  1                                     Validated
Huntington Disease  HTT      3                                     2 validated; 1 in progress
SCA6                CACNA1A  1                                     Validated
SCA12               PPP2R2B  1                                     Validated
DRPLA               ATN1     1                                     In progress
Kennedy disease     AR       1                                     In progress
18 diagnostic loci examined; first ~5,000 families
• Collection of up to 180 positive controls in process across a range of allele sizes
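Validating STR calls against positive controls amounts to comparing a sequencing-based repeat-size estimate with an experimentally measured one. The sketch below shows one simple way to score concordance. The sample names, repeat sizes and the tolerance of one repeat unit are all assumptions for illustration, not values from the project.

```python
def concordant(eh_estimate, experimental, tolerance=1):
    """True if the two repeat-size estimates differ by at most `tolerance` repeat units."""
    return abs(eh_estimate - experimental) <= tolerance


# Hypothetical (EH estimate, experimental estimate) pairs in repeat units
samples = {
    "sample_A": (10, 10),
    "sample_B": (12, 13),
    "sample_C": (25, 30),
}

results = {name: concordant(eh, exp) for name, (eh, exp) in samples.items()}
concordance_rate = sum(results.values()) / len(results)

for name, ok in results.items():
    print(f"{name}: {'concordant' if ok else 'discordant'}")
```

A fixed tolerance is the simplest choice; a real validation might instead scale the allowed difference with allele size.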
Complementary Clinical Datasets
Life course data: secondary sources
NHS Digital
• Hospital Episodes
• ONS death details
• Diagnostic Imaging
• Patient recorded outcomes
• Mental health & intellectual disability
Public Health England
• Cancer registry & datasets (COSD, SACT, RTDS, DID)
• Other disease registries
• Clinical audit
• Screening programmes
GP data
• Prescribing/dispensing
• Reports/letters
• Notes (free text)
GeL genomic results
• Interpretation
• Validation
• Clinical application
• Germline
• Somatic
• Exit questionnaire
GMC clinical data
• Interpretation data: RD: HPO terms, Pedigree; Ca: Diagnosis & staging
• Comprehensive clinical data: data models for key data, including lab test results; EHR data dump; Treatments & Investigations
GMC registration
• Demographics
• Consent status
• Additional findings
• Registration
• Sample management
NHSD success: annual agreement, received quarterly, matched 98.2% of participants
April HES delivery: 2.3m episodes on 31,781 participants, increases 400K/quarter
Death data received on 430 participants; other datasets arriving now
Font colour key: Receiving data; Data requested; Planned source
HES data: Outpatient episodes; A&E; Inpatient; Critical care
Legend: cancer treatment; non-cancer treatment
3rd Main Programme data release: April 2018
Genomes
• 42,700 genomes
GMC clinical data
• 61,500 participants
• 380+ data fields
Secondary data
• Hospital Episode Statistics (HES)
• Diagnostic Imaging Dataset (DID)
• Patient Reported Outcome Measures (PROMs)
• Mental Health Services Data Set (MHSDS)
• Office for National Statistics (ONS): mortality data and cancer flagging
Tiering data
• Tier 1, 2 and 3 variants from interpretation pipeline
• Facilitate GeCIP interpretation of Project cases
Headline tables
• Key information from different LabKey tables, merged and filterable
• Merged with QC data
• Will facilitate cohort-building and project feasibility assessment
The Research Environment at a glance
Data and documentation
• Genomes stored in by-date folders on Isilon share
• Clinical data stored in LabKey
• Confluence contains: data release notes, user guides, workaround instructions
Tools and analysis
• Virtual desktop interface provides GUI and security
• Terminal allows command line querying of the data
• R and RStudio allow statistical analysis of the data
• Firefox browser allows access to whitelisted sites
• Access to modules and the submission node to run large scale analysis
Collaboration and data flow
• Domain-specific and shared storage for files
• Social media platform for communication
• Research registry, to promote collaboration and enforce publication moratorium
Going Forward
The Research Environment
Participant ID  Sex  Ethnic Group            Participant type  Germline genome build 37  Germline genome build 38  Tumour genome build 37  Tumour genome build 38  Tiered Data  Disorder
0000001         M    Black or Black British  Proband           Passed QC                 NA                        NA                      NA                      1            Intellectual disability
0000002         M    Black or Black British  Mother            Passed QC                 NA                        NA                      NA                      NA           NA
0000003         F    Black or Black British  Father            Passed QC                 NA                        NA                      NA                      NA           NA
0000004         F    White British           Proband           NA                        Passed QC                 NA                      Passed QC               NA           Ductal
0000005         M    Other                   Proband           NA                        Passed QC                 NA                      Passed QC               NA           Endometrioid adenocarcinoma
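A merged, filterable table like the one above lends itself to simple cohort-building queries. The sketch below re-enters the five example rows as Python dictionaries and filters them in memory. The field names and the `build_cohort` helper are hypothetical and do not reflect the LabKey schema or API; "NA" cells are represented as `None`.

```python
# Example rows transcribed from the table above (only the fields used here)
rows = [
    {"participant_id": "0000001", "participant_type": "Proband",
     "germline_b37": "Passed QC", "germline_b38": None,
     "disorder": "Intellectual disability"},
    {"participant_id": "0000002", "participant_type": "Mother",
     "germline_b37": "Passed QC", "germline_b38": None, "disorder": None},
    {"participant_id": "0000003", "participant_type": "Father",
     "germline_b37": "Passed QC", "germline_b38": None, "disorder": None},
    {"participant_id": "0000004", "participant_type": "Proband",
     "germline_b37": None, "germline_b38": "Passed QC", "disorder": "Ductal"},
    {"participant_id": "0000005", "participant_type": "Proband",
     "germline_b37": None, "germline_b38": "Passed QC",
     "disorder": "Endometrioid adenocarcinoma"},
]


def build_cohort(rows, participant_type, build_column):
    """Return IDs of participants of the given type whose genome passed QC on a build."""
    return [
        row["participant_id"]
        for row in rows
        if row["participant_type"] == participant_type
        and row[build_column] == "Passed QC"
    ]


probands_b37 = build_cohort(rows, "Proband", "germline_b37")
probands_b38 = build_cohort(rows, "Proband", "germline_b38")
```

Counting the IDs returned for a given disorder and build is the kind of quick feasibility check the headline tables are meant to support.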
The Discovery Forum: A driver of translational research
[Diagram labels: Research Environment; Genomic dataset; Genomics England Stakeholders; Academic Stakeholders; Health service Stakeholders; Industry Stakeholders; Business value; Precision development]
Discovery Forum
• Exploring the business value of genomic medicine data.
• Connecting industry stakeholders to the Genomics England community.
• Providing a gateway to our Research Environment and dataset.
• Leading to discovery and development of precision methods, diagnostics, and therapeutics.
Thank you!
Stay in touch
Follow ‘Genomics England’
www.genomicsengland.co.uk
@genomicsengland #genomes100k
Like the ‘Genomics England’ page