Individual Variant Interpretation in a Diagnostic Setting
Neutral variant / Unknown clinical significance / Pathogenic mutation
Nienke van der Stoep, Dept. Clinical Genetics, LUMC
Indications for requesting molecular analysis
• Genetic confirmation of clinical symptoms: better treatment
• Carrier detection: better risk calculation and/or treatment options
• Prenatal analysis: in the Netherlands, the pregnancy is terminated in >99% of cases with an affected fetus
Genome Diagnostic setting
Analysis of genes proven to be the underlying cause of the clinical symptoms
Scanning
* Approach: variant detection by DNA (or RNA) sequencing
  - Methods: Sanger / WES / targeted NGS / WGS
* Approach: CNV detection, structural changes
  - Methods: MLPA / array analysis / karyotyping / FISH / WGS
Mutation/region specific
* repeat length analysis
* methylation
* Sanger sequencing of specific amplicons
* deletion/duplication specific PCR (or MLPA)
* FISH
* other
Sequence analysis; unsolicited findings
Whole Exome Sequencing (WES) analysis
What about findings that do not correlate with the clinical phenotype present in the patient / family?
Precounseling is essential!
Patient / family wants to know:
1. Only sequence changes correlated to the disease in the family
2. Only sequence changes correlated to the disease in the family and other treatable disorders
3. All sequence changes and the impact of the changes
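The three precounseling choices above can be written down as a reporting rule. This is a minimal sketch only: the names `ReportingChoice` and `should_report`, and the two boolean flags describing a finding, are hypothetical and not part of any laboratory system.

```python
from enum import Enum


class ReportingChoice(Enum):
    """Hypothetical encoding of the three options chosen during precounseling."""
    FAMILY_DISEASE_ONLY = 1
    FAMILY_DISEASE_AND_TREATABLE = 2
    ALL_CHANGES = 3


def should_report(choice, related_to_family_disease, treatable_disorder):
    """Return True if a finding falls within what the patient / family asked to know."""
    # Findings correlated to the disease in the family are reported under every option.
    if related_to_family_disease:
        return True
    # Option 2 adds unsolicited findings, but only for treatable disorders.
    if choice is ReportingChoice.FAMILY_DISEASE_AND_TREATABLE:
        return treatable_disorder
    # Option 3 reports everything; option 1 reports nothing else.
    return choice is ReportingChoice.ALL_CHANGES
```

In practice the decision whether a disorder counts as "treatable" is itself a clinical judgement; the sketch only takes it as an input.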
Be aware of differential or missed diagnoses, where identified variants do correlate with the phenotype of the patient, but that phenotype has not (yet) been observed by the physician.
Gene Panel Sequence Analysis
Analysis of genes proven to cause the clinical symptoms:
* Using a targeted sequence analysis approach, either by specific enrichment or by selected gene panel analysis.
Advantage: no unexpected results with respect to other diseases
Examples of gene panel NGS in the Netherlands:
* Growth disorders and skeletal abnormalities (targeted)
* Sporadic mental retardation (whole exome; sporadic cases)
* Cardiomyopathy (targeted)
* Muscular dystrophies (targeted)
* Deafness and blindness (whole exome and targeted analysis)
Position and Effect of nucleotide variants
(http://www.hgvs.org)
[Figure: gene regions in which a variant can have an effect: enhancer, promoter, exon (ESE, ESS) and intron (branch point sequence, ISE, ISS)]
Nomenclature of identified variants: use HGVS (http://www.hgvs.org)
Exon (ESE?):
  Nonsense     c.3826C>T          p.(Arg220*) ..
  Frameshift   c.3525_3526delAA   p.(Arg1157fs) ..
  In-frame     c.4312_4314delGAA  p.(Glu1438del) ..
  Missense     c.4418A>G          p.(Gln1494Arg) ..
  Silent       c.3468C>T          p.(=) ..
  Splice site  c.3112G>T          p.(=) ..
Intron:
  Splice site  c.3113+1G>A        p.?
NB. All c. nomenclature is Reference Sequence specific.
Start position 1: the A of the ATG translation initiation codon is nucleotide number 1.
Report information on the genome build (hg19) and the Reference Sequence.
NGS data:
- Recommended to include the genomic coordinates (e.g. chr11:g.19207841)
- Indicate whether the variant was confirmed by another independent method
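Because the A of the ATG is nucleotide 1, the codon that a coding position belongs to follows from simple integer arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the regular expression handles nothing beyond a plain exonic substitution such as c.4418A>G (no intronic offsets, UTR positions, deletions or insertions).

```python
import re

# Matches only simple exonic substitutions, e.g. "c.4418A>G".
_SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")


def parse_coding_substitution(hgvs_c):
    """Split a simple c. substitution into (position, reference base, new base)."""
    match = _SUBSTITUTION.match(hgvs_c)
    if match is None:
        raise ValueError(f"not a simple coding substitution: {hgvs_c!r}")
    return int(match.group(1)), match.group(2), match.group(3)


def codon_number(coding_position):
    """Codon containing a coding position, with the A of the ATG as position 1."""
    if coding_position < 1:
        raise ValueError("coding positions start at 1")
    return (coding_position + 2) // 3


def position_in_codon(coding_position):
    """1, 2 or 3: which base of its codon the coding position is."""
    if coding_position < 1:
        raise ValueError("coding positions start at 1")
    return (coding_position - 1) % 3 + 1
```

For the in-frame example, position 4312 is the first base of codon 1438, which agrees with p.(Glu1438del). The result is only meaningful relative to the stated Reference Sequence.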
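RNA Splicing
[Figure: genomic DNA (exon 1 with the ATG, intron, exon 2, intron 2, exon 3 with the TGA) is spliced into mRNA (exons 1 to 3, ATG to TGA), which is translated into protein. Labelled elements: donor (5' splice site), acceptor (3' splice site) and branch point (between -50 and -10).]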
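RNA Splicing Consensus sequences
[Figure: consensus sequences at the donor and acceptor splice sites between exon 1, the intron and exon 2; numbers in brackets are nucleotide frequencies in %.
Donor (exon | intron; intron positions +1, +2): A(38)/C(31) A(62) G(77) | G(100) T(100) A(71)/G(24)
Acceptor (intron | exon; intron positions -2, -1): PY(84) PY(85) PY(58) X C(55)/T(37) A(100) G(100) | exon side labelled G(50), T(41)/A(24)]
(Zhang, Hum Mol Genet (1998) 7:919-932; Roca et al., Genome Research (2008) 18:77-87)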
Interpretation of Detected Variants
Nonsense and frameshift:
- Almost always pathogenic
But be aware of:
- Nonsense mutation at the N-terminal end of the protein: is use of an alternative ATG translation initiation codon possible?
- Nonsense mutation at the C-terminal end of the protein: e.g. p.Lys3326* (BRCA2; 3418 aa) is a known neutral variant. What about the pathogenicity of all nonsense and frameshift changes after this position?
Splice site changes:
- Changes at positions -2, -1 and +1, +2 are almost always pathogenic UNLESS:
  • the skipped exon(s) are in frame and do not affect protein function
  • another, unaffected wild-type splice variant from the same allele can replace the function
  • the wild-type RNA transcript of the affected allele is still sufficiently expressed
Testing RNA expression of the mutated allele is highly recommended.
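Whether an intronic substitution hits one of the -2, -1, +1, +2 positions can be read directly from its c. description. The following is a rough sketch with hypothetical function names; it recognises only single-nucleotide intronic substitutions written like c.3113+1G>A and returns nothing for every other notation.

```python
import re

# Matches intronic substitutions such as "c.3113+1G>A" or "c.100-2A>G".
_INTRONIC = re.compile(r"^c\.(\d+)([+-])(\d+)[ACGT]>[ACGT]$")


def intronic_offset(hgvs_c):
    """Signed distance into the intron (+1, -2, ...), or None if not intronic."""
    match = _INTRONIC.match(hgvs_c)
    if match is None:
        return None
    offset = int(match.group(3))
    return offset if match.group(2) == "+" else -offset


def hits_canonical_splice_site(hgvs_c):
    """True when the change lies at intron position +1, +2, -1 or -2."""
    return intronic_offset(hgvs_c) in (1, 2, -1, -2)
```

A positive result flags the variant for the exceptions listed above; it does not by itself establish pathogenicity.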
Interpretation of Detected Variants
The rest: in-frame, missense and silent nucleotide changes, and the other intron changes:
- How to decide between a pathogenic and a neutral variant? Variant of unknown significance (VUS)
VUS interpretation and classification tools:
- In silico evaluation of impact on RNA and protein function
- Gene-specific databases
- Frequency in the population (GoNL, ExAC, ESP, etc.)
- Literature search
- Co-occurrence with deleterious mutations in trans
- Segregation with disease in families
- Biochemical functional tests
Expert knowledge of the gene is highly recommended for correct interpretation of a variant.
Classification of sequence changes
5 Class system (Plon et al. Hum Mutat (2008) 29: 1282-1291)
Class  Description                                                Probability of being pathogenic
5      Definitely pathogenic *                                    > 0.99
4      Likely pathogenic #                                        0.95 - 0.99
3      Uncertain                                                  0.05 - 0.949
2      Likely not pathogenic or of little clinical significance   0.001 - 0.049
1      Not pathogenic or of no clinical significance              < 0.001
*: prenatal and carrier detection is offered
#: presymptomatic carrier testing is offered
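The table above can be read as a lookup from a probability of pathogenicity to a class. The sketch below uses a hypothetical function name, and how it treats the small gaps between the printed ranges (for example between 0.949 and 0.95) is an assumption: each lower bound is taken as inclusive.

```python
def plon_class(probability_pathogenic):
    """Map a probability of pathogenicity to the 5-class system (1 to 5).

    Assumption: lower bounds are inclusive, so values falling between the
    printed ranges (e.g. 0.9495) are assigned to the lower class.
    """
    if not 0.0 <= probability_pathogenic <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability_pathogenic > 0.99:
        return 5  # definitely pathogenic
    if probability_pathogenic >= 0.95:
        return 4  # likely pathogenic
    if probability_pathogenic >= 0.05:
        return 3  # uncertain
    if probability_pathogenic >= 0.001:
        return 2  # likely not pathogenic
    return 1      # not pathogenic
```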
3 Class UV system (Bell):
Description:
[Definitely pathogenic]
III. Possibly / likely to be pathogenic, but cannot be formally proven
II. Unlikely to be pathogenic, but cannot be formally proven
I. Not pathogenic or of no clinical significance
Tools & Criteria for Classification of Variants
Frequency: MAF (e.g. rs data, GoNL, ExAC, ESP) > 1% (AD disorder) or > 5% (AR disorder), with more than 200 chromosomes analysed: Class 1
Known deleterious: nonsense, frameshift and ‘consensus sequences of intron’: Class 5
In silico prediction programs (protein and RNA splicing). Frequently used software: Alamut
- 5 in silico protein prediction programs
- 5 in silico RNA splicing programs
- links to various databases and literature
Functional studies (e.g. in vitro assay, RNA analysis, LOH studies in tumours)
Locus-specific databases: LOVD, BIC, HGMD Professional, etc.
Conservation
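The frequency criterion for Class 1 is a threshold rule and can be sketched as follows. The function name and the "AD" / "AR" string codes are hypothetical; the thresholds are the ones stated in the frequency criterion above.

```python
def frequency_supports_class_1(minor_allele_frequency, chromosomes_analysed, inheritance):
    """Apply the frequency criterion: MAF > 1% (AD) or > 5% (AR), > 200 chromosomes.

    inheritance is "AD" (autosomal dominant) or "AR" (autosomal recessive).
    """
    if inheritance not in ("AD", "AR"):
        raise ValueError("inheritance must be 'AD' or 'AR'")
    # The frequency estimate is only trusted when enough chromosomes were analysed.
    if chromosomes_analysed <= 200:
        return False
    threshold = 0.01 if inheritance == "AD" else 0.05
    return minor_allele_frequency > threshold
```

A frequency of 2%, for example, meets the criterion for a dominant disorder but not for a recessive one.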
RNA in silico splice site prediction programs
via Alamut
• Splice Site Finder (SSF)
• MaxEntScan (MES)
• NNSPLICE
• GeneSplicer
• Human Splicing Finder
BRCA study: combining SSF and MES gave 96% sensitivity and 83% specificity for VUSs occurring in the vicinity of consensus splice sites
[Figure legend: Correct / False positive / Uncertain / Not recognized]
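For reference, sensitivity and specificity are computed from the counts of correct and incorrect predictions. The sketch below shows only the arithmetic; any counts passed in would be hypothetical, since the slide gives percentages and not the underlying numbers.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of truly splice-affecting variants that the prediction flags."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no affected variants to evaluate")
    return true_positives / total


def specificity(true_negatives, false_positives):
    """Fraction of variants without a splice effect that the prediction clears."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no unaffected variants to evaluate")
    return true_negatives / total
```

A specificity of 83% means that roughly one in six variants without a splice effect would still be flagged, which is one reason RNA analysis remains necessary for confirmation.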
Protein in silico prediction programs
via Alamut
Align GVGD (http://agvgd.iarc.fr/agvgd_input.php): combines the biophysical characteristics of amino acids and protein multiple sequence alignments (C0 – C65)
PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2): predicts the impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations (benign – damaging)
SIFT (http://sift.jcvi.org/): is based on the degree of conservation of amino acid residues in sequence alignments derived from closely related sequences (tolerated – damaging)
Mutation Taster (http://www.mutationtaster.org): the frequencies of all single features for known disease mutations/polymorphisms were studied in a large training set composed of >390,000 known disease mutations from HGMD Professional and >6,800,000 harmless SNPs and indel polymorphisms from the 1000 Genomes Project (TGP) (benign – disease causing)
KD4v (http://decrypthon.igbmc.fr/kd4v/cgi-bin/home): the server provides a set of rules learned by Inductive Logic Programming (ILP) on a set of missense variants described by conservation, physico-chemical, functional and 3D structure predicates. The rules are interpretable by non-expert humans and can be used to accurately predict the deleterious/neutral status of an unknown mutation
NB. The programs can give different results for the same variant.
In silico classification of Variants
Lindor et al. (2012) Hum Mut 33:8-21
[Figure: classification workflow. Determine / estimate the prior likelihood of causality (e.g. in silico analysis / Alamut tool), then add additional facts.]
Classification of VUS
Run prediction programs in Alamut (or other software) for selected variants (excluding Class 1 and Class 5 variants):
Silent changes & intron changes outside the consensus:
• no effect on RNA splicing: Class 2
• effect on RNA splicing*: Class 3 or higher
Missense change and no further data (e.g. functional):
• effect on RNA splicing: Class 3 or higher
• 4 out of 5 protein in silico programs neutral: Class 2
• 3 out of 4 protein in silico programs neutral: Class 2
• remaining: Class 3
Further analysis can result in a reclassification (e.g. RNA studies, functional protein studies, LOH)
* Be aware of naturally occurring isoforms / splice variants
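The decision rules above can be sketched as a small function. The function name and argument encoding are hypothetical, "Class 3 or higher" is returned as 3 (the minimum), and the reading that "4 out of 5" and "3 out of 4" mean "at least that many neutral calls" is an assumption.

```python
def provisional_vus_class(change_type, affects_splicing, neutral_calls=0, total_calls=0):
    """Provisional class for a variant that is neither Class 1 nor Class 5.

    change_type: "silent", "intronic" (outside the consensus) or "missense".
    neutral_calls / total_calls: protein in silico programs predicting neutral,
    out of the number of programs that returned a result.
    """
    if change_type not in ("silent", "intronic", "missense"):
        raise ValueError(f"unsupported change type: {change_type!r}")
    if not 0 <= neutral_calls <= total_calls:
        raise ValueError("neutral_calls must lie between 0 and total_calls")
    # A predicted effect on RNA splicing always gives Class 3 or higher.
    if affects_splicing:
        return 3
    if change_type in ("silent", "intronic"):
        return 2
    # Missense without further data: rely on the protein predictions.
    if total_calls == 5 and neutral_calls >= 4:
        return 2
    if total_calls == 4 and neutral_calls >= 3:
        return 2
    return 3
```

The result is a starting point only; RNA studies, functional protein studies or LOH data can move the variant to another class.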
Reporting pathogenic mutations
If the identified variant is considered a pathogenic mutation:
- Confirmation of the clinical diagnosis
- Genetic cause of the disease is identified
Consequence:
• Prenatal analysis is offered
• Choice of treatment (e.g. breast cancer families)
• Presymptomatic testing is offered (e.g. mastectomy / oophorectomy)
Stringent selection of variants in e.g. WES
Sanger sequencing (low number of genes): all sequence changes in the analyzed fragments are viewed and interpreted.
NGS / gene panel sequencing: a custom or commercially designed pipeline is used to create the variant list. Often with exclusion of:
• SNPs: frequency > 1-2% (AD criterion, not AR)
• Silent changes
• PolyPhen: benign
• Etc.
The causal variant could be lost / removed from the final VCF, and a false, possibly pathogenic variant could be identified.
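One way to limit the risk of silently losing a causal variant is to record why each variant was excluded instead of simply dropping it. The sketch below is an illustration, not a description of any existing pipeline: the `Variant` fields, the function names and the 1% default cut-off are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical minimal record for one called variant."""
    name: str
    population_frequency: float
    is_silent: bool
    polyphen_benign: bool


def exclusion_reasons(variant, max_frequency=0.01):
    """List every filter the variant fails; an empty list means it is kept."""
    reasons = []
    if variant.population_frequency > max_frequency:
        reasons.append("frequency above cut-off")
    if variant.is_silent:
        reasons.append("silent change")
    if variant.polyphen_benign:
        reasons.append("PolyPhen: benign")
    return reasons


def split_variants(variants, max_frequency=0.01):
    """Return (kept, excluded); excluded pairs each variant with its reasons."""
    kept, excluded = [], []
    for variant in variants:
        reasons = exclusion_reasons(variant, max_frequency)
        if reasons:
            excluded.append((variant, reasons))
        else:
            kept.append(variant)
    return kept, excluded
```

Keeping the excluded list makes it possible to revisit, for instance, a silent change that later turns out to affect splicing.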
Classification of sequence changes using Alamut
Alamut: missense, c.1235A>T (p.Glu412Val; TSC2, Tuberous Sclerosis Complex 2, protein: tuberin), at protein level
[Screenshot: in silico protein predictions shown include "probably damaging" and "polymorphism"]
Classification of sequence changes using Alamut
Alamut: c.1235A>T (p.Glu412Val; TSC2), at RNA level, splice site prediction
Consensus sequence splice donor: A(62) G(77) / g(100) t(100) a(71)
[Screenshot: splice site predictions for the normal and the mutant sequence; the mutant shows a NEW donor site]
Functional verification of sequence changes
c.1235A>T p.(Glu412Val) in TSC2
Functional analysis:
- RNA analysis performed (RNA isolated from skin fibroblasts)
Results:
- RNA showed an abnormal pattern in agreement with the predictions.
NB.1 If possible, use an intragenic heterozygous SNP to rule out the possibility that the abnormally spliced RNA is a product of the normal allele.
NB.2 Use enough (about 5) controls to rule out leaky transcription artefacts.
NB. Software tool KD4v predicted polymorphism.
Conclusion: sequence change c.1235A>T is a pathogenic mutation; it influences RNA splicing of TSC2 and is therefore not a missense mutation but a splice site mutation.
Nomenclature: c.1235A>T, p.Glu412fs
Classification of sequence changes using Alamut
TSC1: missense changes in the same codon. Pathogenicity?
                       p.Arg190Cys          p.Arg190Pro
In silico predictions  probably damaging    probably damaging
                       neutral              deleterious
Functional analysis    same as wild type    pathogenic
Hereditary Breast/Ovarian Cancer
Disease/gene specific criteria
- 1 in 8 women develops breast cancer
- ~5%: a genetic factor involved
- 10-15%: a pathogenic mutation in either BRCA1 or BRCA2
- A lot of VUS identified
Classification tools/criteria
• Co-occurrence of 2 deleterious mutations: in BRCA1 not possible; in BRCA2: other phenotype (Fanconi Anemia D1)
• Co-segregation
• Pathology: e.g. array CGH profiles; loss of heterozygosity of the UV
• Functional data
Problem: most UVs are very rare and therefore the likelihood ratios will not give the ultimate result.
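Likelihood ratios from independent lines of evidence are usually combined with a prior probability through Bayes' rule on the odds scale. The sketch below assumes that standard combination (posterior odds = prior odds × product of the likelihood ratios) and assumes the evidence sources are independent; the function name is hypothetical.

```python
from math import prod


def posterior_probability(prior_probability, likelihood_ratios):
    """Combine a prior probability of pathogenicity with likelihood ratios.

    Assumes the likelihood ratios come from independent evidence, e.g.
    co-segregation, co-occurrence and tumour pathology.
    """
    if not 0.0 < prior_probability < 1.0:
        raise ValueError("prior probability must lie strictly between 0 and 1")
    prior_odds = prior_probability / (1.0 - prior_probability)
    # With no evidence the product is 1 and the posterior equals the prior.
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)
```

This also shows the problem stated above: for a very rare UV seen in one or two small families, the likelihood ratios stay close to 1, so the posterior barely moves away from the prior and the variant remains in the uncertain class.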
Hereditary Breast/Ovarian Cancer, using functional analysis
Variant: c.5309G>T p.Gly1770Val (BRCA1)
- Several small families; not enough for linkage analysis
- All families of Northern African origin
- Array CGH of tumours: BRCA1 profile
- Functional analysis: pathogenic
- In silico predictions: possibly damaging; deleterious
Strongly suspected to be a pathogenic variant
Bloopers
p.Met1628Val variant in BRCA1
(At first classified as a neutral / low risk variant)
Phelan et al. (2005) J. Med. Genet. 42: 138-146: functional test: pathogenic mutation
Carvalho and Monteiro (2007) J. Med. Genet. 44: 78: mistake in the construct; not only the 1628V variant, but also a deletion of 7 nucleotides.
p.Met1628Val is a neutral variant
Cryptic and challenging findings with in silico variant prediction tools
Result of prenatal array
‘Patients’: mother and her previously deceased male fetus
Detected: array deletion Xp21 (DMD gene) in both mother and fetus
Coordinates of array data: 32687712 and 33058441; exact deletion unclear
The mother is pregnant again
Check the exact deletion by MLPA testing:
- deletion of exons 2-9 of the DMD gene
Cryptic and challenging findings
Previous data: the deletion was also observed in another family, but again only as an array result; no index patient known. Info: an adult brother of the mother in that family has the same deletion (no phenotype? no data).
Action: test the father and brothers of this pregnant woman.
[Figure: the deletion joins exon 1 to exon 10] This results in an out-of-frame deletion (Alamut). Pathogenic? Class 5
Result: the father has the same deletion as the pregnant woman and so far no clear dystrophy symptoms. Class 3
NB. Having an index patient is very relevant for variant interpretation.
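Whether a multi-exon deletion is in frame or out of frame depends on whether the total number of deleted coding nucleotides is a multiple of three. A minimal sketch of that check follows; the function name is hypothetical, and the exon lengths have to come from the reference transcript (none are given on the slide).

```python
def deletion_keeps_reading_frame(deleted_coding_lengths):
    """True if the deleted coding nucleotides add up to a multiple of three.

    deleted_coding_lengths: coding length (in nucleotides) of each deleted
    exon, taken from the reference transcript.
    """
    if not deleted_coding_lengths:
        raise ValueError("at least one deleted exon is required")
    if any(length <= 0 for length in deleted_coding_lengths):
        raise ValueError("exon lengths must be positive")
    return sum(deleted_coding_lengths) % 3 == 0
```

As this case shows, an out-of-frame prediction alone does not settle the classification: the same deletion in an unaffected adult relative moved the variant from Class 5 to Class 3.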
Cryptic and challenging findings
Gene: SDHA, splice variant in intron 9: c.1260+1G>A, p.?
Disrupts the canonical splice donor site: Class 5
Functional RNA analysis: a new donor site is used and the variant results in an ‘in frame’ insertion: Class 3 (in silico prediction not that clear)
Be aware of newly generated splice sites
Final considerations
Guidelines for finding genetic variants underlying human disease
Posted in ‘Genomes Unzipped’: 24 Apr 2014 06:00 AM PDT
Authors: Daniel MacArthur and Chris Gunter.
New DNA sequencing technologies are rapidly transforming the diagnosis of rare genetic diseases, but they also carry a risk: by allowing us to see all of the hundreds of “interesting-looking” variants in a patient’s genome, they make it potentially easy for researchers to spin a causal narrative around genetic changes that have nothing to do with disease status.
Such false positive reports can have serious consequences: incorrect diagnoses, unnecessary or ineffective treatment, and reproductive decisions (such as embryo termination) based on spurious test results.
In order to minimize such outcomes, the field needs to decide on clear statistical* guidelines for deciding whether or not a variant is truly causally linked with disease.
* NB. Additional functional and biological guidelines will be more reliable.