
Hands-on Workshop Variant Interpretation and Classificationvariantworkshop.org/igv/AMP 2019 Routbort...

Date post: 16-Jan-2020
Mark Routbort, MD, PhD On behalf of the Informatics Subdivision of AMP 1 Hands-on Workshop Variant Interpretation and Classification
Page 1: Hands-on Workshop Variant Interpretation and Classificationvariantworkshop.org/igv/AMP 2019 Routbort - Variant Interpretation and Classification.pdfStart up IGV, display a test message

Mark Routbort, MD, PhD On behalf of the Informatics Subdivision of AMP

Hands-on Workshop Variant Interpretation and Classification

Workshop Scope

Questions for any sequence variant

• Is it real? Believe it
• What is it? Name/describe it
• What does it mean? Understand it

Workshop Examples

• Positional noise & thresholding artifacts
• Degenerate alignments
• Missed calls
• Phasing issues
• Complex mutations

Outline
• Viewing variants
  • Overview of IGV/genomics viewers
  • Review of basic bioinformatics file types
    • Reference sequences/genomes
    • Sequence alignment maps
    • Annotation tracks
• Clinical NGS examples
• Brief summary of classification/knowledge

Viewing variants

• All of the examples used in this workshop will be available online
• Viewing them and working through them yourself requires IGV

IGV – Integrative Genomics Viewer
• Open source software from the Broad Institute
• Highly capable and well-maintained genomic viewer
• The Java-based version is the most full-featured (there is also a JavaScript-based version, which can be embedded directly in web pages and viewed in nearly all browsers)
• Latest version downloads: https://software.broadinstitute.org/software/igv/download
• Help/feature chat group: https://groups.google.com/forum/#!forum/igv-help

The Website

• http://variantworkshop.org
• Hosts the sequence files we will use today
• Has a simple set of links to session files that will load the examples into a running instance of IGV

Generic anatomy of a genome browser

Tracks representing different features (annotations, assay regions, etc) or characteristics of samples (sequence, coverage depth, variant calls, etc)

Genomic position(s)


Start up IGV, display a test message

• First, start up IGV. IGV must be running for URL-based session files to load. Once IGV is running, click on this link:
  • Load test message for IGV
• If all is well, you will see a little message about AMP in the alignment reads shown
• Color our reads
  • Right click on the alignment sequence and choose “Color alignments by – Read Strand”
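Why must IGV already be running? The session links work by sending a request to the local IGV process. The sketch below builds such a request URL, assuming IGV's port is enabled and left at its usual default of 60151. The session file address is a made-up placeholder, not one of the workshop's real links.

```python
from urllib.parse import urlencode

# Assumption: IGV is listening on its default control port (changeable in IGV's preferences).
IGV_PORT = 60151


def igv_load_url(session_url: str, port: int = IGV_PORT) -> str:
    """Build the localhost URL that asks a running IGV to load a file or session."""
    query = urlencode({"file": session_url})
    return f"http://localhost:{port}/load?{query}"


if __name__ == "__main__":
    # Hypothetical session file location, for illustration only.
    print(igv_load_url("http://variantworkshop.org/igv/example_session.xml"))
```

Opening the printed URL in a browser only works while IGV is open, which is why the slide asks you to start IGV first.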

Anatomy of IGV

[Annotated IGV screenshots; the labeled components are:]
• Reference genome
• Contig (chromosome)
• Navigation entry
• Contig axis/map
• Sequence alignments track
• Sequence coverage track
• Reference sequence

Reference sequence
• A consensus, baseline, wild-type, or comparator DNA sequence
• Used as the comparator to define ‘what has changed’ – sequence variants, structural rearrangements
• Reference alleles – REF
• Alternative alleles/variants – ALT

Reference genome
• A set of reference sequences for an organism or species under study
• Divided into ‘contigs’ – pieces of DNA that are part of the same physical package
  • ‘Shotgun’ sequencing – if sequences have overlaps, they are part of the same contig
  • In eukaryotes, contigs are more or less chromosomes
  • Human contigs (major): chr1 – chr22, chrX, chrY, chrM
• Interesting point – since a reference genome is generally a consensus created by sequencing many individual members of a species, it is very likely that no individual member actually has a perfect consensus sequence
  • We all have variants

IGV and the reference genome
• Everything starts with the reference genome
• Most clinical labs are still using hg19/GRCh37
• Latest build is actually GRCh38
• For most clinical sequencing this doesn’t make much difference because clinical labs are generally reporting at the gene & protein level (c. & p. in HGVS), not the genome level
  • BRAF c.1799T>A (p.V600E) is the nomenclature of the common BRAF mutation in melanoma, PTC, HCL regardless of the reference genome

IGV and the reference genome
• A reference genome at the minimum includes sequence data for each of the contigs
  • NNNNNNNNNNNNNNATCGCGCGCGTAGCTGANNNNNNNNNNNN
  • What are the N’s?
• Optionally, reference genomes (GENOME files) may include
  • gene level projections/transcript mappings
  • a cytomap/karyotype projection
• We will start with the simplest possible (artificial) reference genome

The simplest reference genome

• FASTA file
  • FASTA is a simple, old text format for representing DNA sequences
  • Can include multiple sequences (in the setting of a reference genome, multiple contigs)
• Sample FASTA genome
  • How many contigs are there?
  • What are the contig names?
  • How many base pairs are in each contig?
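The three FASTA questions above can be answered by reading the file by hand, or with a few lines of code. This is a minimal sketch. The two-contig genome in it is invented for illustration and is not the workshop's sample file.

```python
def contig_lengths(fasta_text: str) -> dict:
    """Map each contig name (first word after '>') to its length in bases."""
    lengths = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence lines may wrap; add every line to the current contig.
            lengths[name] += len(line)
    return lengths


# Hypothetical two-contig genome, for illustration only.
SAMPLE = """>chrA
GGGCATCATCATGGG
>chrB a second contig
ACGTACGT
ACGT
"""

if __name__ == "__main__":
    for contig, size in contig_lengths(SAMPLE).items():
        print(contig, size)
```

The number of entries is the number of contigs, the keys are the contig names, and the values are the base-pair counts.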

Adding a sequence alignment track
• SAM – sequence alignment map file (.SAM)
  • Uncompressed, human-readable format
• BAM – binary (sequence) alignment map file (.BAM)
  • Compressed, non-human-readable format
• Both can be indexed (.SAI and .BAI)
  • The index is always binary and is a fraction of the size of the .SAM/.BAM
  • It contains information about where alignments that map to a particular position are located in the sequence alignment map

Adding a sequence alignment track
• Sample sequence alignment map (SAM)
• This is about the simplest possible SAM file; the full specification allows for a lot more detail about reads

[Figure: sample SAM record with callouts for read name, contig, alignment start, mapping quality, CIGAR, and read (sequence)]
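To make the labeled SAM fields concrete, here is a small sketch that splits one alignment line into the columns called out on the slide. The column order follows the SAM specification's eleven mandatory tab-separated fields. The example read is invented, not taken from the workshop file.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    read_name: str   # QNAME
    flag: int        # FLAG
    contig: str      # RNAME
    start: int       # POS, 1-based leftmost aligned position
    mapq: int        # MAPQ, mapping quality
    cigar: str       # CIGAR
    sequence: str    # SEQ


def parse_sam_line(line: str) -> SamRecord:
    """Parse one non-header SAM line into the fields highlighted on the slide."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs 11 mandatory fields")
    return SamRecord(fields[0], int(fields[1]), fields[2], int(fields[3]),
                     int(fields[4]), fields[5], fields[9])


if __name__ == "__main__":
    # Hypothetical read: 3 matched bases, a 3-base deletion, then 9 matched bases.
    example = "read1\t0\tchrA\t1\t60\t3M3D9M\t*\t0\t0\tGGGCATCATGGG\t*"
    print(parse_sam_line(example))
```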

Adding a VCF track
• Sample VCF (variant call format file)
• This is about the simplest possible VCF, with one entry

Putting the simple example together

• Load simple example - an artificial genome and sequence alignment
• This is a link to a “session” file which tells IGV to
  • Load our reference genome
  • Load our sequence alignment map
  • Display VCF file as an annotation track
• How many different sequences are shown?
  • This is a trick question
  • One sequence, three alignments
• Can we prove this?

Review - Ambiguity/degeneracy in alignment

Reference     GGGCATCATCATGGG
Alt (sample)  GGGCATCATGGG

Alignment 1
Ref GGGCATCATCATGGG
Alt GGG---CATCATGGG

Alignment 2
Ref GGGCATCATCATGGG
Alt GGGCAT---CATGGG

Alignment 3
Ref GGGCATCATCATGGG
Alt GGGCATCAT---GGG

One CAT got deleted – which one?
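One way to prove that the alignments describe a single sequence is to list every place where removing a 3-base block from the reference reproduces the sample read. A minimal sketch using the two sequences from this slide:

```python
def deletion_placements(ref: str, alt: str) -> list:
    """Return every 0-based start where deleting one contiguous block from ref yields alt."""
    size = len(ref) - len(alt)
    if size <= 0:
        return []
    return [i for i in range(len(ref) - size + 1)
            if ref[:i] + ref[i + size:] == alt]


REF = "GGGCATCATCATGGG"
ALT = "GGGCATCATGGG"

if __name__ == "__main__":
    size = len(REF) - len(ALT)
    for start in deletion_placements(REF, ALT):
        # Show the sample read with the gap drawn at this placement.
        print(start, REF[:start] + "-" * size + REF[start + size:])
```

The slide draws the three placements that line up with whole CAT units (starts 3, 6, and 9). The sketch also reports the in-between offsets (4, 5, 7, 8), which delete ATC or TCA and give the same read. Every placement yields the identical sample sequence, so the choice is purely a matter of representation.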

Describing sequence variants

• VCF (variant call format) uses genomic coordinates and is not normative:
    Chr3 1111333333 A T
  Synonymous and equally acceptable VCF representations of the same variant:
    Chr3 1111333333 AT TT
    Chr3 1111333332 CA CT
• HGVS (Human Genome Variation Society) nomenclature is normative: the only acceptable representation is
    g.1111333333A>T
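The equivalence of the three VCF records can be checked by trimming bases that REF and ALT share. The sketch below is a simplified trimming step, not a full VCF normalizer (it does no left-alignment against a reference). It reduces each of the slide's records to the same minimal form.

```python
def trim_variant(pos: int, ref: str, alt: str):
    """Strip shared trailing, then shared leading, bases; keep at least one base per allele."""
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1  # dropping a leading base moves the start one position right
    return pos, ref, alt


# The three records from the slide (position, REF, ALT).
RECORDS = [
    (1111333333, "A", "T"),
    (1111333333, "AT", "TT"),
    (1111333332, "CA", "CT"),
]

if __name__ == "__main__":
    for record in RECORDS:
        print(record, "->", trim_variant(*record))
```

All three records collapse to position 1111333333, A to T, which is the single form that HGVS allows.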

HGVS
• Standardized nomenclature to promote portability, enduring meaning, and accuracy
• Human Genome Variation Society (HGVS): http://varnomen.hgvs.org

NM_004333.4(BRAF):c.1799T>A (p.V600E)
• NM_004333.4 – gene reference sequence (GenBank or EMBL accession + version #)
• BRAF – HGNC gene symbol
• “c” – coding DNA sequence
• 1799T>A – thymidine to adenosine at mRNA position 1799
• “p” – protein impact (predicted)
• V600E – valine to glutamate at codon 600

Compare with a descriptive report: BRAF mutation analysis: Mutation detected in codon 600, exon 15 (GTG to GAG) of the BRAF gene that would change the encoded amino acid from Valine to Glutamate (p.Val600Glu)

A case of colorectal (CRC) adenocarcinoma (and lots of parameters to adjust)

• 50-gene amplicon hotspot NGS panel
• Tumor only – no germline control
• 4 called variants post-filtering
• Load colorectal adenocarcinoma example

A case of colorectal (CRC) adenocarcinoma

• When germline is not available for comparison, comparison to other samples on the same assay can be highly useful
  • Background noise
  • Platform specific artifacts
  • Contamination

Parameter and display play

• IGV has NUMEROUS parameters that affect the display of alignments and the highlighting of the coverage track
• There are also many right-click context menu display & sorting options
• We will cover the “Top Few” that tend to be useful/important in routine clinical review and where default settings are very problematic for somatic NGS
• We’ll do this with the current CRC case

Parameter play – downsampling
• Downsampling improves performance – by loading less data
• View → Preferences → Alignments → Downsample reads
• Turn OFF downsampling (uncheck the box)
• NB: IGV defaults to downsampling turned on, to 100 reads. This is not appropriate for most labs doing high-depth somatic sequencing.
• If you are confused or not able to see a variant in the reads, check to see if you are downsampling!

Parameter play – highlighting variants
• View → Preferences → Alignments → Coverage track options → Coverage allele-fraction threshold
• IGV will color positions in the coverage track that have a variant allele fraction > the coverage allele-fraction threshold
  • Insertions and deletions are not counted as variant alleles and do not color
• This can be helpful in drawing attention to variants or just noisy positions
• The threshold may need to be tweaked to the sensitivity and error correction of the library & sequencing modality
  • UMID/consensus calling/liquid biopsy may warrant lower thresholds
• NB: IGV defaults to 0.2 here (20%) – this is much too high for somatic laboratories/assays
• This assay has a nominal sensitivity of about 5% - we will set the highlight to be about half of this – say 0.02
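The effect of the threshold can be seen with a little arithmetic. This sketch is a simplification that uses plain base counts (IGV's own calculation also takes base quality into account). The 1000x position with a 5% variant is a made-up example.

```python
def is_highlighted(base_counts: dict, ref_base: str, threshold: float) -> bool:
    """True if the non-reference fraction at a position exceeds the threshold."""
    depth = sum(base_counts.values())
    if depth == 0:
        return False
    mismatches = depth - base_counts.get(ref_base, 0)
    return mismatches / depth > threshold


if __name__ == "__main__":
    # Hypothetical position: 950 reference reads, 50 variant reads (5% VAF).
    counts = {"A": 950, "T": 50}
    print("default 0.2 :", is_highlighted(counts, "A", 0.2))
    print("lowered 0.02:", is_highlighted(counts, "A", 0.02))
```

At the default of 0.2 a variant sitting right at this assay's nominal sensitivity would not be colored at all. At 0.02 it would.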

Parameter play – soft clipping
• View → Preferences → Alignments → Show soft-clipped bases
• Soft-clipped bases are bases that are present in the read sequence but not considered part of the alignment by the upstream bioinformatics pipeline
  • These may be adapter sequences, barcodes, or inadvertently clipped patient sequence, depending on the pipeline
• Some pipelines ‘hard-clip’ – simply remove unaligned bases. This is poor practice.
• For pipelines that soft-clip, it can be incredibly helpful in some settings to show the soft-clipped bases, as they may provide hidden support for sequence variants – we will have an example of this soon

Parameter play – show center line
• Preferences → Alignments → Show center line
• If on, shows the exact center position, very useful for getting absolute nucleotide positions/c. position
• May want to turn off for figures
• Let’s make sure it’s on and calculate the c. of the APC mutation
  • Navigate to APC mutation only (cannot move track sideways or zoom in/out in multi-locus view)
• To calculate c. position:
  • Ensure alignment is correct (3’ alignment)
  • What codon is it (be careful of multiple reference transcripts)?
  • Is it a positive strand (left to right) or negative strand (right to left) gene – how can you tell?
    • Zoom out until you see the intronic arrows
  • c.1 is the A of the initiating methionine that starts every eukaryotic translation
  • So c. position = [Codon #] * 3 – correction factor, where correction factor = 2 for the first nucleotide of the codon, 1 for the second nucleotide, and 0 for the third nucleotide
• Using NM_000038 as the canonical transcript for APC, what is the c. number?
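The codon-to-c. arithmetic is easy to get wrong by one, so here it is as a tiny function. It is checked against the BRAF example from earlier in the workshop: V600E changes the second nucleotide of codon 600, which should give c.1799.

```python
def c_position(codon: int, nucleotide_in_codon: int) -> int:
    """Coding-sequence (c.) position for nucleotide 1, 2, or 3 of a given codon."""
    if codon < 1 or nucleotide_in_codon not in (1, 2, 3):
        raise ValueError("codon must be >= 1 and nucleotide_in_codon must be 1, 2, or 3")
    correction = 3 - nucleotide_in_codon  # 2 for first, 1 for second, 0 for third
    return codon * 3 - correction


if __name__ == "__main__":
    # BRAF V600E: second nucleotide of codon 600.
    print(c_position(600, 2))
```

The same function can be applied to the APC exercise once the codon number and the position within the codon have been read off the IGV display.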

Parameter play – feature flanking region

• If you use multi-locus view, this may be important
• Preferences → General
  • Amount of flanking sequence added to regions in multi-locus view
• For targeted clinical sequencing, you want this number to be small, or you won’t be able to see details (too zoomed out)

Display play

• Right click on alignments – color reads
• Right click on alignments – sort reads
  • Sorting by base can be HIGHLY useful
    • Very low frequency events – e.g. liquid biopsy
    • To establish / interrogate phase relationships

Collapsing Sidetrack
• IGV can show 3 visual modes for tracks
  • Collapsed (smallest)
  • Squished
  • Expanded (largest)
• What these modes do depends a little on the kind of track:
  • Sequence tracks never actually overlap even if collapsed
  • Annotation tracks may overlap/consolidate when collapsed
• Hovering over an annotation track may indicate it is collapsed (more than one annotation is seen)
• Right-clicking on a track will show a context menu that gives the visual mode and allows it to be changed
• This is particularly important for genes with more than one reference transcript
  • Does APC have more than one reference transcript? How many?

The importance of alignment
• Load lung adenocarcinoma example #1
• How many mutations?
• HGVS says that if sequence variants are separated by at least one base pair of wild-type sequence, they should be expressed individually
  • E.g. c.[4_6del;8C>T] – a 3 bp deletion slightly separated from a SNV
• HGVS also states that all alignments need to be as 3’ as possible (‘right-aligned’)
  • Whether this is right or left on the IGV screen depends on whether the gene is on the plus strand (‘genomic sense’ or ‘positive’) or negative strand (‘genomic antisense’ or ‘negative’)
• What strandedness is EGFR? (sense, positive)
• Is the deletion 3’ aligned? (no)
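The 3’ rule can be sketched as a shifting loop: a deletion can slide one base in a given direction whenever the base leaving the deleted block equals the base entering it. The function name and 0-based coordinates are choices made for this illustration. The test sequence is the CAT-repeat toy reference from the alignment ambiguity review, not the EGFR case.

```python
def three_prime_start(ref: str, start: int, length: int, strand: str = "+") -> int:
    """0-based genomic start of a deletion after shifting it to the gene's 3' end.

    For a plus-strand gene 3' is rightward on the reference; for a minus-strand
    gene 3' is leftward.
    """
    if strand == "+":
        while start + length < len(ref) and ref[start] == ref[start + length]:
            start += 1
    elif strand == "-":
        while start > 0 and ref[start - 1] == ref[start + length - 1]:
            start -= 1
    else:
        raise ValueError("strand must be '+' or '-'")
    return start


if __name__ == "__main__":
    ref = "GGGCATCATCATGGG"
    print(three_prime_start(ref, 3, 3, "+"))  # shift right for a plus-strand gene
    print(three_prime_start(ref, 9, 3, "-"))  # shift left for a minus-strand gene
```

Aligners commonly left-align indels in genomic coordinates, so for a plus-strand gene such as EGFR the displayed deletion often needs to be shifted before it is named.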

The importance of being aligned


Alignment can be more than misleading
• 50-gene amplicon hotspot NGS panel
• Tumor only – no germline control
• 80% tumor cellularity
• Nearly all melanomas have mutations on our panel
• No mutations were called
• Manually reviewed NRAS, KIT, BRAF amplicons
• Load melanoma with no mutation calls example

A case of melanoma

• Is there a BRAF V600E mutation?
• Why was this not called?
• Why is that phenomenon so extreme – could there be an explanation?
• And a setting we could tweak to prove our hypothesis?

The importance of alignment


Missing reads?


Showing soft-clipped bases . . .


A case of melanoma
Conclusions:
• A novel BRAF mutation is present
  NM_004333.4(BRAF):c.[1799T>A;1802_1813del] p.V600_W604delinsE
• Almost missed because a deletion near the amplicon ends resulted in suboptimal alignment and soft-clipping of reverse reads
• We can infer driver functionality (but not responsiveness to inhibition)

Phasing – a second case of melanoma
• Tumor & germline paired sequence
• Load melanoma with three mutation calls example
• 3 mutation calls in BRAF (highly pipeline dependent)

Phasing is challenging
• In cis – same allele/same read, always. One mutation/one protein.
• In trans – distinct allele/different read, always. Two mutations/two proteins.
• Subclonal – one allele subsumes the other – mutations are in cis when both are present, but reads containing only the truncal mutation are also present. Two mutations/two proteins, but different than in trans.
• Optimally, pipelines give options to select from
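The three phase relationships can be expressed as a rule over read counts for reads that span both variant positions. This is an idealized sketch with no tolerance for sequencing noise. A real pipeline would need minimum-count or error-rate thresholds, which are not specified here.

```python
def classify_phase(both: int, only_a: int, only_b: int) -> str:
    """Classify two nearby variants from counts of spanning reads.

    both   - reads carrying variant A and variant B
    only_a - reads carrying A but not B
    only_b - reads carrying B but not A
    """
    if both > 0 and only_a == 0 and only_b == 0:
        return "cis"
    if both == 0 and only_a > 0 and only_b > 0:
        return "trans"
    if both > 0 and (only_a > 0) != (only_b > 0):
        # One variant always travels with the other; the other also occurs alone (truncal).
        return "subclonal"
    return "indeterminate"


if __name__ == "__main__":
    # Hypothetical read counts, for illustration only.
    print(classify_phase(both=120, only_a=0, only_b=0))
    print(classify_phase(both=0, only_a=80, only_b=75))
    print(classify_phase(both=40, only_a=90, only_b=0))
```

Sorting reads by base in IGV, as described earlier, is the visual equivalent of tallying these three counts.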

Phasing can be helpful/informative
• Phasing with (either in cis or fully in trans) a known germline polymorphism strongly supports germline origin
• Subclonal phasing with a known germline polymorphism strongly supports somatic origin
• Too many alleles? (>2 for autosomal loci, possibly >1 for chrX/Y in men)
  • At least one is somatic!

Phasing can be helpful/informative
• 52 y/o with lung adenocarcinoma
• Treatment naïve
• Load lung adenocarcinoma example #3
• EGFR L858R is a known common driver missense mutation
• EGFR T790M is a known common treatment-induced resistance mutation
• EGFR c.2361G>A is a very common silent germline polymorphism

Phasing can be helpful/informative

• Conclusion – this is a germline EGFR T790M mutation
• (Probably) pathogenic germline mutation associated in the literature with hereditary lung cancer

Sequence variants can be complex

• 47 y/o with lung adenocarcinoma, several mutations called in EGFR exon 19
• Load lung adenocarcinoma example #2

How many mutations do you see here?

Four: 18 bp deletion and 3 SNVs

Three: 18 bp deletion, SNV, and a dinucleotide mutation

Two: 18 bp deletion + non-frameshifting 5 bp delins

One: 26 bp complex non-frameshifting delins


Reasonable HGVS representations:
NM_005228.3(EGFR):c.[2240_2257del;2261A>G;2264_2265delinsAT] p.L747_A755delinsSRD (nominal)
NM_005228.3(EGFR):c.2240_2265delinsCGAGAGAT p.L747_A755delinsSRD

Thresholding: Danger

[Figure, shown on two consecutive slides: VAF plotted by sample #; axis ticks run 0–10 and 0–200]

Thresholding
• VAF or variant count based thresholding is common in variant calling pipelines and is a common source of outlier-based false positive calls, which may appear plausible in isolation from the underlying noise
  • VAFs may be outliers due to limited DNA quality
  • Variant counts may be outliers due to CNVs
• Protections:
  • Have a positional error model for thresholding
  • Review multiple samples at a novel lower-quality positional call
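A positional error model can take many forms. The sketch below shows one very simple possibility, a z-score of the sample's VAF against the VAFs seen at the same position in other samples on the same assay. The cutoff of 3 and the background values are assumptions chosen for illustration, not a recommendation from this workshop.

```python
from statistics import mean, stdev


def exceeds_positional_noise(sample_vaf: float, background_vafs: list,
                             z_cutoff: float = 3.0) -> bool:
    """True if sample_vaf stands well above the noise observed at this position."""
    if len(background_vafs) < 2:
        raise ValueError("need at least two background samples")
    mu = mean(background_vafs)
    sigma = stdev(background_vafs)
    if sigma == 0:
        return sample_vaf > mu
    return (sample_vaf - mu) / sigma > z_cutoff


if __name__ == "__main__":
    # Hypothetical VAFs (in percent) at one noisy position across five other samples.
    background = [0.10, 0.20, 0.15, 0.12, 0.18]
    for vaf in (0.2, 5.0):
        print(vaf, exceeds_positional_noise(vaf, background))
```

A fixed global threshold would treat a quiet position and a noisy position the same way. Comparing against position-specific background is what lets an outlier in the noise be told apart from a real call.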

Understanding positional noise to avoid false positive outliers (sample comparison)

• Liquid biopsy platform using UMIDs and high levels of noise reduction to sequence ccfDNA – nominal sensitivity about 0.2%
• Several TP53 mutation calls made
• Load positional noise example

A case of myeloproliferative disorder
• 81-gene leukemia/MDS/MPD panel run
• JAK2, MPL were wild-type
• Several called variants in CALR

A case of myeloproliferative disorder

• Load myeloproliferative/myelofibrosis example
• Do you see the variants? Real? Pathogenic?
• Sometimes you need to copy the reads and mock things up in a text editor to be sure. Let’s try it!

A case of myeloproliferative disorder

• Short in-frame deletions

• Often well-tolerated (similar to missense)

• Prone to occur in triplet repeat areas

• Can be relatively common germline polymorphisms


A case of myeloproliferative disorder

• The apparent dinucleotide subheterozygous variant is an alignment artifact that is allele specific to the polymorphic allele

• Always question if multiple alignments around indels might be the same sequence, whether somatic or germline


Extreme trans complexity, two examples

• Most commonly seen in the context of circulating DNA, whether heme malignancy or liquid biopsy
• Case 1: Refractory acute myeloid leukemia patient
  • 4 distinct NRAS mutations called
  • Load AML example

Extreme trans complexity, two examples
• Case 2: Patient with BRAF V600E positive metastatic cholangiocarcinoma
  • Had received dabrafenib/trametinib with initial good response followed by progression
  • Liquid biopsy shows truncal BRAF V600E plus at least 6 low level mutations in KRAS & NRAS
  • Load liquid biopsy example

Variant classification – high level and only my opinions
• ExAC – best source for population polymorphisms
  • http://exac.broadinstitute.org
  • Allows VCF level URLs, e.g. http://exac.broadinstitute.org/variant/22-46615880-T-C
• ClinVar – best source for pathogenicity
  • https://www.ncbi.nlm.nih.gov/clinvar/
• COSMIC – most comprehensive literature-based summary
  • https://cancer.sanger.ac.uk/cosmic
  • Some caution regarding germline inclusion is needed

Variant classification – what I know
• Occasionally, phasing with known germline polymorphisms or the presence of ‘too many alleles’ can be used to definitively ascribe somatic origin even without germline comparison
  • Does not mean pathogenic …
• For germline mutations,
  • Frameshift mutations (indels whose length is not divisible by 3) and nonsense point mutations can generally be inferred to be deleterious (rare exceptions for mutations at the extreme carboxyl end of the protein)
• For somatic mutations,
  • Deleterious mutations in tumor suppressor genes can generally be inferred to be oncogenic
• That’s about it!
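The "not divisible by 3" rule for frameshifts reduces to one line of arithmetic on allele lengths. This sketch assumes VCF-style REF and ALT strings for a variant that lies entirely within coding sequence. It ignores splice-site and start/stop codon effects.

```python
def is_frameshift(ref: str, alt: str) -> bool:
    """True if the net length change between REF and ALT is not a multiple of 3."""
    return (len(alt) - len(ref)) % 3 != 0


if __name__ == "__main__":
    # Placeholder alleles: only their lengths matter here, not the bases.
    print(is_frameshift("A" * 19, "A"))       # 18 bp deleted
    print(is_frameshift("A" * 26, "C" * 8))   # 26 bp replaced by 8 bp (net -18)
    print(is_frameshift("AT", "A"))           # 1 bp deleted
```

The second case mirrors the EGFR exon 19 example: a 26 bp delins that inserts 8 bp changes the length by 18, so it stays in frame.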

Summary
• Trust but verify your pipeline, alignments, and annotations
• Review positive calls – they may be more complex than called
• Review pertinent negatives
• Use Occam’s razor when approaching complex indels
• Strand bias is useful as a discriminator but does not rule out a true mutation. Knowing about soft-clipping can save you
• Be wary of phasing but know that it can also help you
• Be wary of thresholding artifacts and calling an outlier in the noise as something real