USE OF MINION SEQUENCING FOR REAL-TIME AMR
DETECTION AND BACTERIAL TYPING
Senior scientist
HENRIK HASMAN
Statens Serum Institut (SSI)
Denmark
SEQUENCING OUTPUT – MINION VS MISEQ
Timeline (hours)
MiSeq
• Kits: MiSeq v2 (2*250 bp), MiSeq v3 (2*300 bp), MiSeq v2 Micro/Nano (2*150 bp)
• Raw reads: after Y hours (MiSeq v2: Y = 40 h; MiSeq v3: Y = 60 h; Micro/Nano: Y = 17 h)
• Assembly: Y + 1 h
MinION
• Raw reads: real time
• Rapid assembly: X + 1 h
• Corrected assembly: X + 24 h
• X = 3-18 h (coverage marks at 50x and 100x)
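The turnaround arithmetic on this slide can be written out as a small sketch. The Y and X values are those given on the slide; the function and variable names are illustrative only.

```python
# Hours until raw reads are available (Y) for each MiSeq kit, as given on the slide.
MISEQ_Y_HOURS = {
    "MiSeq v2 (2*250 bp)": 40,
    "MiSeq v3 (2*300 bp)": 60,
    "MiSeq v2 Micro/Nano (2*150 bp)": 17,
}
# Range of MinION sequencing time (X) given on the slide.
MINION_X_HOURS = (3, 18)


def miseq_assembly_hours(y):
    """Assembly is available one hour after the raw reads (Y + 1 h)."""
    return y + 1


def minion_assembly_hours(x):
    """Rapid assembly at X + 1 h, corrected assembly at X + 24 h."""
    return {"rapid": x + 1, "corrected": x + 24}


if __name__ == "__main__":
    for kit, y in MISEQ_Y_HOURS.items():
        print(f"{kit}: assembly after {miseq_assembly_hours(y)} h")
    for x in MINION_X_HOURS:
        hours = minion_assembly_hours(x)
        print(f"MinION, X = {x} h: rapid {hours['rapid']} h, corrected {hours['corrected']} h")
```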
MINION WORKFLOW – 8 HOURS TURNAROUND
Real-time bacterial typing
• SNP calling
• MLST?
• Serotyping
• Species verification (rMLST?)
Real-time genotypic AMR detection
• AMR gene groups → AB classes
• AMR mutations → AB classes
• Species intrinsic resistance profiling?
High-precision plasmid assembly
• Rough plasmid assembly
• Re-basecalling target reads (+ methylation)
• Re-assembly
GRAPHICAL USER INTERFACE (GUI)
Genus
AB classes detected
AB classes not detected
Genome coverage
# of circular molecules
Closely related isolates in database
Run SNP analysis
Extract plasmids
Last change (minutes)
Species
Clonal cluster
GRAPHICAL USER INTERFACE (GUI)
Genus: Klebsiella
Species: n.a.
Clonal cluster: CC23
AB classes detected: Beta-lactams (PEN)
AB classes not detected: Beta-lactams (CEF), Beta-lactams (CARBA), Aminoglycosides, Fluoroquinolones, Phenicols…
Closely related isolates in database: Strain23
Genome coverage: 2.4x
# of circular molecules: 1
Last change (minutes): 0
Actions: Run SNP analysis, Extract plasmids
GRAPHICAL USER INTERFACE (GUI)
Genus: Klebsiella
Species: variicola
Clonal cluster: CC23
AB classes detected: Beta-lactams (PEN), Beta-lactams (CEF), Fluoroquinolones
AB classes not detected: Beta-lactams (CARBA), Aminoglycosides, Phenicols
Closely related isolates in database: Strain23, Strain43, Strain142, Strain155
Genome coverage: 31x
# of circular molecules: 17
Last change (minutes): 6
Actions: Run SNP analysis, Extract plasmids
AMR DETECTION – SMALL EXERCISE
DATA (pre-uploaded to CGE to save time)
1. Complete genome of AMA 1167 including the two AMR plasmids
2. Draft MinION genome of AMA 1167
3. Raw MinION reads (q8+ only) of AMA 1167
Your task is to assess whether MinION data (raw or assembled) can be used to
A) Predict phenotypes
B) Identify specific resistance genes (for surveillance purposes)
AMR DETECTION – BACKGROUND INFO
Methods (CGE)
• ResFinder (Complete genome vs MinION draft genome)
• KMA (Complete genome, MinION draft genome, MinION raw data)
Test strain
AMR DETECTION
ILLUMINA VS MINION DATA
[Figure: Illumina vs MinION data across a repeat area (rRNA, IS, homologous genes, etc.); parts labelled 1, 2 and 3]
RESFINDER COMPARISON
ResFinder 3.2 (https://cge.cbs.dtu.dk/services/ResFinder/)
Complete genome AMA 1167 (1)
https://cge.cbs.dtu.dk//cgi-bin/webface.fcgi?jobid=5D8B012B000014A377E446C1
Draft MinION genome of AMA 1167 (2)
https://cge.cbs.dtu.dk/cgi-bin/webface.fcgi?jobid=5D8B0306000019EF2DF7B1CE
Note:
• The complete genome is regarded as the optimal dataset. However, obtaining it requires extensive sequencing, probably with both short reads (Illumina) and long reads (PacBio or MinION), plus hand curation of “trouble” areas.
• The draft MinION genome has been polished twice with Racon, but will still suffer from the homopolymer and methylation issues of MinION data.
• Only (1) can identify the gyrA point mutations related to fluoroquinolone resistance (activate “ “ on the links above to compare).
KMA COMPARISON
KMA 1.3 (https://cge.cbs.dtu.dk/services/KMA/)
Complete genome AMA 1167 (1)
https://cge.cbs.dtu.dk//cgi-bin/webface.fcgi?jobid=5D8B1368000048A51A203627
Draft MinION genome of AMA 1167 (2)
https://cge.cbs.dtu.dk//cgi-bin/webface.fcgi?jobid=5D8B138E000048F3EA63E221
Raw MinION reads (q8+ only) of AMA 1167 (3)
https://cge.cbs.dtu.dk/cgi-bin/webface.fcgi?jobid=5D8B1300000047580EB71980
Note (compare which genes are found and evaluate the TemplateID scores):
• The complete genome (1) shows essentially the same result in KMA as in ResFinder.
• The draft MinION genome (2) largely agrees with (1), but with lower identity scores (97-99%). Note that blaCMY-2 has changed to blaCMY-149.
• The raw MinION reads (3) show good TemplateID values (99+%), but many sub-variants are also identified. Here, the “Depth” values can be used to identify the most likely sub-variant, but this may still fail.
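The idea of using “Depth” to pick the most likely sub-variant can be sketched as follows. This is a minimal illustration, not part of KMA: the table is an invented excerpt that assumes KMA's tab-separated .res layout with `#Template`, `Template_Identity` and `Depth` columns, and `gene_family` (stripping a trailing `-<number>`) is a hypothetical way of deciding which hits count as sub-variants of each other.

```python
import csv
import io
import re

# Hypothetical excerpt of a KMA result table; all values are invented.
RES_TEXT = """#Template\tTemplate_Identity\tDepth
blaCMY-2\t99.6\t48.1
blaCMY-149\t99.2\t3.2
sul1\t99.9\t51.7
"""


def gene_family(template):
    """Assumed family key: the template name without a trailing '-<number>'."""
    return re.sub(r"-\d+$", "", template)


def most_likely_variants(res_text):
    """For each gene family, keep the template with the highest Depth."""
    best = {}
    for row in csv.DictReader(io.StringIO(res_text), delimiter="\t"):
        family = gene_family(row["#Template"])
        depth = float(row["Depth"])
        if family not in best or depth > best[family][1]:
            best[family] = (row["#Template"], depth)
    return {family: name for family, (name, _) in best.items()}


if __name__ == "__main__":
    print(most_likely_variants(RES_TEXT))
```

With this family key, blaOXA-1 and blaOXA-181 would also compete with each other, so a genuinely present second gene could be discarded; this is one way the depth-based choice may still fail.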
RESFINDER – COMPLETE GENOME (1)
RESFINDER – DRAFT NANOPORE GENOME (2)
KMA – COMPLETE GENOME (1)
KMA – MINION DRAFT GENOME (2)
KMA OUTPUT – RAW MINION DATA (3)
KMA OUTPUT – RAW MINION DATA (3; SORTED)
RESFINDER 4.0 - MINION DRAFT GENOME (2)
SEQUENCING OUTPUT – MINION VS MISEQ
MinION
• Raw reads: real time
• Rapid assembly: X + 1 h
• Corrected assembly: X + 24 h
• X = 3-18 h
Coverage levels marked: 100x, 90x, 75x, 50x, 40x, 25x, 10x, 5x, 4x, 3x, 2x, 1x
’RAW’ RESFINDER HITS
Already at an average coverage of 10x almost all AMR genes have been found; however, there is a lot of noise.
Pink: correct hit (truth)
Orange: missing hit (false negative)
Yellow: extra hit (false positive)
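The colour coding used on these slides corresponds to a simple set comparison between the expected genes and the genes reported at a given coverage. A minimal sketch, with hypothetical gene lists chosen only for illustration:

```python
def score_hits(truth, found):
    """Split hits into correct, missing (false negative) and extra (false positive)."""
    truth, found = set(truth), set(found)
    return {
        "correct": sorted(truth & found),
        "missing": sorted(truth - found),
        "extra": sorted(found - truth),
    }


if __name__ == "__main__":
    # Hypothetical example: one gene reported as a different sub-variant.
    expected = ["blaCMY-2", "sul1", "sul2"]
    reported = ["blaCMY-149", "sul1", "sul2"]
    print(score_hits(expected, reported))
```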
’GENE-GROUP’ RESFINDER HITS
Already at ‘cov_3’ almost all AMR genes have been found. At ‘cov_10’ there is a complete and correct profile with no noise at all. With this grouping sul1 and sul2 end up in the same group, but so do blaOXA-1 and blaOXA-181, which affects the phenotypic predictions based on genotypes.
Simple grouping of genes by the first three letters of the name.
Pink: correct hit (truth)
Orange: missing hit (false negative)
Yellow: extra hit (false positive)
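The three-letter grouping described on this slide is easy to reproduce. The sketch below uses gene names mentioned in this deck; the function names are illustrative.

```python
def gene_group(gene_name):
    """Group a resistance gene by the first three letters of its name."""
    return gene_name[:3]


def group_hits(gene_names):
    """Collapse a list of gene hits into a sorted list of gene groups."""
    return sorted({gene_group(name) for name in gene_names})


if __name__ == "__main__":
    hits = ["sul1", "sul2", "blaOXA-1", "blaOXA-181", "blaCMY-2"]
    print(group_hits(hits))
```

All bla genes collapse into one group, which removes the sub-variant noise but also hides the difference between blaOXA-1 and blaOXA-181.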
PHENOTYPE RESFINDER HITS
Gene hits converted to AMR phenotypes using the ResFinder 4.0 database as a guide.
Same conclusion:
At ‘cov_10’ there is a complete and correct profile with no noise at all.
Pink: correct hit (truth)
Orange: missing hit (false negative)
Yellow: extra hit (false positive)
PHENOTYPE RESFINDER HITS
Gene hits converted to AMR classes using the ResFinder 4.0 database as a guide.
Same conclusion:
At ‘cov_10’ there is a complete and correct profile with no noise at all.
Pink: correct hit (truth)
Orange: missing hit (false negative)
Yellow: extra hit (false positive)
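Converting gene hits to classes amounts to a table lookup followed by a union. The sketch below uses a small illustrative table, which is an assumption and not the ResFinder 4.0 database; the class labels follow the AB classes used earlier in this deck.

```python
# Illustrative lookup only; NOT taken from the ResFinder 4.0 database.
GENE_TO_CLASSES = {
    "sul1": {"Sulphonamides"},
    "sul2": {"Sulphonamides"},
    "blaOXA-1": {"Beta-lactams (PEN)"},
    "blaOXA-181": {"Beta-lactams (PEN)", "Beta-lactams (CARBA)"},
}


def predict_classes(gene_hits, table=GENE_TO_CLASSES):
    """Return (predicted classes, genes missing from the lookup table)."""
    classes = set()
    unknown = []
    for gene in gene_hits:
        if gene in table:
            classes |= table[gene]
        else:
            unknown.append(gene)
    return classes, unknown


if __name__ == "__main__":
    print(predict_classes(["sul1", "blaOXA-181"]))
    # Calling blaOXA-181 as blaOXA-1 would lose the carbapenem prediction.
    print(predict_classes(["sul1", "blaOXA-1"]))
```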
SNP CALLING USING MINION DATA
Challenges
• Raw (basecalled) data from MinION have 5-10% errors.
• Assembled data still have 1-3% errors after polishing (50-130 kbp).
• Standard mappers handle long reads poorly.
• Standard SNP callers are trained/designed for Illumina short reads with a low error rate.
Possible solution
• Use KMA to align MinION reads and build consensus sequences
• Use a modified NDtree (or CSI Phylogeny) to call relevant SNPs
• Build a SNP-calling (web) tool based on KMA and NDtree
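Once every isolate has a consensus sequence built against the same reference, SNP distances reduce to position-wise comparison. The sketch below is a generic illustration under that assumption, not the NDtree implementation: it counts a difference only where both sequences have an unambiguous base, and the toy sequences are invented.

```python
from itertools import combinations

UNAMBIGUOUS = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count differing positions, skipping any position with a non-ACGT call."""
    if len(seq_a) != len(seq_b):
        raise ValueError("consensus sequences must have the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in UNAMBIGUOUS and b in UNAMBIGUOUS and a != b
    )


def distance_matrix(consensus_by_isolate):
    """Pairwise SNP distances for a dict of isolate name -> consensus sequence."""
    return {
        (x, y): snp_distance(consensus_by_isolate[x], consensus_by_isolate[y])
        for x, y in combinations(sorted(consensus_by_isolate), 2)
    }


if __name__ == "__main__":
    toy = {"iso1": "ACGTACGT", "iso2": "ACGTACGA", "iso3": "ACCTNCGA"}
    print(distance_matrix(toy))
```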
ST410 CPE (E. COLI) OUTBREAK
Background
• Carbapenemase-producing ST410 is an “international clone”.
• A clonal variant (CT587) has been introduced into Denmark and has caused a major outbreak (30+ patients).
Test data
• Raw Illumina data
• Raw MinION data
• Complete genome + plasmids of Danish index strain (AMA 1167).
Strain collection (N = 24)
A. 6 ST410 CT587 isolates (“Outbreak”)
B. 6 ST410 non-CT587 isolates (“Outliers”; not part of outbreak)
C. 12 non-ST410 isolates (background)
CORE GENOME MLST (CGMLST)
[Figure: cgMLST clustering with ST410 marked]
TEST STRAINS
[Figure legend: ST410, Non-ST410, Outbreak-ST410]
SNP TOOL – KMA-BASED (NDTREE V2.0)
KMA chooses reference
• AMA1167 identified as best reference (index isolate of outbreak)
• A future version will allow a user-supplied reference
• KMA performs alignment of reads
NDtree v2.0 performs SNP filtering based on QC scores
• Different error settings depending on Illumina or MinION data
• Possible to prune closely positioned SNPs
Visualization through FigTree
• Output as PDF and Newick files
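Pruning of closely positioned SNPs can be illustrated with a short sketch. The rule used here (drop every SNP that has a neighbour closer than `min_dist`) and the default of 10 bp are assumptions for illustration; the actual NDtree v2.0 rule and settings are not given on the slide.

```python
def prune_close_snps(positions, min_dist=10):
    """Keep only SNP positions with no neighbouring SNP closer than min_dist."""
    pos = sorted(positions)
    kept = []
    for i, p in enumerate(pos):
        near_prev = i > 0 and p - pos[i - 1] < min_dist
        near_next = i + 1 < len(pos) and pos[i + 1] - p < min_dist
        if not (near_prev or near_next):
            kept.append(p)
    return kept


if __name__ == "__main__":
    # Hypothetical SNP positions on the reference.
    print(prune_close_snps([100, 105, 300, 1000, 1004, 1008, 5000]))
```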
CSI PHYLOGENY AS BENCHMARK
6 + 6 ST410 – CSI PHYLOGENY (ILLUMINA)
[Figure: CSI Phylogeny tree of the 12 ST410 isolates; scale 0.2]
Isolates: CPO20180039, CPO20150034, CPO20150054, CPO20150014, CPO20160077_minION, CPO20160003, CPO20180100, CPO20180105, CPO20180119, CPO20180108, CPO20150011, CPO20170014
Clusters labelled: CT587, CT596, CT611, CT512, CT278
6 + 6 ST410 – CSI PHYLOGENY (ILLUMINA)
NDTREE V1.2 V2.0 (KMA BASED)
2.0
6 + 6 ST410 NDTREE V2.0 (ILLUMINA)
Clusters labelled: CT587, CT596, CT611, CT512, CT278, CT527, CT523
6 + 6 ST410 NDTREE V2.0 (MINION)
Clusters labelled: CT587, CT596, CT611, CT512, CT278, CT527, CT523
6 + 6 ST410 NDTREE V2.0 (ALL)
Clusters labelled: CT587, CT596, CT611, CT512, CT278, CT527, CT523
6 + 6 ST410 NDTREE V2.0 ILLUMINA + MINION
Illumina data are always 23 SNPs away from the corresponding MinION data.
All of these positions relate to dcm methylation sites and can therefore be filtered out in the analysis, or resolved by using the new version of the Guppy basecaller.
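Filtering dcm-related positions can be sketched by masking SNPs that fall inside the Dcm recognition motif CCWGG (CCAGG or CCTGG) on the reference. Masking all five bases of each motif is an assumption made for this sketch, and the reference string and SNP positions are invented.

```python
import re

# Dcm methylates cytosine in the CCWGG context (W = A or T).
DCM_MOTIF = re.compile(r"CC[AT]GG")


def dcm_positions(reference):
    """Return the 0-based reference positions covered by any CCWGG motif."""
    covered = set()
    for match in DCM_MOTIF.finditer(reference.upper()):
        covered.update(range(match.start(), match.end()))
    return covered


def filter_dcm_snps(snp_positions, reference):
    """Drop SNP positions that lie inside a CCWGG motif of the reference."""
    masked = dcm_positions(reference)
    return [p for p in snp_positions if p not in masked]


if __name__ == "__main__":
    toy_reference = "AAACCAGGTTTCCTGGAAA"
    print(filter_dcm_snps([1, 4, 9, 12, 17], toy_reference))
```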
6 + 6 ST410 NANOPORE – OTHER REFERENCE*
*NZ_CP031653.1_Ecoli_UK_Dog_Liverpool
6 + 6 ST410 NANOPORE – DRAFT GENOME
6 + 6 ST410 MINION – NANOPORE REFERENCE
FULL FORCE - EJP AMR PROJECT 2020