A training and research lab for Nanopore sequencing
Data analysis for SME-derived use cases
PoreLab
PoreLab Project: explore Nanopore sequencing needs within SMEs
Applications: Genomics, Metagenomics, Transcriptomics
Data analysis: Basecalling, Quality Control, Assembly, Classification, Transcriptomics, Conclusions
Pitch by NIOZ
Pitch by Dutch DNA Biotech
Pitch by NSure
Nanopore Technology
How accurate is basecalling?
Comparison of Oxford Nanopore basecalling tools. Ryan R. Wick, Louise M. Judd and Kathryn E. Holt. doi: 10.5281/zenodo.1188469
88%
Polishing without methylation awareness
Polishing with methylation awareness
99.7%
Figure panels: read identity, assembly identity, relative read length
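To put the two percentages on this slide on the same scale as the Q scores used in the QC step, an identity can be converted to a Phred-scaled value with Q = -10 · log10(1 - identity). The sketch below is only an illustration of that standard formula; the function names are hypothetical and not taken from any basecalling tool.

```python
import math


def identity_to_phred(identity_percent: float) -> float:
    """Convert a percent identity into a Phred-scaled quality value."""
    error_rate = 1.0 - identity_percent / 100.0
    if error_rate <= 0.0:
        raise ValueError("identity must be below 100% for a finite Q value")
    return -10.0 * math.log10(error_rate)


def phred_to_identity(q: float) -> float:
    """Convert a Phred-scaled quality value back into a percent identity."""
    return 100.0 * (1.0 - 10 ** (-q / 10.0))


if __name__ == "__main__":
    # The two percentages shown on the slide.
    for pct in (88.0, 99.7):
        print(f"{pct}% identity is roughly Q{identity_to_phred(pct):.1f}")
```

On this scale the gap between 88% and 99.7% is about 16 Q units, which is why consensus accuracy and per-read accuracy are usually reported separately.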
QC of data
• Sequencing process
• Sequencing data: read distribution, Q score, read statistics (NanoPack)
NanoPack: visualizing and processing long-read sequencing data. De Coster W, D'Hert S, Schultz DT, Cruts M, Van Broeckhoven C. Bioinformatics. 2018 Aug 1;34(15):2666-2669. doi: 10.1093/bioinformatics/bty149.
Number of reads and fraction above quality cutoffs
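The statistic named above (number of reads and fraction above quality cutoffs) can be sketched in a few lines. This is a minimal illustration, not the NanoPack implementation: it assumes Phred+33 encoded FASTQ quality strings, averages error probabilities rather than raw Q values, and uses hypothetical function names and cutoffs.

```python
import math
from typing import Dict, Iterable, List


def mean_read_quality(quality_string: str, offset: int = 33) -> float:
    """Mean Q score of one read, averaged over error probabilities."""
    if not quality_string:
        raise ValueError("empty quality string")
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def fraction_above(qualities: Iterable[float],
                   cutoffs: Iterable[float] = (5, 7, 10, 12, 15)) -> Dict[float, float]:
    """Fraction of reads whose mean quality exceeds each cutoff."""
    values: List[float] = list(qualities)
    if not values:
        raise ValueError("no reads given")
    return {c: sum(q > c for q in values) / len(values) for c in cutoffs}


if __name__ == "__main__":
    # Hypothetical quality strings for three reads.
    reads = ["5555", "+5+5", "''''"]
    means = [mean_read_quality(q) for q in reads]
    print(len(means), "reads")
    print(fraction_above(means))
```

Averaging probabilities instead of Q values keeps a few very poor bases from being hidden by many good ones, so the mean of a mixed read comes out lower than the arithmetic mean of its Q scores.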
Genomics: Assembly Workflow
• Base Calling (Albacore/Guppy): raw signal → DNA reads
• Read-to-Read Finding (Minimap2, OLC assembler): DNA reads → overlaps
• Assembly (Miniasm, MECAT): overlaps → raw assembly
• Read Mapping (Minimap2, BWA-MEM): raw assembly → mapping of reads to draft assembly
• Polishing (RACON, Nanopolish): mapped reads and draft assembly → optimal assembly
Senol Cali D. et al. Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions. Brief Bioinform. 2018
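The steps after basecalling can be sketched as a list of command lines, here choosing Minimap2, Miniasm and RACON from the tools named on the slide. This is a sketch under assumptions, not the pipeline used in the project: the file names and the helper names `assembly_steps` and `gfa_to_fasta` are hypothetical, each tool is assumed to be installed and to write its result to standard output, and the commands are only built, not executed.

```python
from typing import List, Tuple

# One step = (command line, file that should capture the command's stdout).
Step = Tuple[List[str], str]


def assembly_steps(reads: str, prefix: str = "asm") -> List[Step]:
    """Build the overlap, assembly, mapping and polishing commands."""
    overlaps = f"{prefix}.overlaps.paf"
    graph = f"{prefix}.gfa"
    draft = f"{prefix}.draft.fasta"   # produced from the GFA by gfa_to_fasta
    mapping = f"{prefix}.mapping.paf"
    polished = f"{prefix}.polished.fasta"
    return [
        # Read-to-read overlap finding (all-vs-all, Nanopore preset).
        (["minimap2", "-x", "ava-ont", reads, reads], overlaps),
        # Assembly of the overlaps into a raw assembly graph.
        (["miniasm", "-f", reads, overlaps], graph),
        # Mapping the reads back to the draft assembly.
        (["minimap2", "-x", "map-ont", draft, reads], mapping),
        # Polishing the draft with the mapped reads.
        (["racon", reads, mapping, draft], polished),
    ]


def gfa_to_fasta(gfa_text: str) -> str:
    """Extract segment (S) lines of a GFA file as FASTA records."""
    records = []
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S" and len(fields) >= 3:
            records.append(f">{fields[1]}\n{fields[2]}")
    return "\n".join(records) + ("\n" if records else "")


if __name__ == "__main__":
    for command, output in assembly_steps("reads.fastq"):
        print(" ".join(command), ">", output)
```

Keeping the commands as data makes it easy to swap one tool for another from the same row of the workflow, for example BWA-MEM for the mapping step.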
Genomics
• Number of contigs
• Total length
• N50
• Q value
Pitch by Dutch DNA Biotech
Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013 Apr 15;29(8):1072-5. doi: 10.1093/bioinformatics/btt086. Epub 2013 Feb 19.
• % GC
• Coverage
• % identity
• % Genome fraction if reference is available
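Two of the assembly metrics on this slide, N50 and % GC, are simple enough to compute by hand. The sketch below illustrates the usual definitions (N50 as the contig length at which the longest contigs together cover at least half of the total assembly length); it is not QUAST's code, and the function names are hypothetical.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Length of the contig at which half of the total length is reached."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return lengths[-1]  # not reached for positive lengths


def gc_percent(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases of a sequence."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


if __name__ == "__main__":
    # Hypothetical contig lengths and sequence.
    print("N50:", n50([400, 300, 200, 100]))
    print("GC%:", gc_percent("ACGTGGCC"))
```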
Shotgun Metagenomics: Classification
Index of genomes, taxonomic relationship, genome/reads
Winclove, Soiltech, GroenAgro, BKD, pitch by NIOZ
Data analysis transcriptomics
• Raw Nanopore reads
• QC (NanoPlot)
• PoreChop demultiplex → reads
• Align against reference: minimap2
• Transcript quantification
• Statistical analysis: edgeR/limma
• Significant up/down regulation
Transcriptomics – Read mapping
Reads of low quality align to a lesser extent.
Filtering based on quality is not a good idea.
Differential expression
21
Transcriptomics – Coverage plot
Transcript UN79269 shows approximately 6-fold higher expression in BC09 than in BC08
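A fold change between two barcodes, such as the one reported for this transcript, can be illustrated with library-size normalised counts. The sketch below is a plain counts-per-million comparison, not the statistical model that edgeR or limma fits; the read counts and library sizes are invented for illustration and the function names are hypothetical.

```python
import math


def counts_per_million(count: float, library_size: int) -> float:
    """Scale a raw read count by the total number of reads in the sample."""
    if library_size <= 0:
        raise ValueError("library size must be positive")
    return count * 1_000_000 / library_size


def log2_fold_change(count_ref: float, lib_ref: int,
                     count_test: float, lib_test: int,
                     pseudocount: float = 0.5) -> float:
    """log2 of the CPM ratio test/reference, with a small pseudocount."""
    cpm_ref = counts_per_million(count_ref + pseudocount, lib_ref)
    cpm_test = counts_per_million(count_test + pseudocount, lib_test)
    return math.log2(cpm_test / cpm_ref)


if __name__ == "__main__":
    # Invented example: 50 reads in the reference barcode, 600 in the test
    # barcode, with the test library sequenced twice as deeply.
    lfc = log2_fold_change(50, 100_000, 600, 200_000, pseudocount=0.0)
    print(f"log2 fold change: {lfc:.2f}, fold change: {2 ** lfc:.1f}x")
```

The pseudocount keeps the ratio defined when a transcript has zero reads in one sample; a full analysis would also need replicates to judge significance.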
Conclusion
Basecalling is very accurate for consensus sequences, but less so on a per-read basis.
Genomics data analysis of micro-organisms yields good assemblies, detects transgenes in the genome, and can be combined with Illumina data.
Shotgun metagenomics enables accurate detection of organisms, but the ratios are an approximation.
Transcriptomics is still in its infancy but will have a great future after some adjustments in wet lab procedures.
Acknowledgement
FG: Ron Dirks, Hans Jansen
Hogeschool Leiden: Floyd Wittink, Koen Bossers, Nikola Petrusevski, Stef Pieterman, Jasper Oudenkerk
Consortium partners PoreLab
Jeroen Pijpe, Brian Piepenbroek, Casper Prins, Stef Janson