Ken McGrath - Next Gen Sequencing - Game of Thrones edition

Post on 10-May-2015

description

Title: Next-generation sequencing: an overview of technologies and applications

Presenter: Dr Ken McGrath, Australian Genome Research Facility

Abstract: The “Next-Generation Sequencing” landscape is one of constant change, with new and emerging technologies continually competing with established platforms. This competition is producing faster and cheaper methods for sequencing DNA and RNA samples, but it also brings a confusing array of options, each with its own strengths and weaknesses. Ken gives an overview of the available sequencing technologies, runs through some example projects that can be run on them, describes the typical bioinformatics approaches for these projects, and takes a look at what’s “next” in Next-Gen.

First presented at the 2014 Winter School in Mathematical and Computational Biology: http://bioinformatics.org.au/ws14/program/

transcript

Next-Generation Sequencing: an overview of technologies and applications

July 2014

Ken McGrath, Australian Genome Research Facility

Next-Gen Sequencing Edition

• Current rulers of the “throne”

• Sequencing by synthesis

• Each cycle extends and reads a single base

• Reads of up to 2x300bp

Illumina Sequencing Technology: Robust Reversible Terminator Chemistry Foundation

[Figure: Illumina workflow. DNA (0.1-1.0 µg) → sample preparation → cluster growth → sequencing (cycles 1 to 9) → image acquisition → base calling (T G C T A C G A T …)]
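The image acquisition and base calling steps can be illustrated with a toy example. This is only a sketch, not Illumina's actual base caller: it assumes one intensity value per channel (A, C, G, T) per cycle for a single cluster, and simply picks the brightest channel. The function name `call_bases` and the intensity values are made up for illustration.

```python
# Toy base caller: one base per cycle, chosen as the brightest channel.
# Channel order and intensity values are illustrative assumptions.
CHANNELS = "ACGT"


def call_bases(intensities):
    """Return the read implied by per-cycle (A, C, G, T) intensities."""
    calls = []
    for cycle in intensities:
        if len(cycle) != len(CHANNELS):
            raise ValueError("expected one intensity per channel (A, C, G, T)")
        # Index of the channel with the highest signal in this cycle.
        brightest = max(range(len(CHANNELS)), key=lambda i: cycle[i])
        calls.append(CHANNELS[brightest])
    return "".join(calls)


# Hypothetical intensities for one cluster over four cycles.
cluster = [
    (0.1, 0.0, 0.2, 0.9),  # brightest channel: T
    (0.1, 0.1, 0.8, 0.1),  # brightest channel: G
    (0.0, 0.9, 0.1, 0.2),  # brightest channel: C
    (0.2, 0.1, 0.0, 0.7),  # brightest channel: T
]

read = call_bases(cluster)
print(read)
```

Because each cycle adds exactly one reversibly terminated base, the read length equals the number of cycles imaged.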

Illumina

MiSeq

NextSeq 500

HiSeq 2500

Illumina X Ten

ILLUMINA SEQUENCING SYSTEMS

• NextSeq 500: 150 bp paired end reads, ~120 Gbp/run (~1 day)
• Illumina HiSeq 2500 Rapid SBS: 150 bp paired end reads, ~180 Gbp/run (2 days)
• Illumina HiSeq 2500 v4 SBS: 125 bp paired end reads, ~1000 Gbp/run (6 days)
• MiSeq v3: 300 bp paired end reads, ~15 Gb/run (2.3 days)
• HiSeq X Ten: 150 bp paired end reads, ~1800 Gb/run (3 days)
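To put the per-run yields above in context, a rough coverage calculation helps. The sketch below is illustrative only: the human genome size of about 3.1 Gbp and the 30x target coverage are assumptions introduced here, not figures from the slides, and the calculation ignores duplicates, low-quality reads and uneven coverage. `RUN_YIELD_GBP`, `mean_coverage` and `genomes_per_run` are hypothetical names.

```python
# Rough coverage arithmetic for the run yields quoted on the slide.
HUMAN_GENOME_GBP = 3.1   # assumption: approximate human genome size
TARGET_COVERAGE = 30     # assumption: a commonly used target depth

# Approximate yield per run in gigabases, as listed on the slide.
RUN_YIELD_GBP = {
    "NextSeq 500": 120,
    "HiSeq 2500 Rapid SBS": 180,
    "HiSeq 2500 v4 SBS": 1000,
    "MiSeq v3": 15,
    "HiSeq X Ten": 1800,
}


def mean_coverage(yield_gbp, genome_gbp):
    """Average depth if the whole run is spent on one genome."""
    return yield_gbp / genome_gbp


def genomes_per_run(yield_gbp, genome_gbp=HUMAN_GENOME_GBP,
                    coverage=TARGET_COVERAGE):
    """Whole genomes that fit in one run at the target coverage."""
    return int(yield_gbp // (genome_gbp * coverage))


for platform, gbp in RUN_YIELD_GBP.items():
    depth = mean_coverage(gbp, HUMAN_GENOME_GBP)
    count = genomes_per_run(gbp)
    print(f"{platform}: ~{depth:.0f}x on one genome, "
          f"{count} genome(s) at {TARGET_COVERAGE}x")
```

The same arithmetic works for any organism: swap in the genome size and the depth the project needs.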

ILLUMINA SEQUENCING SYSTEMS

MiSeq v2

• 10-15 million pass filter clusters per run
• 50 cycles: 50 bp single reads (0.5 – 0.75 Gb/run), ~6 hrs, ≥ 90% bases higher than Q30 at 50 bp
• 300 cycles: 2x150 bp paired end reads (3.0 – 4.5 Gb/run), ~24 hrs, ≥ 80% bases higher than Q30 at 2x150 bp
• 500 cycles: 2x250 bp paired end reads (5.0 – 7.5 Gb/run), ~40 hrs, ≥ 75% bases higher than Q30 at 2x250 bp

MiSeq v3

• 20-25 million pass filter clusters per run
• 150 cycles: 2x75 bp paired end reads (3.0 – 2.5 Gb/run), ~20 hrs, ≥ 85% bases higher than Q30 at 2x75 bp
• 600 cycles: 2x300 bp paired end reads (12.0 – 15.0 Gb/run), ~55 hrs, ≥ 70% bases higher than Q30 at 2x300 bp
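Two pieces of arithmetic sit behind these specifications. A Phred quality score Q corresponds to an error probability of 10^(-Q/10), so Q30 means roughly one wrong base call in 1000. The yield ranges also appear consistent with pass-filter clusters multiplied by the number of cycles, for example 20-25 million clusters over 600 cycles gives 12-15 Gb. The sketch below shows both; the function names are hypothetical, and the yield formula is a simplifying assumption that every pass-filter cluster produces one base per cycle.

```python
import math


def phred_to_error_probability(q):
    """Error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred quality score implied by an error probability."""
    return -10 * math.log10(p)


def run_yield_gb(clusters_millions, cycles):
    """Yield in gigabases, assuming one base per cluster per cycle."""
    return clusters_millions * 1e6 * cycles / 1e9


# Q30 corresponds to a 1-in-1000 chance that a base call is wrong.
print(phred_to_error_probability(30))

# MiSeq v3, 600 cycles, 20-25 million pass-filter clusters.
print(run_yield_gb(20, 600), run_yield_gb(25, 600))

# MiSeq v2, 500 cycles, 10-15 million pass-filter clusters.
print(run_yield_gb(10, 500), run_yield_gb(15, 500))
```

Note that "≥ 70% bases higher than Q30" is a statement about the share of bases above that threshold, not about the average error rate of the run.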

Illumina Summary

Strengths

• Lots of data
• Low error rates
• Great choice of platform sizes
• Paired-end reads
• Pretty awesome

Weaknesses

• Too much data
• Slower run times
• Shorter reads
• Slept with brother

• Competing with Illumina for market share

• Two technologies (sequencing by ligation, and semiconductor sequencing)

• Reads of up to 400bp

Ion Torrent

• Ion Semiconductor Sequencing

• Detection of hydrogen ions released during the polymerization of DNA

• Sequencing occurs in microwells with ion (pH) sensors

– No modified nucleotides

– No optics

Ion Torrent

• DNA → Ions → Sequence

– Nucleotides flow sequentially over Ion semiconductor chip

– One sensor per well per sequencing reaction

– Direct detection of natural DNA extension

– Millions of sequencing reactions per chip

– Fast cycle time, real time detection
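The idea of nucleotides flowing sequentially over the chip can be sketched in a few lines. This is a simplified model, not the vendor's signal processing: it assumes a fixed flow order of T, A, C, G, treats `read` as the strand being synthesized (the real template is its complement), and reports an idealised signal equal to the number of bases incorporated in each flow. `simulate_flows` is a hypothetical name.

```python
def simulate_flows(read, flow_order="TACG", max_flows=400):
    """Return (nucleotide, bases_incorporated) for each flow.

    `read` is the sequence of the growing strand. Unmodified nucleotides
    have no terminator, so a homopolymer run is incorporated in a single
    flow and releases proportionally more hydrogen ions.
    """
    flows = []
    pos = 0
    for i in range(max_flows):
        if pos >= len(read):
            break  # the whole read has been synthesized
        nucleotide = flow_order[i % len(flow_order)]
        count = 0
        # Extend for as long as the next base matches the flowed nucleotide.
        while pos < len(read) and read[pos] == nucleotide:
            count += 1
            pos += 1
        flows.append((nucleotide, count))
    return flows


for nucleotide, count in simulate_flows("TTAGC"):
    print(nucleotide, count)
```

A flow that reports 0 means the flowed nucleotide did not match the next base, so no ions were released in that well.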

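[Figure: Ion Torrent sensor schematic. dNTP incorporation releases H+ in the well; the sensing layer above the sensor plate registers ∆pH → ∆Q → ∆V, and the signal goes to the column receiver. The sensor sits on a silicon substrate with drain, source and bulk terminals.]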

Life Technologies

SOLiD

Ion Torrent PGM

Ion Torrent Proton

Ion Torrent Chips

PGM

• 314 Chip: 200bp and 400bp reads, 30-100 Mb/run (1.5 hrs)
• 316 Chip: 200bp and 400bp reads, 300-1000 Mbp/run (2 hrs)
• 318 Chip: 200bp and 400bp reads, 600 Mb-2 Gbp/run (4.5 hrs)

Proton

• P1 Chip: 200 bp reads, 5-10 Gbp/run
• P2 Chip: 100 bp reads, ~20 Gbp/run (Coming soon!)

Life Technologies Summary

Strengths

• Fast run times
• Scalable data outputs
• Longer reads (400bp)
• Pretty

Weaknesses

• Lower maximum data output
• Read quality can vary
• Haven’t done much recently

• One of the first NGS platforms

• Pyrosequencing based

• Each flow allows extension by a single nucleotide type (A, C, G or T)

• Reads up to 800bp

454 Pyrosequencing

454: Data Processing

[Figure: processing pipeline. Raw image files (one per T, A, C and G base flow) → image processing → base-calling → quality filtering → SFF file]
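The base-calling step in this pipeline turns per-flow signal intensities (a flowgram) into bases. The sketch below is a deliberately naive version, not Roche's algorithm: it assumes the flow order T, A, C, G shown in the figure, assumes signals are already normalised so that 1.0 means one incorporated base, and simply rounds each signal to the nearest whole number. `call_flowgram` and the example values are made up for illustration.

```python
def call_flowgram(flow_values, flow_order="TACG"):
    """Convert normalised flow signals into a base sequence.

    Each signal is rounded to the nearest integer and read as the
    homopolymer length for the nucleotide flowed at that step.
    """
    bases = []
    for i, signal in enumerate(flow_values):
        homopolymer_length = max(0, round(signal))
        nucleotide = flow_order[i % len(flow_order)]
        bases.append(nucleotide * homopolymer_length)
    return "".join(bases)


# Hypothetical noisy flowgram: T, A, C, G, T, A, C
flowgram = [2.04, 0.97, 0.08, 1.10, 0.02, 0.13, 0.96]
print(call_flowgram(flowgram))
```

Signals that fall close to a half value (say 4.5) are ambiguous under this scheme, which is one way to see why long homopolymer runs are harder to call in flow-based chemistries.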

GS-FLX

Roche

FLX Jr

Roche

• Not over yet…

Stratos Genomics, Genia, something else?

Roche Summary

Strengths

• Long reads (up to 800bp)
• Had wolves

Weaknesses

• High $ per base
• Older technology
• Platform soon unavailable
• Pretty much dead

• Single-molecule real-time sequencing (SMRT)

• Detection of individual bases as they extend (by light emission)

• Long Reads (up to 4x2.5kb)

PacBio

• Higher error rates (raw read accuracy ~90%)

• Compensate by “looping” DNA to create multiple passes
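Why multiple passes help can be shown with a simple majority vote. This is only a sketch of the general idea, not PacBio's consensus algorithm: it assumes the passes over the looped molecule have already been aligned to the same length and contain only substitution errors, whereas real reads also carry insertions and deletions. The name `consensus` and the example reads are hypothetical.

```python
from collections import Counter


def consensus(passes):
    """Majority-vote consensus over pre-aligned passes of one molecule."""
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("this sketch assumes aligned passes of equal length")
    result = []
    # Walk the passes column by column and keep the most common base.
    for column in zip(*passes):
        base, _count = Counter(column).most_common(1)[0]
        result.append(base)
    return "".join(result)


# Five hypothetical passes over the same insert, three with one error each.
passes = [
    "ACGTTGCA",
    "ACGATGCA",  # substitution at position 3
    "ACGTTGCA",
    "TCGTTGCA",  # substitution at position 0
    "ACGTTGGA",  # substitution at position 6
]
print(consensus(passes))
```

Because the errors are largely random, they rarely hit the same position in most passes, so the vote recovers the underlying sequence as the number of passes grows.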

PacBio

Zero-Mode Waveguides (ZMW)

PacBio Summary

Strengths

• Long reads (4x2.5kb)
• Single-molecule detection
• Capable of Epigenetics
• Freakin’ Dragons!

Weaknesses

• High $ per base
• Higher error rate
• Still to prove itself
• Keeps losing dragons

Oxford Nanopore

• Direct detection of individual bases as they pass through a “nanopore”

• MinION and GridION

• No synthesis/extension

• Capable of VERY Long Reads (>100kb)

Oxford Nanopore Summary

Strengths

• Extra-Long reads (>100kb)
• Single-molecule detection
• Capable of Epigenetics
• Very cost effective
• Exotic and powerful

Weaknesses

• Not yet available (alpha testing)
• Very high error rates
• Immature platform
• Steal babies

NGS Applications

• Whole genome sequencing (today)
» De novo assembly
» Structural variant detection
» Comparative genomics

• RNAseq (later today)
» Gene expression
» Splice variants
» Transcriptomics
» MicroRNA

• Epigenomics (tomorrow)
» Indirect (bisulphite)
» Direct

• Targeted sequencing (Wed)
» Hybrid capture
» Amplicon resequencing

Data Quality

Read Length

Yield/Coverage

Hodor! (Thank You)