
Current Sequencing Technologies and Data Generation Corbin Jones & Piotr Mieczkowski Department of...

Date post: 19-Jan-2016
Current Sequencing Technologies and Data Generation Corbin Jones & Piotr Mieczkowski Department of Biology, College of Arts and Sciences, Carolina Center for Genome Sciences Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill
Transcript
Page 1: Current Sequencing Technologies and Data Generation Corbin Jones & Piotr Mieczkowski Department of Biology, College of Arts and Sciences, Carolina Center.

Current Sequencing Technologies and Data Generation

Corbin Jones & Piotr Mieczkowski
Department of Biology, College of Arts and Sciences, Carolina Center for Genome Sciences
Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill

Page 2:

Library prep Sequencing

Page 3:

Library prep workflow (diagram)

Sample flow: Sample Submission, Sonication, End Repair, Adenylation, Adapter Ligation, Size Selection, PCR, 15 nM dilution, Pooling for multiplexing, Facility

QC (shown at the Sonication and PCR steps): 3 µl of sample taken for each QC; results go to LIMS as data flow: sample failed y/n, concentration, size.

Other diagram labels: leftover sample, dilution, SAMPLE.

Legend: sample flow (transfer to new plate/tube); sample flow (same plate); data flow.
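The 15 nM dilution step relies on converting the concentration and size recorded at QC into molarity. The following is a minimal sketch of that arithmetic, assuming the common approximation of 660 g/mol per base pair of double-stranded DNA. The function names, the 20 µl final volume, and the example values are hypothetical and do not come from the slides.

```python
def library_molarity_nM(conc_ng_per_ul, mean_size_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA library concentration (ng/µl) to nM.

    Assumes ~660 g/mol per base pair, a standard approximation.
    """
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_size_bp)


def dilution_volumes(conc_ng_per_ul, mean_size_bp, target_nM=15.0, final_volume_ul=20.0):
    """Return (library_ul, diluent_ul) from C1 * V1 = C2 * V2."""
    stock_nM = library_molarity_nM(conc_ng_per_ul, mean_size_bp)
    if stock_nM < target_nM:
        raise ValueError("library is already below the target molarity")
    library_ul = target_nM * final_volume_ul / stock_nM
    return library_ul, final_volume_ul - library_ul


if __name__ == "__main__":
    # Hypothetical QC result: 9.9 ng/µl at a mean fragment size of 400 bp.
    print(library_molarity_nM(9.9, 400))
    print(dilution_volumes(9.9, 400))
```

With these hypothetical inputs the library is 37.5 nM, so 8 µl of library plus 12 µl of diluent gives 20 µl at 15 nM.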

Page 4:

Manual and Semiautomation in HTSF library prep workflow

Sonication

Sage – Pippin – Automated size selection system

Automated library size selection

Magnetic beads DNA size selection

Page 5:

Automation in HTSF

Tecan – Freedom Evo system – 8 tip

- 2x48 (96) samples per week – DNA library prep
- Automated sample normalization steps
- PCR and qPCR preparation
- Reagent distribution
- Can be adapted to small and medium scale protocols for Illumina and Ion Torrent
- We have all necessary components for DNA/RNA extraction using Qiagen kits

Caliper – Sciclone system, 96 tip pipetting head (8-96 samples per run)

- TruSeq DNA library preparation
- TruSeq Exome Enrichment
- SureSelect Agilent DNA capture and library preparation

Page 6:

NEXT-GENERATION SEQUENCING (DEEP SEQUENCING) PLATFORMS

o Short reads
1. Genome Analyzer IIx (GAIIx), HiSeq2000, HiSeq2500, MiSeq – Illumina
2. SOLiD 5500xl System – Applied Biosystems
3. HeliScope™ Single Molecule Sequencer – Helicos

o Long reads
1. Genome Sequencer FLX System (454) – Roche
2. PacBio RS – Pacific Biosciences
3. Personal Genome Machine, Ion Proton – Ion Torrent
4. GridION – Oxford Nanopore

o Mapping sequences to large DNA fragments
• NABsys
• BioNanomatrix

Page 7:

UNC – HTSF
• 9 HiSeq 2000/2500
• 1 GA II
• PacBio
• Ion Torrent
• MiSeq (Jeff Dangl)

Liz Buda and Donghui Tan

Also on campus:
• 454 (Microbiome)
• 454 Jr. (Viral genomics)
• MiSeq – Kevin Weeks

Page 8:

What type of sequencing should I choose for the Illumina sequencing project?

HiSeq 2000/2500 – 100-160 million single end sequencing reads per lane.

- ChIPseq – Single End 50 cycles (2-3 human samples per lane)
- RNAseq – Single End 50 cycles (2-3 human samples per lane). If you are interested in splicing variants and fusion genes, both Single End 100 cycles and Paired End 2x50 cycles are better options.
- Whole Genome Sequencing – Paired End 2x100 cycles (2-3 lanes per genome)
- Exome Capture – Paired End 2x100 cycles (4 samples per lane)

MiSeq – 3-7 million single end sequencing reads per lane. Custom projects, fast turnaround.

- Metagenomics, 16S profile – Paired End 2x150 cycles, up to 24 samples per lane
- Whole Microbial Genome Sequencing – Paired End 2x150 cycles
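The lane recommendations above follow from simple yield arithmetic. The sketch below estimates expected coverage and reads per sample. It assumes that the quoted "single end reads per lane" equals the number of clusters, so a paired-end run yields two reads per cluster, and it assumes a human genome size of about 3.1 Gb. The function names are hypothetical.

```python
def expected_coverage(clusters_per_lane, read_length, lanes=1,
                      paired_end=True, genome_size_bp=3.1e9):
    """Average depth = total sequenced bases / genome size.

    Assumes every read is full length and usable, so this is an upper estimate.
    """
    reads_per_cluster = 2 if paired_end else 1
    total_bases = clusters_per_lane * reads_per_cluster * read_length * lanes
    return total_bases / genome_size_bp


def reads_per_sample(reads_per_lane, samples_per_lane):
    """Even split of a lane's reads across multiplexed samples."""
    return reads_per_lane / samples_per_lane


if __name__ == "__main__":
    # Whole genome, Paired End 2x100 cycles, 3 lanes at 160 million clusters each.
    print(expected_coverage(160e6, 100, lanes=3))
    # ChIPseq, 3 samples sharing a lane of 100 million reads.
    print(reads_per_sample(100e6, 3))
```

Under these assumptions, three lanes at the upper end of the quoted yield give about 31x on a human genome, which is in line with the 2-3 lanes per genome recommendation.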

Page 9:

SHORT READ PLATFORMS at UNC

HiSeq 2000

Initially capable of up to 600 Gb per run in 13 days.

Cost of resequencing one human genome (30x coverage):
- Now for UNC PIs – about $6,000
- Now for outside of UNC – about $9,000

HiSeq 2500

Initially capable of up to 100 Gb per run in 27 hours. Cost per genome – ???
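To compare the two instruments on a common scale, the quoted run yields can be normalized to output per day. This is a small sketch using only the figures on this slide; the function name is hypothetical, and the calculation ignores setup time between runs.

```python
def gb_per_day(yield_gb, run_hours):
    """Normalize a run's yield to gigabases per 24 hours of run time."""
    return yield_gb * 24.0 / run_hours


if __name__ == "__main__":
    print(gb_per_day(600, 13 * 24))  # HiSeq 2000: 600 Gb in 13 days
    print(gb_per_day(100, 27))       # HiSeq 2500: 100 Gb in 27 hours
```

This works out to roughly 46 Gb per day for the HiSeq 2000 figure and roughly 89 Gb per day for the HiSeq 2500 figure.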

Page 10:

MiSeq

- Small capacity system. PE 2x150 cycles in 27 hours.

- PE 2 x 250bp coming soon – error rate for read 1 less than 1%; read 2 about 1.2%.

- In preparation – PE 2 x 400bp – error rate for read 1 about 2%; read 2 about 4%.

- In preparation – longer insert size possible, 1.5kb

Page 11:

PacBio RS

Single molecule resolution in real time

• Short waiting time for result and simple workflow
– Generate basecalls in <1 day
– Polymerase speed ≥1 base per second

• No amplification required
– Bias not introduced
– More uniform coverage

• Direct observation
– Distinguish heterogeneous samples
– Simultaneous kinetic measurements

• Long reads
– Identify repeats and structural variants
– Less coverage required

• Information content
– One assay, multiple applications: genetic variation (SVs to SNPs), methylation, enzymology

C2 chemistry – installed March 2012
- Long reads 6-10kb
- Median size of molecules 3kb
- Still 15% error rate
- No strobe sequencing

Software focus on:
- De novo assembly
- High quality CCS consensus reads

In preparation
- Load long molecules by magnetic beads
- Modified nucleotides detection

Page 12:

PacBio RS – two sequencing modes

Sample Preparation

Standard: LS – long sequencing reads
• Large insert sizes (2kb-10kb)
• Generates one pass on each molecule sequenced

Circular Consensus: CCS – high quality sequencing reads
• Small insert sizes 500bp
• Generates multiple passes on each molecule sequenced
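Why multiple passes over the same molecule raise read quality can be illustrated with a toy majority-vote model. This is a sketch under strong assumptions: each pass is treated as independently wrong at a given position with the same probability, and the consensus is wrong only when more than half of the passes are wrong. It is not PacBio's consensus algorithm, and the function name is hypothetical. The 15% per-pass error used in the example is the raw error rate quoted for C2 chemistry.

```python
from math import comb


def majority_error(per_pass_error, passes):
    """Probability that more than half of `passes` independent passes are wrong."""
    if passes < 1 or passes % 2 == 0:
        raise ValueError("use a positive odd number of passes to avoid ties")
    needed = passes // 2 + 1
    return sum(
        comb(passes, k) * per_pass_error ** k * (1 - per_pass_error) ** (passes - k)
        for k in range(needed, passes + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 5, 7):
        print(n, majority_error(0.15, n))
```

In this model a single pass keeps the 15% error, three passes reduce it to about 6%, and further passes reduce it again, which is the intuition behind pairing small inserts with circular consensus reads.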

Page 13:

Example Data: 1 SMRT Cell

Metric               Pre-Filter        Post-Filter
# of Bases           180,320,136 bp    165,424,592 bp
# of Reads           75153             52801
Mean Readlength      2399 bp           3133 bp
Mean Read Quality    0.624             0.827

% Adapter Dimer (0-10bp)     1.94 %
% Short Insert (11-100bp)    0.47 %
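The pre-filter and post-filter figures above can be cross-checked with a few lines of arithmetic: mean read length is bases divided by reads, and the effect of filtering is the fraction of reads and bases retained. The sketch below uses only the numbers from this slide; the dictionary layout and function names are hypothetical.

```python
pre_filter = {"bases": 180_320_136, "reads": 75_153}
post_filter = {"bases": 165_424_592, "reads": 52_801}


def mean_read_length(stats):
    """Mean read length in bp = total bases / total reads."""
    return stats["bases"] / stats["reads"]


def fraction_retained(before, after, key):
    """Share of reads or bases that survive filtering."""
    return after[key] / before[key]


if __name__ == "__main__":
    print(round(mean_read_length(pre_filter)), round(mean_read_length(post_filter)))
    print(fraction_retained(pre_filter, post_filter, "reads"))
    print(fraction_retained(pre_filter, post_filter, "bases"))
```

The computed means round to the reported 2399 bp and 3133 bp. Filtering keeps about 70% of reads but about 92% of bases, so the reads removed are mostly short ones.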

Page 14:

Personal Genome Machine – Ion Torrent (Life Technologies)

Three types of semiconductor chips:
- 314 – 20Mb
- 316 – 200Mb
- 318 – 1Gb

Read length depends on base composition: 200-250bp (200 cycles).
System is enabled for Paired End 2x100 cycles.
The fastest sequencing system on the market.

Recommendation: resequencing applications which require fast turnaround of samples
- Amplicons (PCR products)
- Small and medium size genomes
- Custom DNA capture applications

How it works:

H+ ion is released during base incorporation. Individual polymerases attached to beads are positioned in tiny wells that rest on a tiny pH meter.
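Because each nucleotide is flowed over the wells in turn and the pH signal scales with the number of bases incorporated in that flow, the number of flows needed to read a sequence depends on its base composition. The sketch below simulates an ideal, noise-free flowgram. The repeating "TACG" flow order and the function name are assumptions for illustration only, not a description of the instrument's actual settings.

```python
def flowgram(read_sequence, flow_order="TACG"):
    """Return [(flowed_base, bases_incorporated), ...] for an ideal run.

    A homopolymer run is incorporated in a single flow, which is why its
    signal is proportionally larger rather than spread over several flows.
    """
    unknown = set(read_sequence) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")
    signals = []
    position = 0
    flow_index = 0
    while position < len(read_sequence):
        base = flow_order[flow_index % len(flow_order)]
        run_length = 0
        while position < len(read_sequence) and read_sequence[position] == base:
            run_length += 1
            position += 1
        signals.append((base, run_length))
        flow_index += 1
    return signals


if __name__ == "__main__":
    print(flowgram("TTACG"))      # five bases read in four flows
    print(len(flowgram("GCAT")))  # four bases need many more flows
```

Two sequences of similar length can therefore consume very different numbers of flows, which is one way to read the slide's note that read length depends on base composition.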


Page 15:

PGM/Ion Torrent Data, 316 chip

Total Number of Bases [Mbp]    77.65
Number of Q17 Bases [Mbp]      36.11
Number of Q20 Bases [Mbp]      27.33

Total Number of Reads          368,860
Mean Length [bp]               211
Longest Read [bp]              380
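The Q17 and Q20 thresholds in this report are Phred-scaled quality values, where Q = -10 log10(p) and p is the estimated probability that a base call is wrong. The sketch below converts between the two scales and computes what share of the run's bases reach each threshold. The function names are hypothetical; the base counts are taken from the slide.

```python
import math


def phred_to_error(q):
    """Error probability for a Phred quality value."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Phred quality value for an error probability."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    total_mbp, q17_mbp, q20_mbp = 77.65, 36.11, 27.33
    print(phred_to_error(17), phred_to_error(20))
    print(q17_mbp / total_mbp, q20_mbp / total_mbp)
```

Q20 corresponds to a 1% error probability and Q17 to about 2%. In this run, roughly 47% of bases reach Q17 and roughly 35% reach Q20. The same conversion puts the percentage error rates quoted for other platforms in this deck on the Phred scale.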

Page 16:

Library Preparation from Low Quantities of DNA or RNA

Mondrian SP System – NuGEN Technologies

- Human libraries from 5ng of total DNA. Only 10-15% of duplicate reads.

- Ultralow DNA library systems

Soon:
- Ultralow RNA library systems
- Libraries from total RNA with rRNA depletion.

Advanced Liquid Logic from RTP

Microfluidics stationary and portable systems

Page 17:

Emerging Sequencing Technologies

Semiconductor sequencing chip

Nanopore / Nanochannel sequencing

Page 18:

Ion Proton System

- Human genome in one day
- Cost of reagents $1000 per run
- Error rate around 1.2%
- Human Genome, RNAseq, ChIPseq

Ion Proton Chip I – 10Gb (whole exome capture experiments)

Ion Proton Chip II – 100Gb (whole human genome resequencing)

Page 19:

Oxford Nanopore – new view on sequencing

Hemolysin – pore - inner diameter of 1nm, about 100,000 times smaller than that of a human hair.

Page 20:

Oxford Nanopore

DNA sequencing

Error rate 4%; prediction for end of the year 0.1 – 2%.

Page 21:

Nanopore array

Page 22:

Oxford Nanopore – new concepts

MinION

- 150Mb per run
- Tested 48kb read length
- $900 per instrument
- 500 pores per device

GridION

- XXXMb per run
- Tested 48kb read length
- $XXX per instrument
- 2000 pores per device, soon 8000 pores
- Cost per human genome $1500.

Page 23:

Oxford Nanopore – applications

- DNA sequencing
- Protein detection
- Protein DNA interaction
- Small molecule detection

- 96 well plates for 96 samples

- Controlled time of sequencing

Page 24:

Intelligent BioSystems Mini20 System (manufactured by Azco Biotech)

• Amplification by rolony method
• Sequencing by Synthesis with announced 100 base reads, but expect to compete with Sanger down the road
• Designed for clinical labs
• 20 independent flow cells, no queue for loading, run asynchronously
• 20M reads/flow cell, 4 GB/flow cell
• Potential problems with repeats
• System cost $120K, $150 flow cell (disposable), full costs per sample not clear yet.
• Entering early access now, expect commercial shipping late 2012

Page 25:

Genia Technologies

• Very early stage announcement – Backed by Life Technologies (at least 1 year away)

• Describe system as a cross between Ion Torrent and Oxford Nanopore

• Electronic “Active Control” technology enables highly efficient nanopore-membrane assembly and control of DNA movement through the channel

• Initially used α-Hemolysin and claimed 98% raw accuracy with that but now are using an undisclosed pore for further development.

• Claim sensitivity 1-2 orders of magnitude greater than Oxford Nanopore.

• Ramping up pore density to 100K pores/chip by end of 2012.

• Plan to market a mobile reader for <$1K and per sample costs <$100

• Plan early access in late 2012, commercial shipment 2013

Page 26:

“caveat emptor!”

