Next Generation Sequencing
Itai Sharon
November 11th, 2009
Introduction to Bioinformatics
Sequencing the Human Genome
[Figure: log10(price) vs. year, 2000 to 2010]
• 2001: Human Genome Project, 2.7G$, 11 years
• 2001: Celera, 100M$, 3 years
• 2007: 454, 1M$, 3 months
• 2008: ABI SOLiD, 60K$, 2 weeks
• 2009: Illumina, Helicos, 40-50K$
• 2010: 5K$, a few days?
• 2012: 100$, <24 hrs?
In this Talk:
• Sequencing 1.0: Sanger
• Assembly
• Next generation sequencing (NGS)
• NGS applications
• Future directions
Genome Sequencing
• Goal: determine the order of nucleotides across a genome
• Problem: current DNA sequencing methods can handle only short stretches of DNA at once (<1-2 Kbp)
• Solution: sequence the short pieces, then use computers to assemble them
Genome Sequencing
[Figure: the genome is cut into short fragments of DNA, the fragments are read as short DNA sequences, and the short sequences are assembled into the sequenced genome]
Short DNA sequences: ACGTGGTAA CGTATACAC TAGGCCATA GTAATGGCG CACCCTTAG TGGCGTATA CATA…
Assembled sequence: ACGTGGTAATGGCGTATACACCCTTAGGCCATA
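The assembly step can be illustrated with a greedy overlap merge: repeatedly join the two pieces that share the longest suffix/prefix overlap. This is only a minimal sketch of one simple strategy, not a description of how any particular assembler works. It uses the six complete reads from the Genome Sequencing figure (the truncated read `CATA…` is left out); the function names and the minimum overlap of 3 bases are assumptions made for illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest such overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no pair overlaps by min_len."""
    contigs = list(reads)
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


# The six complete reads shown in the figure.
reads = ["ACGTGGTAA", "CGTATACAC", "TAGGCCATA",
         "GTAATGGCG", "CACCCTTAG", "TGGCGTATA"]
print(greedy_assemble(reads))
```

On these reads the merge ends with a single contig equal to the assembled sequence shown in the figure. On real genomes, repeats make greedy merging unreliable, which is one reason mate pairs are used.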
Sanger Sequencing
• Advantages: long reads (~900 bp), suitable for small projects
• Disadvantages: low throughput, expensive
Assembly
Cut DNA into larger pieces (2 Kbp, 15 Kbp) and sequence both ends of each piece (Fleischmann et al., 1994)
[Figure: 2 Kbp and 15 Kbp mates linking contig 1 and contig 2; each end read is ~500 bp, and the unsequenced middle is ~(length - 1,000) bp]
• Resolving repeats
• Better assembly of contigs, gap length estimation
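Gap length estimation from mates is simple arithmetic: when the two ends of a piece of known length land in different contigs, the part of the piece not accounted for inside the contigs must lie in the gap between them. The sketch below assumes a hypothetical `MatePair` record with both mates already placed on the forward strand; every name and number in it is made up for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class MatePair:
    insert_len: int        # length of the whole piece, e.g. 2,000 or 15,000 bp
    start_in_contig1: int  # 0-based position where the left mate starts in contig 1
    end_in_contig2: int    # position just past the end of the right mate in contig 2


def estimate_gap(contig1_len, pairs):
    """Average gap estimate between contig 1 and contig 2.

    Each piece spans: (tail of contig 1) + gap + (head of contig 2),
    so gap = insert_len - tail - head.
    """
    estimates = [
        p.insert_len - (contig1_len - p.start_in_contig1) - p.end_in_contig2
        for p in pairs
    ]
    return mean(estimates)


# Two hypothetical 2 Kbp mates spanning the same gap.
pairs = [MatePair(2000, 9000, 600), MatePair(2000, 9400, 1000)]
print(estimate_gap(10_000, pairs))
```

Averaging over several mates matters in practice because the piece lengths are only approximately known.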
Assembly: How Much DNA?
                 Input                       Output
Low coverage:    a few pieces to assemble    many contigs, many gaps
High coverage:   many pieces to assemble     a few contigs, a few gaps
(Lander and Waterman, 1988)
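The trade-off between coverage and the number of contigs can be put in numbers with the Lander and Waterman (1988) model. With N reads of length L from a genome of length G, coverage is c = NL/G, the expected fraction of the genome left uncovered is e^(-c), and the expected number of contigs is N·e^(-cσ), where σ = 1 - T/L and T is the minimum overlap needed to join two reads. The sketch below is an illustration under those idealized assumptions (uniformly random reads, no repeats); the genome size, read length, and overlap values are example inputs, not figures from the talk.

```python
import math


def lander_waterman(genome_len, read_len, n_reads, min_overlap=0):
    """Expected coverage statistics under the Lander-Waterman model."""
    c = n_reads * read_len / genome_len      # coverage (fold redundancy)
    sigma = 1 - min_overlap / read_len       # usable fraction of each read
    return {
        "coverage": c,
        "fraction_uncovered": math.exp(-c),
        "expected_contigs": n_reads * math.exp(-c * sigma),
    }


# Example: a 1.8 Mbp genome, 500 bp reads, 25 bp minimum overlap (all assumed).
G, L, T = 1_800_000, 500, 25
for c in (1, 2, 5, 10):
    n = int(c * G / L)
    stats = lander_waterman(G, L, n, T)
    print(f"{c:>2}x: {n:>6} reads, "
          f"{stats['expected_contigs']:.1f} expected contigs, "
          f"{stats['fraction_uncovered']:.2%} uncovered")
```

The model also shows why more reads do not always mean more work for less gain: past a coverage of roughly 1/σ, each extra fold of coverage reduces the expected number of contigs exponentially.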
Sanger Sequencing
[Timeline: 1980, 1990, 2000]
• 1982: lambda virus, DNA stretches up to 30-40 Kbp (Sanger et al.)
• 1994: H. influenzae, 1.8 Mbp (Fleischmann et al.)
• 2001: H. sapiens, D. melanogaster, 3 Gbp (Venter et al.)
• 2007: Global Ocean Sampling Expedition, ~3,000 organisms, 7 Gbp (Venter et al.)
Next Generation Sequencing: Why Now?
High Parallelism is Achieved in Polony Sequencing
[Figure: polony sequencing vs. Sanger sequencing]
Generation of Polony array: DNA Beads (454, SOLiD)
DNA Beads are generated using Emulsion PCR
DNA Beads are placed in wells
Generation of Polony array: Bridge-PCR (Solexa)
DNA fragments are attached to array and used as PCR templates
Sequencing: Pyrosequencing (454)
Complementary strand elongation: DNA Polymerase
Sequencing: Fluorescently Labeled Nucleotides (Solexa)
Complementary strand elongation: DNA Polymerase
Sequencing: Fluorescently Labeled Nucleotides (ABI SOLiD)
Complementary strand elongation: DNA Ligase
5 reading frames, each position is read twice
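One way to see why each position is read twice: in two-base (color-space) encoding, each color describes a pair of adjacent bases, so every base contributes to two consecutive colors. The slide does not spell out the encoding, so the sketch below assumes the commonly described SOLiD scheme, in which the color is the XOR of 2-bit base codes (A=0, C=1, G=2, T=3). The function names are illustrative.

```python
BASES = "ACGT"  # index of each base is its 2-bit code: A=0, C=1, G=2, T=3


def encode(seq):
    """Return the first base followed by one color (0-3) per adjacent base pair."""
    colors = [str(BASES.index(a) ^ BASES.index(b))
              for a, b in zip(seq, seq[1:])]
    return seq[0] + "".join(colors)


def decode(color_seq):
    """Rebuild the bases from the known first base and the color string."""
    bases = [color_seq[0]]
    for color in color_seq[1:]:
        bases.append(BASES[BASES.index(bases[-1]) ^ int(color)])
    return "".join(bases)


encoded = encode("ATGGC")
print(encoded, decode(encoded))
```

Because decoding chains from one base to the next, a single wrong color changes every base after it, whereas a true single-base variant changes two adjacent colors. That difference is what lets color-space reads separate sequencing errors from real variants.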
Single Molecule Sequencing: HeliScope
Technology Summary
Platform   Read length   Sequencing technology   Throughput (per run)   Cost (per 1 Mbp)*
Sanger     ~800 bp       Sanger                  400 Kbp                500$
454        ~400 bp       Polony                  500 Mbp                60$
Solexa     75 bp         Polony                  20 Gbp                 2$
SOLiD      75 bp         Polony                  60 Gbp                 2$
Helicos    30-35 bp      Single molecule         25 Gbp                 1$
*Source: Shendure & Ji, Nat Biotech, 2008
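The table's figures can be turned into two quick comparisons: how many reads a single run produces (throughput divided by read length), and the raw cost of sequencing a 3 Gbp genome at a given coverage. This is back-of-the-envelope arithmetic on the numbers above, not a cost model; the 32.5 bp Helicos read length is an assumed midpoint of the 30-35 bp range, and the function names are illustrative.

```python
PLATFORMS = {
    # name: (read length in bp, throughput per run in bp, cost per Mbp in $)
    "Sanger":  (800, 400e3, 500),
    "454":     (400, 500e6, 60),
    "Solexa":  (75, 20e9, 2),
    "SOLiD":   (75, 60e9, 2),
    "Helicos": (32.5, 25e9, 1),  # assumption: midpoint of 30-35 bp
}

GENOME_MBP = 3000  # human genome, ~3 Gbp


def reads_per_run(name):
    """Approximate number of reads produced in one run."""
    read_len, throughput, _ = PLATFORMS[name]
    return throughput / read_len


def cost_for_coverage(name, coverage=1.0, genome_mbp=GENOME_MBP):
    """Raw sequencing cost in $ for the given fold coverage of the genome."""
    cost_per_mbp = PLATFORMS[name][2]
    return cost_per_mbp * genome_mbp * coverage


for name in PLATFORMS:
    print(f"{name:8s} {reads_per_run(name):>14,.0f} reads/run  "
          f"{cost_for_coverage(name):>12,.0f} $ per 1x human genome")
```

The reads-per-run column makes the parallelism gap visible: hundreds of reads for a Sanger run against hundreds of millions for the polony platforms.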
What, When and Why
• Sanger: small projects (less than 1 Mbp)
• 454: de novo sequencing, metagenomics
• Solexa, SOLiD, HeliScope:
  – Gene expression, protein-DNA interactions
  – Resequencing
Applications
Where Do We Go from Here?
• Higher throughput, longer reads (Pacific Biosciences)
• Computational bottleneck
• Shift to sequencing-based technologies
• Will it help to cure cancer?