Yuki Juan, 2003.5.5
Automatic DNA and Genome Sequencing
Genetic Mapping
Automated DNA Sequencing
Principle of Sanger Sequencing
High-Throughput Sequencing
Reading Sequence Traces
Contig Assembly
Emerging Sequencing Methods
http://www.mun.ca/biology/scarr/4241chaptertwo/Biology4241chaptertwo/Chapter2GenomeSequencingandAnnotation.htm#sanger
The First Cycle in PCR
The Second Cycle in PCR
The Third Cycle in PCR
Basic chain termination method developed in 1974 by Frederick Sanger (published in 1977)
The Principle of Dideoxy (Sanger) Sequencing
Strategy of the Chain-termination Method for Sequencing DNA
Fluorescence Detection
High-Throughput Sequencing
The new techniques and equipment include:
Four-color fluorescent dyes have replaced the radioactive label.
Automatic trace reading
Improvement in the chemistry of template purification and the sequencing reaction.
Capillary electrophoresis
Automated Sequencing Method
ABI PRISM® 3700 DNA Sequencer
Price: $65,50
A fully automated, multi-capillary electrophoresis instrument designed to automatically analyze multiple runs of 96 samples
MegaBACE 1000 DNA Sequencer
An automated machine capable of high-throughput DNA analysis, processing 96 samples in just a few hours.
Applications:
* DNA sequencing
* Genotyping
* Fragment analysis
6 arrays of 16 capillaries with an interior diameter of about 100 µm.
The system uses high-pressure nitrogen gas to inject the capillaries with linear polyacrylamide, a denaturing gel.
Base-calling
Using automated software
Phred program developed at the University of Washington.
Reading Sequence Traces
Phred program: uses algorithms to convert trace files into base sequences and assigns a quality value to each base call in the sequence
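The quality values that Phred assigns are usually expressed on a logarithmic scale, Q = -10 log10(P), where P is the estimated probability that the base call is wrong. The sketch below only illustrates that conversion; the function names are hypothetical and this is not the Phred base-calling algorithm itself.

```python
import math


def phred_quality(p_error: float) -> float:
    """Quality value Q for an estimated base-call error probability."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(p_error)


def error_probability(quality: float) -> float:
    """Inverse conversion: error probability implied by a quality value."""
    return 10.0 ** (-quality / 10.0)


# Illustrative values only: each factor of 10 in accuracy adds 10 to Q.
for p in (0.1, 0.01, 0.001):
    print(f"P(error) = {p} -> Q = {phred_quality(p):.0f}")
```

On this scale a base with Q = 20 has an estimated 1 in 100 chance of being wrong, and Q = 30 corresponds to 1 in 1000.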
The Phred Base-calling Algorithm
SNP: Single nucleotide polymorphism
Automated Sequence Chromatograms
Phred Quality Value Distributions
Dark blue: Bases 100-400 in each sequence
Light blue: All bases
The predicted error rate increases for longer fragments
Contig: A contiguous (touching; adjoining) stretch of cloned DNA
Contig assembly is the finishing step in sequencing a multi-clone stretch of DNA, and involves alignment, editing, and error correction.
Sequence editing software (from the University of Washington):
* phrap assembler
* consed graphic editor
Contig Assembly
Phrap assembler
http://www.mrc-lmb.cam.ac.uk/pubseq/manual/gap4_unix_90.html
An aligned reads window in Consed
Alignment algorithms
The Needleman-Wunsch method (1970) was the first computationally feasible algorithm for sequence alignment.
Alignments based on these algorithms may vary due to differences in the weighting of their default parameters.
* Weighting of the effects of indels relative to single base mismatches
* Weighting attached to quality scores of bases from contributing sequences
* Weighting attached to frequency of mismatches
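To make the effect of these weightings concrete, here is a minimal sketch of Needleman-Wunsch global alignment scoring with a simple linear gap penalty. The scoring values (match, mismatch, gap) are illustrative assumptions, not the defaults of phrap or any other assembler, and quality scores are not modeled.

```python
def needleman_wunsch_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -2) -> int:
    """Best global alignment score of a and b by dynamic programming."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against nothing costs one gap per base.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    return score[-1][-1]


# Same pair of sequences, two different indel weightings.
print(needleman_wunsch_score("ACGT", "ACT", gap=-2))
print(needleman_wunsch_score("ACGT", "ACT", gap=-1))
```

Changing only the gap penalty changes the score of the same pair of reads, which is why two programs built on the same algorithm can report different alignments.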
Sequencing by Hybridization (SBH)
Mass Spectrometric Sequencing
Direct Visualization of Single DNA Molecules by Atomic Force Microscopy (AFM)
Single Molecule Sequencing Techniques
Single Nucleotide Cutting
Emerging Sequencing Methods
Uses the complementarity of the two strands of DNA molecules to determine whether a match to an oligonucleotide is present in the DNA.
Possible for short sequences
Sequencing by Hybridization (SBH)
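The complementarity test behind SBH can be sketched in a few lines: a probe hybridizes to a single-stranded target if the probe's reverse complement occurs in the target. This is a toy model that assumes perfect base pairing and ignores hybridization chemistry; all function names and the example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizes(probe: str, target: str) -> bool:
    """True if the probe can pair perfectly with some stretch of the target."""
    return reverse_complement(probe) in target


def spectrum(target: str, k: int) -> set:
    """All distinct k-mers in the target, i.e. what a full k-mer probe set could detect."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


target = "TTATGCAA"  # made-up example sequence
for probe in ("GCAT", "AAAA"):
    print(probe, hybridizes(probe, target))
print(sorted(spectrum("ACGTA", 3)))
```

Reconstructing the sequence from the set of matching oligonucleotides is the hard part and is only practical for short sequences, as noted above.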
Fragmented oligonucleotides can be identified by time of flight through a vacuum chamber
Useful for fragmented DNA molecules under 50 bases long
Likely possible to determine full sequence of molecule divided into all possible oligonucleotides
Methods fast, and should become cheap
Mass Spectrometric Sequencing
Can observe bumps in ssDNA, but not resolve bases
Possibly hybridize molecule to oligonucleotides with bulky modified side groups
Direct Visualization of Single DNA Molecules by AFM
Extremely fast and relatively cheap
Can accommodate long DNA fragments
Nanopore sequencing
Single Molecule Sequencing Technique
Single-molecule Nanopore Sequencing
Protein pore channel in electrically polarized membrane
Single DNA molecule pulled through by electrophoresis
Nucleotides transiently block ion movement, resulting in drop in current resolutio
If slowed to about 1 base per millisecond, could sequence 1 kb per second, three orders of magnitude faster than capillary sequencers
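The throughput figure is simple unit arithmetic, sketched here. The capillary rate is not a measured value; it is only what the "three orders of magnitude" comparison implies, and the function name is hypothetical.

```python
def bases_per_second(bases_per_millisecond: float) -> float:
    """Convert a translocation rate from bases/ms to bases/s."""
    return bases_per_millisecond * 1000.0


nanopore_rate = bases_per_second(1.0)  # 1 base per millisecond
# Assumption: "three orders of magnitude faster" means a factor of 10**3.
implied_capillary_rate = nanopore_rate / 10 ** 3

print(f"nanopore: {nanopore_rate:.0f} bases/s (1 kb per second)")
print(f"implied capillary rate: {implied_capillary_rate:.0f} base/s")
```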
Nanopore Sequencing
Can suspend long strand of DNA in a vacuum by molecular tweezers
Exonuclease molecule cuts off single nucleotides to be read by fluorescent signal or imprinting on grid
Single Nucleotide Cutting
Hierarchical Sequencing
Shotgun Sequencing
Sequence Verification
Genome Sequencing
Hierarchical versus Shotgun Sequencing
Both processes involve fragmenting the genome and aligning fragments based on overlapping sequences.
Both aim for 5-10x redundancy in sequence representation.
The main difference is that hierarchical sequencing attempts to align large cloned fragments (~100 kb) into a tiling path.
Shotgun sequencing omits this step: the entire genome is fragmented into small pieces, which are then aligned using computer algorithms.
Hierarchical sequencing was the basis of the publicly funded Human Genome Project.
Shotgun sequencing was the basis of the privately funded Human Genome Project.
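Redundancy (coverage) is the total number of sequenced bases divided by the genome length. The sketch below computes it, along with the number of reads needed for a target redundancy. The uncovered-fraction estimate assumes reads land uniformly at random (a Poisson model), and the read length and genome size in the example are made-up illustration values.

```python
import math


def coverage(n_reads: int, read_len: int, genome_len: int) -> float:
    """Average redundancy: total bases sequenced / genome length."""
    return n_reads * read_len / genome_len


def reads_needed(target_coverage: float, read_len: int, genome_len: int) -> int:
    """Smallest number of reads that reaches the target redundancy."""
    return math.ceil(target_coverage * genome_len / read_len)


def expected_uncovered_fraction(cov: float) -> float:
    """Fraction of bases expected to be missed, assuming random read placement."""
    return math.exp(-cov)


genome_len, read_len = 100_000, 500  # hypothetical clone and read sizes
for target in (5, 10):
    n = reads_needed(target, read_len, genome_len)
    missed = expected_uncovered_fraction(target)
    print(f"{target}x: {n} reads, ~{missed:.5f} of bases uncovered")
```

Under this model, going from 5x to 10x shrinks the expected unsequenced fraction by more than two orders of magnitude, which is one motivation for the 5-10x target.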
Also known as top-down, map-based, or clone-by-clone sequencing
Steps involved:
Shear DNA into manageable units (50-200 kb)
* This is accomplished by sonication
* Amplification (PCR)
* Clone into vector of choice (usually BACs)
Create DNA library
* Aim for 5-10x redundancy
Selection of a tiling path
Hierarchical Sequencing
Cloning Vectors Used in Genome Sequencing
Hierarchical Assembly of a Sequence-contig Scaffold
The Tiling Path
Can be assembled using a combination of three methods:
* Hybridization
* Fingerprinting
* End-sequencing
Create probes for specific sequences
Often uses robots to replicate plate clones that show probe hybridization
The genome can be probed for many different sequences, leading to islands of overlapping clones that will be joined later in the process.
Chromosome walking: use the end sequence of a clone to create a probe for an adjacent clone.
Hybridization
Use restriction digest profile to determine sequence overlap
Done by complex computer algorithms
Fingerprinting
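The idea behind fingerprint comparison is that two clones sharing many restriction fragments of the same size probably overlap. The sketch below is a simple greedy band-matching heuristic, not the algorithm used by any particular fingerprinting program; the size tolerance, the minimum number of shared bands, and the fragment sizes are all illustrative assumptions.

```python
def shared_bands(frags_a, frags_b, tolerance=0.02):
    """Count fragments in A with a size-matched, not yet used partner in B."""
    remaining = sorted(frags_b)
    shared = 0
    for size in sorted(frags_a):
        for i, other in enumerate(remaining):
            # Gel sizing is imprecise, so allow a relative size tolerance.
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                del remaining[i]  # each band in B may be matched only once
                break
    return shared


def likely_overlap(frags_a, frags_b, min_shared=3, tolerance=0.02):
    """Call two clones overlapping if they share enough bands."""
    return shared_bands(frags_a, frags_b, tolerance) >= min_shared


clone_a = [1000, 2500, 4000, 7000]  # made-up fragment sizes in bp
clone_b = [1010, 2500, 3000, 7100]
print(shared_bands(clone_a, clone_b), likely_overlap(clone_a, clone_b))
```

A production tool would also estimate how likely that many shared bands are by chance, which this sketch does not attempt.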
Alignment of BAC Clones by Hybridization and Fingerprinting
End-sequencing
Sequence the ends of BAC clones
Create a probe for that end sequence, and hope that it hybridizes near the middle of another clone
3 steps:
Filtering
* Removal of contaminating fragments
* They may be bacterial in origin, or clones that show evidence of recombination.
Assembling the layout
* Generating and ordering each BAC contig
* Position of each contig can be confirmed by alignment with previously characterized Sequence Tagged Sites (STS)
Merging
* Aligning BAC contigs that are known to be adjacent to each other
Assembly of The Draft Genome
Computer algorithms are used to assemble contigs from thousands of overlapping sequences
Shotgun Sequencing
Screener
Overlapper
Unitigger
Scaffolder
Tasks performed by Computational Algorithms
Masks (marks and hides) sequences that contain repetitive DNA, e.g. microsatellites, Alu repeats, ribosomal DNA
These sequences are not taken into consideration when determining overlap
Screener
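As a rough illustration of masking, the sketch below hides microsatellite-like tandem repeats by replacing them with N. It only handles short repeat units found with a regular expression; a real screener would also compare reads against a library of known repeats such as Alu elements, which is not shown. The thresholds and names are assumptions.

```python
import re

# A unit of 1-6 bases followed by at least 4 more copies of itself
# (5 or more copies in total). Threshold chosen for illustration only.
TANDEM_REPEAT = re.compile(r"([ACGT]{1,6}?)\1{4,}")


def mask_simple_repeats(seq: str) -> str:
    """Replace each tandem-repeat run with N's of the same length."""
    return TANDEM_REPEAT.sub(lambda m: "N" * len(m.group(0)), seq)


read = "ACGT" + "CA" * 6 + "GGTC"  # made-up read with a (CA)6 microsatellite
print(mask_simple_repeats(read))
```

Keeping the masked read the same length as the original preserves base coordinates, so the masked region can simply be skipped when overlaps are computed.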
Compares every unscreened read against every other unscreened read
Is essentially the same as performing a BLAST search
Searches for overlap of a predetermined length (40 bp for Human Genome Project)
Overlapper
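The all-against-all comparison can be sketched as a search for suffix-prefix overlaps of at least a minimum length. This version requires exact matches and is quadratic in the number of reads, so it is only a toy; a real overlapper tolerates sequencing errors and uses indexing to avoid comparing every pair base by base. The default of 40 bp follows the figure quoted above; everything else is a hypothetical illustration.

```python
from itertools import permutations


def suffix_prefix_overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a equal to a prefix of b, or 0."""
    min_len = max(min_len, 1)
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-length:] == b[:length]:
            return length
    return 0


def find_overlaps(reads, min_len=40):
    """Compare every read against every other read (ordered pairs)."""
    overlaps = {}
    for (i, a), (j, b) in permutations(enumerate(reads), 2):
        n = suffix_prefix_overlap(a, b, min_len)
        if n:
            overlaps[(i, j)] = n
    return overlaps


# Short made-up reads, so a small minimum overlap is used here.
reads = ["AAACCCGGG", "CCGGGTTTA", "TTTAACG"]
print(find_overlaps(reads, min_len=4))
```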
Blast Output
A local Alignment
Unitigger
Unitig: a contig formed from a series of overlapping, unambiguously unique sequences
U-unitigs and Repeat Resolution
Scaffolder
Uses mate-pair information to link U-unitigs into scaffold contigs
Most of the remaining gaps at this point are due to repeat elements, and can be resolved by the following method:
Unitigs that were not classified as U-unitigs are placed in the gaps.
* These are often referred to as overcollapsed unitigs
* If a placement is supported by two or more mate-pairs, it is referred to as a ROCK.
* If a placement is supported by one mate-pair, it is referred to as a STONE.
Small gaps can be filled in by chromosome walking
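The rock and stone rule reduces to a threshold on the number of supporting mate-pairs. A minimal sketch follows; the function name and the "unplaced" label for the zero-support case are hypothetical and do not come from the source.

```python
def classify_gap_fill(supporting_mate_pairs: int) -> str:
    """Label a gap-filling unitig by how many mate-pairs support its placement."""
    if supporting_mate_pairs < 0:
        raise ValueError("mate-pair count cannot be negative")
    if supporting_mate_pairs >= 2:
        return "ROCK"
    if supporting_mate_pairs == 1:
        return "STONE"
    return "unplaced"  # hypothetical label: no mate-pair evidence


for n in (0, 1, 2, 5):
    print(n, classify_gap_fill(n))
```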
Assembly of a Mapped Scaffold
Proportion of Fly and Human Genomes in Large Scaffolds
Completeness
Accuracy
Validity of assembly
Sequence Verification
Alignment of Two Draft Human Genome Assemblies