Marker types
Potato Association of America
Fredericton, August 9, 2009
Allen Van Deynze
Use of DNA Markers in Breeding
Germplasm Analysis
• Fingerprinting of germplasm
• Arrangement of diversity (clustering, PCA, etc.)
Parental Selection
• Marker-based parent similarity
• Marker-based estimated variance within a population
• Genetic distance between parents
Quality Assurance
• Parent-offspring tests, genetic purity tests, event tests
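A minimal sketch of marker-based genetic distance between candidate parents. It assumes markers are scored 1 (present) or 0 (absent) and uses the simple matching distance, which is only one of several possible measures. The line names and scores are made up for illustration.

```python
from itertools import combinations


def simple_matching_distance(a, b):
    """Fraction of markers at which two lines differ (0 = identical, 1 = all differ)."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same markers")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def most_distant_pair(profiles):
    """Return the pair of line names with the largest marker distance."""
    return max(
        combinations(sorted(profiles), 2),
        key=lambda pair: simple_matching_distance(profiles[pair[0]], profiles[pair[1]]),
    )


# Hypothetical fingerprints: 8 markers scored as present (1) or absent (0).
PROFILES = {
    "line_A": [1, 0, 1, 1, 0, 1, 0, 0],
    "line_B": [1, 0, 1, 0, 0, 1, 1, 0],
    "line_C": [0, 1, 0, 0, 1, 0, 1, 1],
}

for first, second in combinations(sorted(PROFILES), 2):
    d = simple_matching_distance(PROFILES[first], PROFILES[second])
    print(f"{first} vs {second}: distance = {d:.2f}")
```

The same pairwise distances could feed a clustering or PCA step for arranging germplasm diversity.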
Breeding
• Alternative or support to selection for traits
• Increase rate of genetic gain:
  • Selection during off-season cycles
  • Selection of hybrid traits on inbred individuals
  • Early selection (e.g. pre-flowering)
Trait Analysis
• Association of traits with genomic regions
• Understanding trait relationships (linkage vs. pleiotropy)
• Understanding causes of variation (aid in gene cloning)
Marker Assisted Breeding
• Marker assisted backcrossing
• Marker assisted selection
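For context on marker assisted backcrossing, the sketch below computes the expected share of recurrent-parent genome after each backcross when no marker selection is applied. It assumes the standard expectation 1 - (1/2)^(n+1), averaged over progeny and ignoring linkage drag around the target gene. The function name is illustrative.

```python
def expected_recurrent_fraction(n_backcrosses):
    """Expected recurrent-parent genome fraction after n backcrosses (0 = F1)."""
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses cannot be negative")
    return 1 - 0.5 ** (n_backcrosses + 1)


for n in range(0, 5):
    label = "F1" if n == 0 else f"BC{n}"
    print(f"{label}: {expected_recurrent_fraction(n):.4f}")
```

Markers let the breeder pick the individuals in each generation that sit above this average, which is how marker assisted backcrossing can shorten the programme.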
Fruit ripening
DNA marker
The # of Markers Needed Depends on Goals
• Protect varieties: 100s of markers
• Classify germplasm: 100s mapped
• ID tightly linked QTLs in linkage studies: 100s mapped
• ID candidate genes and association studies: saturated map
• Depends on number of chromosomes
• Depends on size of genetic map (cM)
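A rough sketch of how chromosome number and map size drive marker count. It assumes evenly spaced markers with one marker at each chromosome end. The 12 chromosomes match potato's basic chromosome number, but the 80 cM length per chromosome is a made-up value for illustration.

```python
import math


def markers_needed(chromosome_lengths_cm, spacing_cm):
    """Markers required to cover every chromosome at a given average spacing."""
    if spacing_cm <= 0:
        raise ValueError("spacing must be positive")
    # Each chromosome needs ceil(length / spacing) intervals, hence one more marker.
    return sum(math.ceil(length / spacing_cm) + 1 for length in chromosome_lengths_cm)


# Hypothetical map: 12 chromosomes of 80 cM each (960 cM in total).
HYPOTHETICAL_MAP = [80] * 12

for spacing in (20, 10, 5, 1):
    print(f"{spacing:>2} cM spacing: {markers_needed(HYPOTHETICAL_MAP, spacing)} markers")
```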
DNA ⇒ RNA ⇒ Protein ⇒ Trait
Image courtesy of the National Human Genome Research Institute
The “Central Dogma” of molecular biology is that the information in the DNA sequence is transcribed into mRNA, which is then translated into proteins.
Proteins are the large molecules that act as enzymes and structural components of living cells; together they produce the trait.
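The DNA ⇒ RNA ⇒ Protein flow can be sketched in a few lines. This assumes the standard genetic code and a coding-strand DNA sequence that starts in frame. The example sequence is made up.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_dna):
    """DNA coding strand to mRNA: same sequence with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read mRNA codon by codon until a stop codon or the end of the sequence."""
    dna_like = mrna.upper().replace("U", "T")
    protein = []
    for start in range(0, len(dna_like) - 2, 3):
        residue = CODON_TABLE[dna_like[start:start + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


example_dna = "ATGGCCTTTTAA"  # hypothetical four-codon sequence
example_mrna = transcribe(example_dna)
print(example_mrna, "->", translate(example_mrna))
```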
Marker types
• RFLPs
• RAPDs
• AFLPs
• SSRs
• SNPs
• SFPs
• Others
Restriction Fragment Length Polymorphism (RFLPs)
cDNA clones
Genomic clones
RFLPs
☺ Co-dominant
☺ Detect all alleles simultaneously
☺ Good across related species
☺ Basis (anchors) of many species maps
Too costly and labor intensive for breeding
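To illustrate where an RFLP comes from, the sketch below digests two alleles in silico with EcoRI (recognition site GAATTC, cut after the first G). The allele sequences are invented; in the second one a single base change removes a restriction site, so the fragment lengths differ.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a sequence at every occurrence of a recognition site; return the fragments."""
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])
    return fragments


# Hypothetical alleles: allele_2 carries a point mutation (GAATTC -> GAGTTC) in the second site.
allele_1 = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 5
allele_2 = "A" * 10 + "GAATTC" + "C" * 20 + "GAGTTC" + "T" * 5

for name, allele in (("allele_1", allele_1), ("allele_2", allele_2)):
    print(name, [len(fragment) for fragment in digest(allele)])
```

Because each allele gives its own fragment pattern, a heterozygote shows both patterns, which is why RFLPs are scored as co-dominant.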
Random Amplified Polymorphic DNA (RAPDs)
University of Saskatchewan
RAPDs
☺ No sequence information needed
☺ Universal primer set
Reproducibility problems
Amplified Fragment Length Polymorphism
genomic DNA
Restriction enzyme digestion
Adaptor ligation
Selective PCR amplification
AFLP fingerprint
AFLP® characteristics
☺ Multiplex PCR
☺ “Competition PCR”: quantitative detection
☺ No sequence information required
☺ Size-based fragment discrimination
☺ Transcript and marker discovery
☺ Transcript and marker detection
Universal technology (proprietary)
Inter-MITE Polymorphism (IMP), inter-SSR, inter-RGA
Marker types…
Amplifies DNA between MITEs (miniature inverted-repeat transposable elements)
MITEs are well distributed throughout most genomes
Template DNA
MITEs
Each end of the MITE is characterized by an inverted repeat sequence
Terminal inverted repeats
Inter MITE DNA is amplified by PCR
PCR Amplification
The numerous polymorphic bands create a distinct fingerprint for each line
‘Inter’ markers
☺ High multiplexing value
  • 15 to 75 loci per reaction
  • High throughput
  • Cost-effective
☺ Distributed throughout the genome
☺ Good level of intra-species variation
☺ High level of cross applicability
Dominant markers
May not be in coding regions
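The practical difference between dominant and co-dominant markers shows up in segregation data. The sketch below assumes a diploid F2 population, where a dominant marker is expected to segregate 3:1 (band present : absent) and a co-dominant marker 1:2:1. The counts are made up, and polyploid crops follow other ratios.

```python
def chi_square(observed, ratio):
    """Chi-square statistic for observed counts against an expected segregation ratio."""
    if len(observed) != len(ratio):
        raise ValueError("observed counts and ratio must have the same number of classes")
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * part / ratio_sum for part in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical F2 counts from 100 plants.
dominant_counts = [78, 22]        # band present, band absent
codominant_counts = [24, 52, 24]  # homozygote 1, heterozygote, homozygote 2

# Critical values at the 5% level: 3.841 (1 degree of freedom), 5.991 (2 degrees of freedom).
print("dominant 3:1      chi2 =", round(chi_square(dominant_counts, (3, 1)), 3))
print("co-dominant 1:2:1 chi2 =", round(chi_square(codominant_counts, (1, 2, 1)), 3))
```

With a dominant marker the heterozygotes are hidden inside the "band present" class, so each scored plant carries less information than with a co-dominant marker.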
Tomato
tcactttgcagtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtcccgttcagtcactttgcagtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtgtcccgttcag
Simple Sequence Repeat (microsatellites)
PCR
Simple Sequence Repeats
☺ Medium abundance
☺ Medium throughput
☺ Available in many crops
Need sequence information
May or may not be associated with genes
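SSR alleles differ in the number of repeat units, which shows up as a length difference in the PCR product. A minimal sketch follows. The flanking sequences mirror the tomato example, but the repeat counts (12 and 15) are made up rather than counted from the slide.

```python
import re


def longest_repeat(sequence, motif="gt"):
    """Number of motif units in the longest uninterrupted run of the motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence.lower())
    return max((len(run) // len(motif) for run in runs), default=0)


LEFT_FLANK = "tcactttgca"
RIGHT_FLANK = "cccgttcag"

# Hypothetical alleles with 12 and 15 GT units.
allele_a = LEFT_FLANK + "gt" * 12 + RIGHT_FLANK
allele_b = LEFT_FLANK + "gt" * 15 + RIGHT_FLANK

for name, allele in (("allele_a", allele_a), ("allele_b", allele_b)):
    print(f"{name}: {longest_repeat(allele)} GT units, product length {len(allele)} bp")
```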
Single-nucleotide polymorphisms (SNPs)
cgtgtactgacctgcatgctatgaatcagtacatcgactagctt
cgtgtactgacctgcatgctaggaatcagtacatcgactagctt
☺ Highly abundant
  • roughly 1 per 100-2000 base pairs
☺ Distributed throughout genome including genes
☺ Genetically stable
☺ Typically biallelic
☺ Can be scored as a +/- marker
☺ Mutation may be diagnostic
SNPs
Limited information per locus
Need sequence information
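Finding a SNP between two aligned alleles is a position-by-position comparison. The sketch below uses the two sequences from the SNP slide and assumes they are already aligned and of equal length. The function name is illustrative.

```python
def find_snps(allele_1, allele_2):
    """Return (0-based position, base in allele 1, base in allele 2) for each mismatch."""
    if len(allele_1) != len(allele_2):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(allele_1, allele_2))
        if a != b
    ]


ALLELE_1 = "cgtgtactgacctgcatgctatgaatcagtacatcgactagctt"
ALLELE_2 = "cgtgtactgacctgcatgctaggaatcagtacatcgactagctt"

for position, base_1, base_2 in find_snps(ALLELE_1, ALLELE_2):
    print(f"SNP at position {position + 1}: {base_1} -> {base_2}")
```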
Single Feature Polymorphisms
[Figure: probe intensity for array features A through N in Genotype 1 and Genotype 2, with the SFP marked]
SFPs
☺ Based on SNPs and insertions/deletions
  • Abundant
☺ Distributed throughout genome including genes
☺ Genetically stable
☺ Highly multiplexable
Dominant
Need sequence information
Diversity array technology
DArTs
☺ Medium throughput
☺ Multiplex
Dominant markers
Semi-fixed assays
Use: SNPs
Why move to SNP maps?
Microsatellite markers create maps with large gaps: appropriate for within-family studies
SNPs
SNPs create dense maps to pinpoint regions across the population
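The gap argument can be made concrete by comparing the largest interval between adjacent markers on one chromosome. Both sets of map positions below are invented for illustration: a sparse SSR-style map and a SNP-style map with a marker every 5 cM.

```python
def largest_gap(positions_cm):
    """Largest distance (cM) between adjacent markers on one chromosome."""
    ordered = sorted(positions_cm)
    if len(ordered) < 2:
        return 0
    return max(right - left for left, right in zip(ordered, ordered[1:]))


# Hypothetical marker positions (cM) on a 95 cM chromosome.
ssr_map = [0, 18, 41, 77, 95]
snp_map = list(range(0, 96, 5))

print("SSR map: largest gap =", largest_gap(ssr_map), "cM")
print("SNP map: largest gap =", largest_gap(snp_map), "cM")
```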
Marker Detection
• Hybridization
• Amplification
• Electrophoresis
• Fluorescence
Polymerase Chain Reaction
Taken from the National Health Museum gallery
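PCR amplification is exponential: each cycle can at most double the number of target copies. A small sketch, assuming a constant per-cycle efficiency (1.0 means perfect doubling). Real reactions plateau in later cycles, which this model ignores.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies after a number of cycles: N = N0 * (1 + efficiency) ** cycles."""
    if not 0 <= efficiency <= 1:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1 + efficiency) ** cycles


for cycles in (10, 20, 30):
    ideal = pcr_copies(1, cycles)
    reduced = pcr_copies(1, cycles, efficiency=0.9)
    print(f"{cycles} cycles: {ideal:.3g} copies (ideal), {reduced:.3g} copies (90% efficiency)")
```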
SNP technologies
Hybridization
Single base pair extension
Allele-specific PCR
Agarose Gel Electrophoresis
☺ Easy
☺ Universal
Expensive
Low throughput
Use: RAPDs, SSRs, SNPs, RFLPs, AFLPs
Automated Gel Electrophoresis
☺ Easy
☺ High resolution
☺ Automated
☺ High throughput
Expensive equipment
Use: SSRs, AFLPs, SNPs, IMPs
Real Time PCR
Real-Time PCR cont’d
☺ Easy
☺ Automated
☺ High throughput
Expensive equipment
Use: SNPs
Fluidigm
96 samples x 96 assays
Pyrosequencing
☺ Automated
☺ Medium throughput
Expensive equipment
Use: SNPs
Invader® Assay for SNP Detection
Biplex FRET Format
[Figure: Invader® oligo with WT probe and Mut probe on the target, cleavage sites, released 5´ flaps, and FRET cassettes 1 and 2 (fluorophores F1 and F2 with quenchers Q)]
Invader
☺ Automated
☺ High throughput
☺ Highly sensitive
☺ Flexible
☺ Quantitative
Minimum amount reagents required
Use: SNPs
Mass Spec
Mass Spec
☺ Medium throughput
☺ Multiplex
☺ Inexpensive reagents
☺ Automated
Need amplification
Expensive equipment
Melting Curve Analysis
[Figure: melting curves separating homozygotes (homos) from heterozygotes (het)]
“Liquid” Arrays
☺ Automated
☺ High throughput
☺ Highly sensitive
☺ Multiplex
☺ Flexible
Expensive equipment
Use: SNPs
Illumina
2-60,000 SNPs x 96 samples: <$0.01-0.15 per data point (dp)
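A data point is one SNP scored on one sample, so project cost scales with SNPs x samples. The sketch below shows the arithmetic. The 384-SNP panel and the $0.15 price are illustrative choices within the range quoted above, not vendor figures.

```python
def genotyping_cost(n_snps, n_samples, cost_per_datapoint):
    """Return (number of data points, total cost) for a genotyping run."""
    if min(n_snps, n_samples) < 0 or cost_per_datapoint < 0:
        raise ValueError("inputs cannot be negative")
    data_points = n_snps * n_samples
    return data_points, data_points * cost_per_datapoint


# Hypothetical run: 384 SNPs on 96 samples at $0.15 per data point.
points, total = genotyping_cost(384, 96, 0.15)
print(f"{points} data points, total ${total:,.2f}")
```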
Experimental Procedure
SNP technologies
Technology             Samples     SNPs         Cost/SNP
Agarose Gels           10-384      10-384       high
Polyacrylamide Gels    10-384      10-384       high
Real Time PCR          96-1,500    1-100        low
Fluidigm               12-15,000   12-96        low-very low
Invader                96-1000s    96+          very low
Pyrosequencing         96-384      100s         med
Mass Spec              96-384      100s         med
Melting curve          10-384      100s         med
Illumina Bead Express  480         1-384        med
Illumina Golden Gate   480         384-1536     low
Illumina Infinium      1152        7,600-100k   very low
Marker Attributes
Marker                RFLPs         RAPDs  SSRs     AFLPs/IMPs  SFPs           SNPs
Development costs     high          low    high     low         high           med-high
Technical complexity  high          low    low      med         med            low
Automated             no            no     med      med         semi           yes
Reproducibility       high          low    high     med         med            high
Cross species         yes           no     yes      no          yes            no
Segregation           co-dom        dom    co-dom   dom         dom            co-dom
Information content   genomic/gene  none   genomic  none        genomic/genes  genomic/genes
For Breeding          no            no     yes      yes         no             yes
Cost/datapoint: high low med low low; $0.5-1.00; $<0.01-0.20