The potential benefits of the potato genome sequence and high-throughput SNP platform to breeding
David Douches1, Candice N. Hansey2, Alicia Massa2, Kim Felcher1, Joseph Coombs1 and C. Robin Buell2.
1Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48824; 2Department of Plant Biology, Michigan State University, East Lansing, MI 48824
The Potato: Our favorite vegetable
• Potatoes are the world’s 3rd most important crop, especially in developing countries
• Americans eat ~57 kg (126 pounds) of potatoes per year (fries and chips)
• Breeding is challenging and relies on antiquated methods – Most cultivated potatoes are tetraploid, highly heterozygous, vegetatively propagated, and not all are fertile
• Can genomics provide insight into unique aspects of potato biology/genetics and can this be used to improve potato as a crop?
Doubled Monoploid DM 1-3 516 R44
• Doubled monoploid line DM 1-3 516 R44 of adapted Solanum tuberosum Group Phureja (from Richard Veilleux, Virginia Tech, USA)
• Reduced complexity for whole genome shotgun sequencing due to homozygosity
• Taxonomic study (Spooner et al. 2007) suggests it is the same species as S. tuberosum
• Very slow growing, presumably due to increased ‘genetic load’ caused by exposure of inferior alleles to environment and homozygosity
• Genome size based on flow cytometry ~850 Mb
The Potato Genome
• Assembled DM genome (727 Mb)
• WGS of RH
• RH BAC sequences
• First asterid genome published
• ~39,000 genes
Potato Breeding Challenges
• Potato breeding – currently a phenotypic based process.
• “A lot” of molecular markers for a potato breeder pre-2011 was 4
• Tetraploid breeding and genetics
• Vegetative propagation
• The challenge has been for the breeder to combine the market-driven quality with the agronomic performance and host plant resistance needed by the growers.
What is SolCAP?
The SolCAP project is a coordinated agricultural project that links together people from public institutions, private institutions and industries who are dedicated to the improvement of the Solanaceae crops: potato and tomato. Through innovative research, education and extension the SolCAP project will focus on providing significant benefits to both the consumer and the environment.
The SolCAP project is supported by the Agriculture and Food Research Initiative Applied Plant Genomics CAP Program of the USDA’s National Institute of Food and Agriculture
SolCAP Overall Research Objective
• To reduce the gap between genomics and breeding, SolCAP will provide infrastructure to link allelic variation of SNPs in genes to valuable traits.
– Identify up to 10,000 SNPs for tomato and potato in elite germplasm
– Genotype germplasm panels and mapping populations with Illumina Infinium potato and tomato SNP arrays
Potato SNP Discovery
Align reads to contigs with Maq pipeline
2,263,279 SNPs
Filter SNPs for read depth, density, and quality with Maq SNPFilter
575,340 SNPs (Filtered SNPs)
Align contigs to genome sequence and link SNPs from each variety to a genomic position
80,986 SNPs
Remove SNPs that are not biallelic and SNPs within 50 bp of an intron
69,011 SNPs (High Confidence SNPs)
RNA-Seq Reads Sanger ESTs
Assemble ESTs per variety using TGICL and call SNPs
8,327 SNPs
Filter SNPs for read depth and density using custom perl script
2,358 SNPs (Filtered SNPs)
Hamilton et al., 2012
Varieties: Atlantic, Premier Russet, Snowden, Bintje, Kennebec, Shepody
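The filtering steps in the SNP discovery flow can be sketched as a simple predicate plus a retention calculation. This is a minimal illustration, not the Maq or custom Perl code used in the project: the `Snp` record, its field names, and the `MIN_DEPTH` value are assumptions; only the biallelic requirement, the 50 bp intron distance, and the SNP counts come from the slide.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    contig: str
    position: int
    alleles: tuple      # observed alleles, e.g. ("A", "G")
    read_depth: int
    bp_to_intron: int   # distance in bp to the nearest intron


MIN_DEPTH = 20            # hypothetical cut-off; the slide gives no value
MIN_BP_TO_INTRON = 50     # from the slide; treating the boundary as exclusive is an assumption


def is_high_confidence(snp):
    """Keep biallelic SNPs with enough reads that are not close to an intron."""
    return (len(snp.alleles) == 2
            and snp.read_depth >= MIN_DEPTH
            and snp.bp_to_intron > MIN_BP_TO_INTRON)


def retention(before, after):
    """Percentage of SNPs surviving a filtering step."""
    return 100.0 * after / before


# Counts from the slide: SNPs with a genomic position vs. high confidence SNPs
print(round(retention(80986, 69011), 1))
```

The same `retention` helper applies to any pair of successive counts in the flow, for example 2,263,279 to 575,340 after the read depth, density, and quality filter.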
• Unique oligo for each bead type
• Bead Pool is 250,000 per sample
• Random self-assembly of beads onto the chip
• Redundancy averages 15 to 30 beads of each type
• 8,303 SNPs on Illumina Infinium chip
• 24 samples per chip
SNP Detection: Infinium 8303 Potato Array
Infinium 8303 Potato Array (A genome-wide set of SNP markers)
Number of SNPs | Reason Selected
3,018 | In community-provided candidate genes
536 | Previously identified genetic markers
4,749 | Genome-wide coverage
[Figure: SNP density (SNPs/100 kb) and gene density (Genes/100 kb)]
Felcher et al., 2012
Calling SNPs with Infinium 8303 Potato Array
• SolCAP custom potato calling file
– Based on potato diversity panel, two 4x populations and one 2x population
– http://solcap.msu.edu
• 3 Cluster Calling
– Good: 7412 (89.3%)
– Questionable: 296 (3.6%)
– Segregation: 254 (3.1%)
– Bad: 341 (4.1%)
• Call Rate for only good markers (7412)
– >90%: 7036 (94.9%)
– >80%: 228 (3.1%)
– >70%: 93 (1.3%)
– <70%: 55 (0.7%)
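The call rate summary can be reproduced with a few lines of code. This is a sketch under assumptions: a no-call is represented as `None`, and the handling of values falling exactly on a bin boundary is a guess, since the slide does not specify it.

```python
def call_rate(calls):
    """Fraction of samples with a genotype call; None marks a no-call."""
    return sum(c is not None for c in calls) / len(calls)


def bin_call_rates(rates):
    """Count markers per call-rate bin, using the bin labels from the slide."""
    bins = {">90%": 0, ">80%": 0, ">70%": 0, "<70%": 0}
    for r in rates:
        if r > 0.9:
            bins[">90%"] += 1
        elif r > 0.8:
            bins[">80%"] += 1
        elif r > 0.7:
            bins[">70%"] += 1
        else:
            bins["<70%"] += 1   # assumption: exactly 70% falls in the lowest bin
    return bins


# One marker scored on four samples, one of them a no-call
print(call_rate(["AA", None, "AB", "BB"]))
print(bin_call_rates([0.95, 0.85, 0.75, 0.50]))
```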
SolCAP Genome-wide set of SNP markers: potato activities
• Assess concordance between map location of SNPs and the potato genome sequence location (based on pseudomolecules).
• SNP genotyping of potato diversity panel, etc.
• SNP genotyping tetraploid mapping populations and generate SNP-based genetic maps
• QTL Analysis of tetraploid populations
• Genomic insights
• Double reduction
• Residual heterozygosity
Diploid Mapping Populations for SNP concordance
• DRH (92 progeny) – DM x RH (from Virginia Tech) was selected for mapping because the RH parent has been used extensively in potato mapping studies and genome sequencing.
• D84 (92 progeny) – DM x 84SD22 (from MSU) was selected for mapping because 84SD22 was shown to have a higher percentage of polymorphic SNPs.
[Figure: D84 chromosomes 1-6]
[Figure: D84 chromosomes 7-12]
Comparison of SNPs in the 2x populations DRH and D84
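Column groups, left to right: Number of SNPs, including co-segregating SNPs (DRH, D84, Common); Mapped segregating SNPs (DRH, D84, Common); Length in cM (DRH, D84); Length in Mb (DRH, D84)
Chromosome DRH D84 Common DRH D84 Common DRH D84 DRH D84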
1 268 279 114 121 76 14 125 98 81 81
2 208 270 103 97 55 17 79 53 46 46
3 88 239 26 64 46 6 78 61 48 48
4 230 186 74 105 53 12 89 91 64 64
5 144 158 52 55 46 9 100 65 47 47
6 213 216 110 90 59 19 66 65 52 55
7 146 245 52 66 49 5 70 47 53 53
8 147 183 57 74 48 11 71 67 43 43
9 164 195 62 89 57 8 100 69 53 53
10 115 131 51 66 43 14 82 63 52 52
11 131 171 45 74 50 8 76 48 42 42
12 106 181 41 43 55 4 31 65 54 59
Total 1960 2454 787 944 637 127 965 792 634 642
Concordance between the genetic and the physical maps and estimated genome-wide recombination rates
• Figure of chromosome 4 (D84) showing the genetic location (cM) and the physical position (Mb) of 204 markers, and the estimated local recombination
Felcher, K., J. Coombs, A.N. Massa, C. Hansey, J. Hamilton, R. Veilleux, C. R. Buell, D. Douches (2012)
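A rough way to express recombination rate is genetic length divided by physical length (cM/Mb). The sketch below applies this to the genome-wide totals from the comparison table and to adjacent marker pairs. The function names and the `(physical_mb, genetic_cm)` tuple layout are illustrative assumptions, not the method used for the published figures.

```python
def recombination_rate(length_cm, length_mb):
    """Average recombination rate in cM per Mb."""
    return length_cm / length_mb


def local_rates(markers):
    """Rate per adjacent marker interval.

    markers: iterable of (physical_mb, genetic_cm) tuples.
    A negative rate means the genetic order disagrees with the physical
    order, which can flag an inversion or a mis-ordered scaffold.
    """
    ordered = sorted(markers)
    rates = []
    for (mb1, cm1), (mb2, cm2) in zip(ordered, ordered[1:]):
        if mb2 > mb1:  # skip markers at the same physical position
            rates.append((cm2 - cm1) / (mb2 - mb1))
    return rates


# Genome-wide totals from the table: (cM, Mb)
totals = {"DRH": (965, 634), "D84": (792, 642)}
for population, (cm, mb) in totals.items():
    print(population, round(recombination_rate(cm, mb), 2), "cM/Mb")
```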
Local inversions and/or mis-ordering of the super-scaffolds
• Graph of chromosome 10 (DRH) showing the genetic location (cM) and the physical position (Mb) of 98 markers, and the estimated local recombination
Local inversions and/or mis-ordering of the super-scaffolds
• Superscaffolds that appear to have mis-alignments based on comparison with the genetic maps
Chromosome Superscaffold ID Type of re-arrangement
3 PGSC0003DMB000000126 order/orientation
3 PGSC0003DMB000000127 orientation
3 PGSC0003DMB000000040 orientation
4 PGSC0003DMB000000420 order
4 PGSC0003DMB000000294 order
6 PGSC0003DMB000000158 orientation
7 PGSC0003DMB000000096 orientation
7 PGSC0003DMB000000302 orientation
7 PGSC0003DMB000000076 orientation
8 PGSC0003DMB000000048 orientation
9 PGSC0003DMB000000409 order
9 PGSC0003DMB000000848 order
10 PGSC0003DMB000000106 order
10 PGSC0003DMB000000436 order
10 PGSC0003DMB000000736 order
10 PGSC0003DMB000000129 order/orientation
10 PGSC0003DMB000000379 order/orientation
10 PGSC0003DMB000000506 order
11 PGSC0003DMB000000365 order
11 PGSC0003DMB000000354 order
11 PGSC0003DMB000000549 order
11 PGSC0003DMB000000168 order/orientation
GenomeStudio Software
• Designed for use with diploid species
– Clusters are called as AA, AB, BB
• Potato is tetraploid with at least 5 marker classes
– AAAA, AAAB, AABB, ABBB, BBBB, also nulls (i.e. AAA)
Diploid model in GenomeStudio does not work for tetraploid potato!!
Calling SNPs with Infinium 8303 Potato Array
• 5 cluster custom calling using theta values
– Based on potato diversity panel, two 4x populations and one 2x population (same as 3 cluster calling)
• Summary of SNP categories:
– Total: 5031
– 5 clusters: 2645
– 4 clusters: 858
– 3 clusters: 945
– 2 clusters: 583
– 1 cluster or bad SNPs: 3272
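Five cluster calling assigns each sample to one of the five dosage classes according to its theta value. The sketch below shows the idea with a nearest-centre rule. The evenly spaced cluster centres and the `MAX_DISTANCE` tolerance are purely illustrative assumptions; real cluster positions differ per SNP and are defined in the SolCAP custom calling file, not by this code.

```python
# Assumed cluster centres on a theta axis running from 0 (AAAA) to 1 (BBBB).
CENTERS = {"AAAA": 0.0, "AAAB": 0.25, "AABB": 0.5, "ABBB": 0.75, "BBBB": 1.0}
MAX_DISTANCE = 0.08  # hypothetical tolerance; points farther away become no-calls


def call_dosage(theta):
    """Return the dosage class whose centre is nearest to theta, or 'NC'."""
    genotype, center = min(CENTERS.items(), key=lambda kv: abs(kv[1] - theta))
    return genotype if abs(center - theta) <= MAX_DISTANCE else "NC"


for theta in (0.02, 0.27, 0.52, 0.38, 0.98):
    print(theta, call_dosage(theta))
```

A sample lying between two clusters (0.38 here) is left uncalled rather than forced into a class, which mirrors the NC column in the progeny tables.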
Scoring Tetraploid Potato Five cluster calling
Progeny
PR RG AAAA AAAB AABB ABBB BBBB NC
AAAB AAAA 93 93 0 0 0 0
Expected Ratio 1 1
Progeny
PR RG AAAA AAAB AABB ABBB BBBB NC
AAAA AABB 37 127 23 0 0 0
Expected Ratio 1 4 1
Progeny
PR RG AAAA AAAB AABB ABBB BBBB NC
AABB AAAB 10 78 73 20 0 6
Expected Ratio 1 5 5 1
Progeny
PR RG AAAA AAAB AABB ABBB BBBB NC
AABB AABB 8 40 95 41 3 0
Expected Ratio 1 8 18 8 1
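The expected ratios in these tables follow from random chromosome segregation in an autotetraploid: each parent passes on two of its four homologs, giving six equally likely gametes. The sketch below derives the ratios and computes a chi-square statistic for one observed cross. It assumes no double reduction; the function names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def gamete_freqs(genotype):
    """Frequencies of 2-allele gametes from a tetraploid genotype such as 'AABB'."""
    pairs = list(combinations(genotype, 2))  # 6 ways to choose 2 of 4 homologs
    counts = Counter("".join(sorted(p)) for p in pairs)
    return {g: Fraction(n, len(pairs)) for g, n in counts.items()}


def progeny_freqs(parent1, parent2):
    """Expected progeny genotype frequencies for a cross."""
    out = Counter()
    for g1, f1 in gamete_freqs(parent1).items():
        for g2, f2 in gamete_freqs(parent2).items():
            out["".join(sorted(g1 + g2))] += f1 * f2
    return dict(out)


def chi_square(observed, expected_freqs):
    """Chi-square statistic of observed counts against expected frequencies."""
    total = sum(observed.values())
    return sum((observed.get(g, 0) - total * float(f)) ** 2 / (total * float(f))
               for g, f in expected_freqs.items())


print(progeny_freqs("AABB", "AABB"))   # fractions equivalent to 1:8:18:8:1
expected = progeny_freqs("AAAA", "AABB")
print(chi_square({"AAAA": 37, "AAAB": 127, "AABB": 23}, expected))
```

With three classes there are 2 degrees of freedom, so a statistic below 5.99 would not reject the 1:4:1 expectation at the 5% level.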
Scoring Tetraploid Potato Five cluster calling
Elite Potato Germplasm in North America
• First clone released in ~1850 (Rough Purple Chili)
• 150 plus years later, hundreds of released varieties
How has a century of potato breeding affected the potato genome?
What is the extent of heterozygosity in potato?
How much phenotypic divergence is observed between the market classes?
SolCAP Diversity Panel
• Diversity Panel with 250 clones
– Representation of important market classes
• 69 Chip Processing
• 25 Genetic Stock
• 32 Pigmented
• 34 Processing Russet
• 38 Round White Table
• 12 Species
• 13 Table Russet
• 27 Yellow
– Ploidy levels
• 221 4x clones
• 27 2x clones
• 2 1x clones
Ag Canada Europe Mexico Peru
Hirsch et al., in prep
Quality of Genotypic Data
• 215 out of the 250 clones (86%) were genotyped twice and had less than 0.57% difference between the replicates in the diploid model
• Filtered SNPs to contain less than 20% missing data
– 6,373 SNPs in the filtered diploid model (simplified with all 2x)
– 3,763 SNPs in the filtered dosage model (4x-AAAA, 2x-AA, 1x-A)
Hirsch et al., in prep
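The missing-data filter can be sketched as follows. The dictionary layout (SNP name mapped to a list of calls, with `None` for a missing call) is an assumption for illustration; only the 20% threshold comes from the slide.

```python
def missing_fraction(calls):
    """Fraction of missing calls; None marks a missing genotype."""
    return sum(c is None for c in calls) / len(calls)


def filter_snps(genotypes, max_missing=0.20):
    """Keep SNPs with less than max_missing missing data.

    genotypes: dict mapping SNP name -> list of calls across clones.
    """
    return {snp: calls for snp, calls in genotypes.items()
            if missing_fraction(calls) < max_missing}


example = {
    "snp1": ["AA"] * 9 + [None],          # 10% missing, kept
    "snp2": ["AA", None, None, "AB"],     # 50% missing, removed
}
print(sorted(filter_snps(example)))
```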
[Figure: histograms of Frequency versus Percent Missing (0-100) in four panels: SNP Diploid Filtered, SNP Dosage Filtered, Clone Diploid Filtered, Clone Dosage Filtered]
Genetic Relationship Between Clones
Red – Chip Processing Dark Blue – Genetic Stock Purple – Pigmented Green – Processing Russet Light Blue – Round White Table Pink – Species Brown – Table Russet Yellow - Yellow
Hirsch et al., in prep
UPGMA tree from Rogers’ allele frequency based distances
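For reference, Rogers’ distance between two entries averages, over loci, the square root of half the summed squared allele frequency differences. The sketch below is a minimal pure-Python version; treating each clone's dosage call as its allele frequencies (for example AAAB as 0.75 A and 0.25 B) is an assumption made here for illustration and may differ from the analysis behind the tree.

```python
from math import sqrt


def dosage_to_freqs(genotype):
    """Allele frequencies within one genotype call, e.g. 'AAAB' -> A 0.75, B 0.25."""
    n = len(genotype)
    return {allele: genotype.count(allele) / n for allele in set(genotype)}


def rogers_distance(freqs_x, freqs_y):
    """Rogers' distance; each argument is a list with one allele->frequency dict per locus."""
    total = 0.0
    for px, py in zip(freqs_x, freqs_y):
        alleles = set(px) | set(py)
        total += sqrt(0.5 * sum((px.get(a, 0.0) - py.get(a, 0.0)) ** 2 for a in alleles))
    return total / len(freqs_x)


clone_x = [dosage_to_freqs(g) for g in ("AAAA", "AAAB")]
clone_y = [dosage_to_freqs(g) for g in ("BBBB", "AABB")]
print(rogers_distance(clone_x, clone_y))
```

A matrix of such pairwise distances can then be clustered by average linkage, which is what UPGMA does.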
Divergence of Market Classes
Hamilton et al., 2012 Hirsch et al., in prep
Green – Processing Russet Brown – Table Russet
Russet germplasm groups tightly regardless of whether it was bred for processing (french fries) or table markets
[Figure labels: Species; Processing Russet; Table Russet; Chip Processing; Yellow; Pigmented; Round White Table; Diploid Breeding Line; Genetic Stock]
Subgroups within Market Classes
Hirsch et al., in prep
• There is divergence of market classes
• There are also subgroups within the market classes, particularly in the chippers
Divergence of Market Classes
Hirsch et al., in prep
A century of potato breeding has resulted in clear genetic differentiation of germplasm within market classes
Red – Chip Processing Dark Blue – Genetic Stock Purple – Pigmented Green – Processing Russet Light Blue – Round White Table Pink – Species Brown – Table Russet Yellow - Yellow
Percent Heterozygosity in Potato
Average percent heterozygosity in the panel is 51.21%
Hirsch et al., in prep
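Percent heterozygosity for a clone can be computed as the share of called SNPs carrying more than one allele. This is a sketch assuming genotype calls are strings such as "AB" or "AAAB" and `None` marks a missing call; it is not necessarily the exact definition used for the panel average.

```python
def percent_heterozygosity(calls):
    """Percentage of called SNPs that carry more than one allele."""
    called = [c for c in calls if c is not None]
    if not called:
        raise ValueError("no genotype calls available")
    heterozygous = sum(len(set(c)) > 1 for c in called)
    return 100.0 * heterozygous / len(called)


# Tetraploid clone: two of the four called SNPs are heterozygous
print(percent_heterozygosity(["AAAA", "AAAB", "AABB", "BBBB", None]))
```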
Percent Heterozygosity in Potato
Heterozygosity is much lower in the species and genetic stocks
Hirsch et al., in prep
Phenotypic Evaluation
• Traits measured at two locations (Wisconsin (Jansky and Bethke) and New York (De Jong)) with two replications per location in the summer of 2010
• Traits measured for biochemical composition, growth descriptors, tuber phenotypes, and processing properties
• Only tetraploid clones were phenotyped
Hirsch et al., in prep
Market Class Phenotypic Divergence
[Figure: histograms of Count versus Fructose (mg/g) and Count versus Sucrose (mg/g); panel labels: Chip Processing and Processing Russet; Chip Processing]
Phenotypic divergence between market classes is observed for many of the traits in the expected patterns given the selective pressures placed on each market class.
Hirsch et al., in prep
Market Class Phenotypic Divergence
[Figure: histogram of Count versus Tuber Length (mm); panel label: Processing Russet and Table Russet]
Tuber appearance traits also diverged among market classes
Hirsch et al., in prep
Market Class Phenotypic Divergence
[Figure: histogram of Count versus Yield per 10 Plant Plot (kg)]
Not all traits, such as yield, demonstrated clear market class divergence.
Hirsch et al., in prep
Ongoing Work
• Allelic composition over time
• Phenotypic divergence over time
• Tracking genes selected through pedigrees important for traits of interest
• Role of wild species in germplasm diversity
• Association mapping
Blindauer, C.A., and R. Schmid. 2010. Cytosolic metal handling in plants: determinants for zinc specificity in metal transporters and metallothioneins. Metallomics 2: 510-529.
HMA ATPases – Heavy Metal Associated transporting ATPases involved in metal transport from the cytosol
MTP – Metal Tolerance Proteins involved in membrane-bound transport
ZIP – Zinc/Iron Permease responsible for cellular metal ion uptake, especially in roots
“The percentage of genes coding for zinc-binding proteins in eukaryotes is estimated conservatively at around 10%.”
Gene model | Putative function
PGSC0003DMG402004858 | C2H2L domain class transcription factor
PGSC0003DMG400013784 | Non-LTR retrotransposon reverse transcriptase
PGSC0003DMG400022166 | SEC14 cytosolic factor
PGSC0003DMG400030728 | Zinc ion binding protein
PGSC0003DMG400029221 | Hypothetical protein
PGSC0003DMG401002262 | ATP-binding cassette transporter
PGSC0003DMG400031065 | Pectate lyase
Filter | Fe | Zn
Polymorphic | 7,869 | 7,869
F-value < 0.02 and R2 value > 0.2 | 362 | 100
No greater than 3 missing calls | 301 | 91
Mean difference in Fe or Zn > 5 mg/kg DW between alternate genotypes | 246 | 70
Genotype class > 3 individuals | 166 | 47
Genetic distances are in Mbp
Mapping Economically Important Traits in Tetraploid potato using Genome-wide SNPs
Late blight
Scab
Chip-processing
Colorado potato beetle
PVY
Tetraploid Mapping in Potato MSV507
• High chip quality
– (low reducing sugar)
• High specific gravity
– (high starch)
• Scab susceptible
• Moderate reducing sugar
• Intermediate SG
• Scab resistant
Kalkaska x Tundra
MSV507
• 200 progeny
• Germinated seed in tissue culture
• Winter 2010
• Propagated tissue culture transplants for field seed increase in year 1 (2010)
• First replicated field trials in year 2 (2011)
MSV507 2011 Field Trials
• MSV507 at Montcalm Research Center (MRC)
– Randomized complete block design
– 4 reps of 5 hill plots
– Planted at MRC scab nursery
• MSV507 at Lake City Research Center (LCRC)
– Augmented field design
– 20 hill plots
– Seed increase
Traits for Evaluation
• Scab rating (MRC)
• Average tuber weight
• Chip color
– Out of the field
– 7°C (45°F) Storage, 6 mo.
• Specific Gravity
• Asparagine
• Acrylamide
Scab Rating Scale Breeder Scale 0-5
3.5 4.5 4.0
2.0 3.0 2.5
0.5 1.0 1.5
Plant Pathology Scab Rating
Lesion Type:
0 - No lesions
0.5 - Brown CS-like (small and star-shaped)
1 - Superficial lesions, discrete
2 - Superficial coalescing lesions
3 - Raised lesions, discrete
4 - Coalescent raised lesions
5 - Discrete pitted lesions
6 - Coalescing pitted lesions
Lesion Incidence:
MSV507 Scab Field Rating (MRC)
[Figure: histogram of Number of MSV507 Progeny versus Mean Scab Field Rating; parents: Kalkaska 1.4, Tundra 2.6]
MSV507 Scab Pathology Incidence (MRC)
[Figure: histogram of Number of MSV507 Progeny versus Mean Scab Pathology Rating; parents: Tundra 2.6, Kalkaska 1.5]
MSV507 Specific Gravity (MRC)
[Figure: histogram of Number of MSV507 Progeny versus Mean Specific Gravity; parents: Tundra 1.078, Kalkaska 1.074]
MSV507 Average Tuber Weight (MRC)
[Figure: histogram of Number of MSV507 Progeny versus Mean Average Tuber Weight (kg); parents: Tundra 0.074, Kalkaska 0.096]
MSV507 Chip Rating 45F 6mo (MRC)
[Figure: histogram of Number of MSV507 Progeny versus Mean Chip Rating 45F 6mo; parents: Tundra 1.0, Kalkaska 1.5]
Advantage of Sequenced Potato Genome
• Using only SNPs with known pseudomolecule chromosome position
• Concordance previously evaluated in diploid populations
• Physical map becomes a reference for comparison with the genetic map
Tetraploid Mapping
• TetraploidMap Software from BioSS
• http://www.bioss.ac.uk/download/tpmap
• Windows XP
• Designed for AFLP and SSR markers
• Maximum of 800 markers per project
• Maximum of 50 markers per linkage group
• Not effective for markers with double reduction
SNPs used for mapping MSV507
Simplex, Duplex, and Triplex SNP markers
Tundra genotype (columns): AAAA AAAB AABB ABBB BBBB
Kalkaska genotype (rows):
AAAA 181 67 3
AAAB 195 18
AABB 64 82
ABBB 18 274
BBBB 17 94 263
Total Kalkaska Tundra
Simplex 913 469 444
Duplex 307 146 161
Triplex 56 36 20
Total 1276 651 625
Distribution of Simplex, Duplex, and Triplex SNPs in MSV507 by Chromosome
Kalkaska chr01, 50 SNPs with linkage phase
Kalkaska Scab QTLs
Kalkaska Chip Color QTLs
MSV507 QTL Summary
CHR | LOD | % Var | Trait
K01 | 7.0 | 15.6 | Scab Type MRC
K01 | 6.6 | 14.6 | Scab Field Rating MRC
K01 | 5.1 | 10.9 | Scab Washed Rating MRC
K01 | 3.6 | 7.4 | SPGR LC
T01 | 3.8 | 7.1 | Chip 45F 6mo MRC
T01 | 4.6 | 11.2 | Scab Type MRC
T01 | 4.0 | 9.7 | Scab Field Rating MRC
T01 | 3.9 | 8.0 | Scab Washed Rating MRC
T01 | 3.7 | 7.7 | SED 45F 6mo MRC
K02 | 3.3 | 6.1 | Avg tuber wt MRC
K02 | 6.4 | 7.2 | Chip OTF LC
K02 | 5.4 | 11.1 | Scab Incidence MRC
K02 | 3.6 | 7.0 | SPGR LC
K02 | 4.7 | 9.5 | SPGR MRC
T02 | 3.0 | 5.5 | Chip OTF LC
T02 | 4.2 | 10.7 | SPGR MRC
K03 | 4.2 | 9.6 | Avg tuber wt MRC
K03 | 3.7 | 10.0 | Chip 45F 6mo MRC
K03 | 3.2 | 6.2 | SED 45F 6mo MRC
T03 | 3.2 | 6.1 | Chip OTF LC
K04 | 4.5 | 34.9 | Chip 45F 6mo MRC
K04 | 4.0 | 18.4 | Chip OTF LC
K04 | 3.5 | 7.3 | Scab Washed Rating MRC
K04 | 3.3 | 6.4 | SED 45F 6mo MRC
T04 | 3.3 | 6.4 | Avg tuber wt LC
T04 | 3.5 | 6.7 | Avg tuber wt MRC
T04 | 4.0 | 9.4 | SPGR LC
K05 | 5.9 | 14.0 | Avg tuber wt MRC
T05 | 2.9 | 5.7 | Chip OTF LC
K06 | 3.7 | 7.2 | Avg tuber wt MRC
K06 | 3.3 | 6.5 | Chip 45F 6mo MRC
T06 | 3.8 | 8.7 | Avg tuber wt MRC
K07 | 4.9 | 11.0 | Avg tuber wt LC
K07 | 5.4 | 14.1 | Avg tuber wt MRC
K07 | 4.8 | 9.3 | Scab Incidence MRC
K07 | 4.8 | 11.2 | SED 45F 6mo MRC
K07 | 3.5 | 6.6 | SPGR MRC
T07 | 4.4 | 9.1 | Scab Incidence MRC
K08 | 3.0 | 5.2 | SED 45F 6mo MRC
T08 | 3.5 | 7.0 | Chip 45F 6mo MRC
T08 | 3.6 | 7.5 | SPGR LC
T08 | 4.5 | 8.8 | SPGR MRC
K09 | NA | NA | No significant QTLs
T09 | 3.0 | 5.0 | SED 45F 6mo MRC
K10 | 3.1 | 5.8 | Scab Type MRC
K10 | 3.1 | 6.5 | Scab Field Rating MRC
K10 | 3.1 | 7.5 | Scab Washed Rating MRC
T10 | 3.7 | 12.6 | SPGR LC
K11 | 3.0 | 5.5 | Avg tuber wt LC
K11 | 2.8 | 4.4 | Avg tuber wt MRC
K11 | 3.2 | 7.0 | Chip 45F 6mo MRC
K11 | 3.3 | 6.0 | SED 45F 6mo MRC
T11 | 3.9 | 7.5 | Avg tuber wt LC
T11 | 3.9 | 7.5 | Avg tuber wt MRC
T11 | 5.2 | 11.3 | SPGR MRC
K12 | NA | NA | No significant QTLs
T12 | 3.4 | 6.3 | Scab Type MRC
Other US Potato Mapping populations SNP genotyped
• Premier Russet x Rio Grande Russet (SolCAP) – Reducing sugars, processing quality, specific gravity, tuber shape
• Atlantic x Superior (UW) – (tuber calcium, reducing sugars, internal defects, specific gravity (starch))
• B1829-5 x Atlantic (NCSU) – (chip color, internal heat necrosis, specific gravity, maturity)
• Jacqueline Lee x MSG227-2 (MSU) – (specific gravity, late blight resistance, vine maturity)
• Waneta x Pike (Cornell) – (specific gravity, chip color, disease resistance)
• W4 x 524-8 (diploid) (UW) – (specific gravity, chip color, disease resistance)
Premier Russet x Rio Grande Russet Traits Being Evaluated
• specific gravity
• chip color after cold storage
• sucrose/glucose
• skin texture
• tuber shape (l/w/h)
• eye depth
• skin color, flower color
• flesh color
• vine maturity (95, 120 dap)
• growth habit (prostrate, erect, etc.)
• total yield
• heat sprouts
• internal defects
“The key three”
Databases and Resources
• Integrated, breeder-focused resources for genotypic and phenotypic analysis at SGN and MSU
– http://solcap.msu.edu
– http://solanaceae.plantbiology.msu.edu
– http://solgenomics.net
Breeder's Toolbox
Double Reduction in Tetraploids
• Autotetraploids can undergo double reduction that results in (the segments of) two sister chromatids being recovered in a single gamete.
• For this to occur, multivalent pairing must take place with a cross-over between a locus and its centromere followed by the two pairs of chromatids passing to the same pole in anaphase I (adjacent segregation).
Tetraploid Mapping
• Premier Russet (PR) X Rio Grande Russet (RG)
– PRRG – 184 Progeny
– Rich Novy’s population
Double Reduction Example Tetraploid Potato on Infinium Array
Progeny
PR RG AAAA AAAB AABB ABBB BBBB NC
AAAA AAAB 92 86 6 0 0 3
Expected Ratio 1 1
Progeny
PR RG AAAA AAAB AABB ABBB BBBB NC
BBBB ABBB 0 0 6 87 92 2
Expected Ratio 1 1
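In a simplex x nulliplex cross, a progeny carrying two copies of the single-copy allele cannot arise from ordinary segregation, so those progeny mark double reduction (DR) events. The sketch below counts them from the tables above. The function name and the count dictionary layout are illustrative assumptions.

```python
def dr_frequency(counts, minor_allele):
    """Share of called progeny carrying two or more copies of the simplex allele.

    counts: dict mapping progeny genotype (or 'NC' for no-call) -> number of progeny.
    minor_allele: the allele present in a single copy in the simplex parent.
    """
    called = {g: n for g, n in counts.items() if g != "NC"}
    dr = sum(n for g, n in called.items() if g.count(minor_allele) >= 2)
    return dr / sum(called.values())


# AAAA x AAAB: AABB progeny are DR products
first = {"AAAA": 92, "AAAB": 86, "AABB": 6, "NC": 3}
# BBBB x ABBB: AABB progeny are DR products
second = {"AABB": 6, "ABBB": 87, "BBBB": 92, "NC": 2}

print(round(100 * dr_frequency(first, "B"), 2), "% DR")
print(round(100 * dr_frequency(second, "A"), 2), "% DR")
```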
Distribution of Simplex SNPs with Double Reduction in PRxRG
No. of SNPs
No. of DR PR RG Total
0 373 168 541
1 47 68 115
2 19 37 56
3 32 14 46
4 7 13 20
5 7 8 15
6 2 4 6
7 0 1 1
Distribution of Simplex SNPs with Double Reduction in PRxRG by chromosome and parent
Premier Russet Rio Grande
Chromosome No. SNPs No. DR SNPs No. SNPs No. DR SNPs
chr01 46 21 33 19
chr02 56 15 20 15
chr03 48 5 24 4
chr04 43 7 50 30
chr05 33 4 14 6
chr06 47 9 31 8
chr07 36 2 15 6
chr08 46 12 12 10
chr09 39 11 43 15
chr10 43 21 19 8
chr11 17 6 21 11
chr12 33 1 31 13
Total 487 114 313 145
Double Reduction in PRxRG Simplex SNPs by Pseudomolecule Chromosome Position
Is there a homozygous potato?
• Most wild tuber-bearing Solanum species are diploid (2n = 2x = 24) and self-sterile due to the presence of a genetically based gametophytic self-incompatibility system
• It has been difficult to develop inbred lines for breeding and genetics studies.
• Self-compatible genotypes have occasionally been reported
• In S. chacoense, self-compatibility is conditioned by the presence of a dominant allele of an S-locus (self-incompatibility locus) inhibitor gene (Sli)
S. chacoense S7
line 523-3
Objective
• This study was carried out to characterize the distribution of heterozygous SNPs in potato inbred lines that have been self-pollinated for 6-7 generations
Levels of Heterozygosity in SolCAP Germplasm Panel Diploids
Variety/Clone % Heterozygosity Description - Source
DM 0.02 S. phureja
VER_275255 1.22 S. verrucosum
PNT_PI_184774 1.05 S. pinnatisectum
BLB_PI_243510 2.03 S. bulbocastanum
CMM_PI472837 3.15 S. commersonii
MCD_PI_310979 4.76 S. microdontum
RAP_PI_296126 4.85 S. raphanifolium
BER_PI_458365 5.12 S. berthaultii
CHC_275139 6.86 S. chacoense
TF75.5 6.87 S. microdontum
Phu_BARD_1-3 14.07 S. phureja
PP5 14.37
A151-16 20.84
91E22 21.80
M269-HORG 26.10
BER_63 27.41
MCR_205 27.38
RH 28.24
BER_83 28.70
P067-4P 30.01
P066-4 30.70
SH83-92-488 31.11
A146-103Y 33.40
HS66 32.83
84SD22 34.15
A013-19 50.75
A133-57 51.80
P055-1Y 51.91
Levels of heterozygosity S. chacoense selfed lines
• Average SNP heterozygosity ranges from 2.1 to 10 %
• Theoretical heterozygosity should be less than 1%
• 6,931 SNPs:
– 1,243 SNPs were heterozygous in at least one line
– 72 SNPs were heterozygous in at least 80 % of the lines
– 34 heterozygous SNPs were observed in all 21 selfed lines
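The theoretical expectation comes from the fact that each generation of selfing halves heterozygosity at a diploid locus. A minimal sketch, assuming the starting plant is fully heterozygous at the loci considered and that there is no selection:

```python
def expected_heterozygosity(generations, initial=1.0):
    """Expected fraction of heterozygous loci after repeated selfing of a diploid."""
    return initial * 0.5 ** generations


for label, generations in (("S6", 6), ("S7", 7)):
    print(label, round(100 * expected_heterozygosity(generations), 2), "%")
```

Under these assumptions S7 lines fall below 1%, so observed values of 2.1 to 10 % indicate that some heterozygosity is being retained.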
Inbred line | % Het
515-2 | 2.08
516-1 | 2.34
516-6 | 7.70
519-11 | 2.21
522-4 | 2.29
522-5 | 2.08
523-2 | 7.06
523-3 | 7.08
523-10 | 6.23
523-11 | 5.54
524-4 | 9.67
524-5 | 7.16
524-6 | 4.67
524-8 | 7.88
524-9 | 6.98
524-10 | 8.38
524-11 | 7.70
524-12 | 7.81
524-13 | 8.66
525-2 | 7.94
525-4 | 8.21
S7
S6
Levels of heterozygosity S. chacoense selfed lines
Values represent % heterozygosity
Selfed line Mean chr01 chr02 chr03 chr04 chr05 chr06 chr07 chr08 chr09 chr10 chr11 chr12
522-5 2.1 1.1 0.8 1.5 4.7 2.2 1 1.3 2.4 2.2 1.6 1.5 4.7
515-2 2.1 1.1 0.8 1.3 4.7 2.4 1.2 1.3 2 2.2 1.6 1.3 5.2
519-11 2.2 1.1 0.8 1.5 4.4 2.8 1.2 1.5 2.2 2.7 1.9 1.5 5.4
522-4 2.3 1.3 0.8 1.5 5 2.8 1.2 1.6 2.4 2.6 1.9 1.5 5.2
516-1 2.4 1.1 2.7 4.4 3.6 2.4 1.4 1.2 1.8 2.7 2.7 1.3 3.1
524-6 4.7 2.5 3.3 11 3.5 5.4 1.2 7.9 1.8 5.7 3 5 6.4
523-11 5.9 3.2 0.8 2.2 2.5 3.3 3.8 4.6 8.8 15.2 3.3 1.3 22.3
523-10 6.6 2.8 3.8 2.4 2.8 6.4 4.4 8.3 2.9 15.5 9 16.3 4.7
524-9 7.2 3.4 8.6 12.1 7.1 7.4 3.3 4.6 3.5 3.7 5.7 3.1 23.5
523-2 7.3 2 6.4 8.1 7.5 2.9 4.7 5.3 24.9 14.1 4.1 5.2 2.5
523-3 7.3 2 6.4 8.1 7.6 2.9 4.7 5.3 24.3 14.3 4.3 5.2 2.9
524-5 7.5 1.4 5.5 5.3 6.6 1.9 1.6 9 3.9 14.8 6 15.4 19
516-6 7.7 5.9 8.9 7.3 7.9 9.8 4.9 8.3 4.7 8.2 8.4 7.1 11.3
524-11 8.2 1.7 5.2 11.2 4.8 4.5 1.4 6.7 10.6 13 4.3 9.6 25.6
524-12 8.3 1.8 5.1 11.7 5 4.8 1.4 6.7 11 12.8 4.1 9.4 26
525-2 8.3 4.3 7.2 9.2 1.9 4.3 1 7.1 30.8 15.5 3.8 5.8 9.3
524-8 8.4 3.7 1.3 6 6.6 2.6 10.8 6.5 31 4.2 5.4 2.9 20.2
525-4 8.6 3.9 6.1 2.6 5.5 2.6 6.6 9.2 27.8 12.8 6.3 16.3 4.1
524-10 8.9 4.5 5.9 11.5 4.3 4.3 3 4.6 30 12.2 4.1 14.6 8
524-13 9.3 5.8 4 10.3 4.3 7.3 3.3 4.4 20.8 4.8 5.4 16.3 25.6
524-4 10.3 3.8 5.8 12.3 7.6 4.7 10.7 3.4 27.6 3.7 5.4 15.2 24.5
Mean 6.5 3.3 5.3 7.1 6.2 5.3 4.7 5.5 13.6 9.1 5.7 7.8 12.4
Fixed heterozygosity
Chromosome 6
516-1 519-11
522-4 523-2
523-10 524-4 424-6 525-2 525-4 525-7
chc42-5 chc39-6 chc40-3
ver4
• 34 heterozygous SNPs were observed in all selfed lines
Putative function PM-chr
Cytochrome P450 chr4
Disease resistance protein chr5
NBS-coding resistance gene protein chr5, chr9
NBS-LRR resistance protein chr1, chr7
TIR-NBS disease resistance chr9
TIR-NBS-LRR disease resistance chr11
Glycine-rich protein chr9
Late embryogenesis abundant protein (LEA) chr1
Thaumatin chr12
Receptor-like kinase chr7
Malate dehydrogenase chr9
Alcohol dehydrogenase chr5
H(+)-transporting ATPase chr6
Integral membrane protein chr9
Lipoxygenase chr1
MtN3 chr2
NAD-malate dehydrogenase chr9
Nodulin chr10
Peroxidase chr5
Protein kinase atmrk1 chr6
RNA dependent RNA polymerase chr5
Signal recognition particle receptor beta subunit chr3
Transcription repressor chr7
Conserved gene of unknown function chr6, chr7, chr10
Gene of unknown function chr2, chr3, chr10
UDP-galactose:solanidine galactosyltransferase chr7
Pathogenesis, environmental stress, and defense response related proteins
[Figure: SNP Frequency Distribution, Chromosome 4, S7 and S6 lines]
[Figure: SNP Frequency Distribution, Chromosome 8, S7 and S6 lines]
Future studies
• Can we detect where selection has occurred in the genome?
• What genes might be under selection? – Limitation: insufficient recombination to identify candidate genes
• How can we apply these tools and this information to crop improvement?
• Can sub-population data based on inbred lines predict hybrid performance?
• Mapping in Elite x Elite populations
– recombination is limiting
• increase the number of SNPs for mapping
• Study genomes of wild relatives
• Genome wide selection
– Limitation to GWS is establishing appropriate trait models.
Genotyping strategies to consider (balancing information, cost, time)
• Genotyping by sequencing – reduced representation ($50/sample)
• Genotyping using the Infinium array ($100/sample)
• Optimized pools of 384 SNPs for community mapping projects
– (BeadXpress and Kbio platforms), e.g. tomato: http://www.extension.org/pages/61007/
• Process:
– Select SNPs based on Polymorphic information content (PIC) in target germplasm pools
– Select SNPs based on genetic map position
– Fill-in based on physical position
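The first selection step can be sketched with the standard PIC formula, PIC = 1 - sum(p_i^2) - sum over allele pairs of 2 p_i^2 p_j^2. The helper names and the example frequencies below are hypothetical; allele frequencies would come from the target germplasm pool.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphic information content from a list of allele frequencies."""
    homozygosity = sum(p * p for p in freqs)
    correction = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - correction


def rank_snps(allele_freqs, n):
    """Return the n SNP names with the highest PIC.

    allele_freqs: dict mapping SNP name -> list of allele frequencies.
    """
    return sorted(allele_freqs, key=lambda s: pic(allele_freqs[s]), reverse=True)[:n]


# Hypothetical frequencies for three biallelic SNPs
pool = {"snpA": [0.5, 0.5], "snpB": [0.9, 0.1], "snpC": [0.7, 0.3]}
print(rank_snps(pool, 2))
```

For a biallelic SNP the PIC peaks at 0.375 when both alleles are equally frequent, so this ranking favours markers with balanced allele frequencies in the pool.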
Summary
• DR products were identified in Simplex x Nulliplex crosses. Other crosses will also allow us to study DR.
• DR is observed on all chromosomes and all arms.
• DR frequency is greater further from the centromere.
Discussion
• The distribution of residual heterozygosity in S. chacoense S6 and S7 lines is genome wide
• Only 34 of the >1000 heterozygous SNPs were heterozygous across all lines tested
• 40 % of these SNPs are from genes related to pathogenesis, environmental stress, and defense response mechanisms
• The residual heterozygosity may be due to selfing and other factors such as selection, recombination, and mutation
• The S. chacoense S7 lines are a resource for future genetic studies
Summary
• SolCAP has developed a genome-wide set of SNP markers that can be used by the breeding and genetics community
• The Infinium SNPs allow for dosage calls in heterozygous tetraploid potatoes.
• Five cluster calling of SNPs in Genome Studio adds power to marker analysis
• There has been both phenotypic and genotypic divergence between market classes
• Identification of genes associated with traits of interest and the use of marker assisted selection will allow for phenotypic improvement to proceed at a more rapid pace
Summary
• We have the tools in place to start to identify these associations with the diversity panel
– Significant genotypic variation for traits of interest
– Genetic variation in the population underlying the phenotypic variation
• QTL mapping of economically important traits is initiated in tetraploid populations with simplex, duplex and triplex SNPs
• Potato QTL mapping is more feasible with a genome wide set of SNPs!
• Opportunities to improve our understanding of the potato genome and develop new breeding strategies are numerous in the genomics era!
Acknowledgments
Collaborators, OSU
Heather Merk
Sung-Chur Sim
Matt Robbins
Troy Aldrich
Collaborators, MSU
C Robin Buell
John Hamilton
Dan Zarka
Kelly Zarka
Collaborators, VTU
Richard Veilleux
Industry Collaborators Cindy Lawley, Illumina
Martin Ganal, TraitGenetics
Funding USDA/AFRI
This project is supported by the Agriculture and Food Research Initiative of USDA’s
National Institute of Food and Agriculture.
Collaborators, Cornell
Walter de Jong
Lucas Mueller
Joyce van Eck
Naama Menda
Collaborators, UCD
Allen Van Deynze
Kevin Stoffel
Collaborators, UWM
Paul Bethke
Shelley Jansky