Date post: | 06-Jul-2015 |
Category: |
Education |
Upload: | mauricio-parra-quijano |
Mauricio Parra Quijano, FAO consultant, International Treaty on Plant Genetic Resources for Food and Agriculture. CAPFITOGEN Program Coordinator
Tools
ColNucleo
Obtaining ecogeographical core collections based on ELC maps
Again about genetic representativeness
[Figure: three genotypes, A (accggtccc), B (accggtcgc) and C (accggtctc), and a collection made up of accessions of these genotypes]
When collections are very large (>1000)…
[Figure: subsets selected from a large collection of A, B and C accessions in three ways: Random, By genotype, By phenotype]
But not real
What information should we use to select?
Characterization
Morphological
Biochemical/Molecular
Agronomic / Physiological / Phytopathological / Entomological
Types of core collections according to data
Random
Political / Administrative
Phenotypic (morphological)
Phenotypic (quantitative traits of agronomic interest)
Genotypic (molecular markers - neutral)
Ecogeographical (adaptation to the abiotic environment)
Mixed / Cumulative
Ecogeographical core collections
The first ideas about building core collections (CC) from adaptation data date back to 1995
Only in 2000-2010 did the use of GIS become popular in plant genetic resources (PGR)
In 2005 the first ELC map was created
In 2009, two eco-geographical core collections were obtained and validated
Ecogeographical core collections
Determination of representativeness
Mean Variance Matching Ranges Coefficient of variation
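A minimal sketch of how such measures could be compared between the whole collection and a core subset, for one quantitative variable. The retention ratio (core value divided by whole-collection value) and the function names are assumptions for illustration, not ColNucleo's own procedure.

```python
from math import sqrt
from statistics import mean, variance


def summary(values):
    """Mean, sample variance, range and coefficient of variation of one variable."""
    m = mean(values)
    v = variance(values)
    return {
        "mean": m,
        "variance": v,
        "range": max(values) - min(values),
        "cv": sqrt(v) / m,  # assumes a non-zero mean
    }


def retention(whole, core):
    """Ratio core/whole for each measure; values near 1 mean the measure is preserved.

    Assumes every whole-collection measure is non-zero.
    """
    w, c = summary(whole), summary(core)
    return {name: c[name] / w[name] for name in w}


# Hypothetical data: one variable measured on 10 accessions, 4 of them in the core.
whole = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
core = [1, 4, 7, 10]
ratios = retention(whole, core)
```

Here the core keeps the mean and the full range of the whole collection, while its variance is larger because the extremes are over-represented.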
Ecogeographical CC vs Phenotypic CC
Determination of representativeness
What does ColNucleo offer?
Starting with an ELC map (from ELC mapas tool)
Sampling intensity: 10%, 15%, 20%…
[Figure: 1000 accessions sampled down to 100]
What does ColNucleo offer?
Seed availability?
Ecogeographical core collection
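A minimal sketch of the idea on these slides: sample each ELC category at the chosen intensity, skipping accessions without available seed. Proportional allocation per category, random choice within a category, and the keys `id`, `elc` and `available` are assumptions for illustration; ColNucleo's actual allocation rules may differ.

```python
import random
from collections import defaultdict


def ecogeographic_core(accessions, intensity, seed=0):
    """Pick a fraction `intensity` of the available accessions in each ELC category.

    `accessions` is a list of dicts with the hypothetical keys
    'id', 'elc' (ELC map category) and 'available' (seed availability flag).
    """
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for acc in accessions:
        if acc["available"]:  # accessions without seed cannot enter the core
            by_category[acc["elc"]].append(acc["id"])

    core = []
    for category in sorted(by_category):
        ids = by_category[category]
        # At least one accession per category, so every category is represented.
        k = max(1, round(len(ids) * intensity))
        core.extend(rng.sample(ids, k))
    return core


# Hypothetical collection: 100 available accessions in three ELC categories,
# plus one unavailable accession in a fourth category.
collection = [
    {"id": i, "elc": 1 if i < 50 else 2 if i < 80 else 3, "available": True}
    for i in range(100)
]
collection.append({"id": 100, "elc": 4, "available": False})
core = ecogeographic_core(collection, intensity=0.10)
```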
In addition…
Phenotypic/genotypic validation is advisable
A further stepwise strategy can be applied by selecting on other types of variables (descriptors)
Selecting by pheno/genotypic representativeness, not randomly
One or more core collections?
FIGS_R
Determination of subsets focused on traits of interest for breeders (Focused Identification of Germplasm Strategy)
Why is it so difficult to use germplasm?
Poor visibility of the germplasm collections
Lack of information on the preserved material
The available information is not very useful in practice
Limited accessibility to information
Inaccessibility to germplasm
Limited interest of breeders to use germplasm collections
Conflict of interests…
Curators: representativeness. Breeders: traits.
The paradox of the use of PGR
Breeders frequently find collections of 1000 entries or more
They have limited capacity to test them
Breeders use 100 or 150 entries at the most to evaluate a trait of particular interest, as part of their routine activity
Breeders need information (characterization / evaluation data) on the preserved germplasm to make use of it.
PGR curators prioritize efforts to preserve and, only when enough funds are available, to characterize
There are very few evaluation data (or at least few that are available), which consequently leads to almost random selections by breeders…
Funds to characterize and evaluate the germplasm are always scarce or insufficient
Low level of use, reduced interest
Gradual reduction of funds for characterizing/evaluating
Focused Identification of Germplasm Strategy
Original idea from Michael Mackay (1986, 1990, 1995)
Phenotype = Genotype + Environment + (GxE)
Identifies germplasm with a high probability of containing genetic diversity for the trait of interest
Uses ecogeographical information to predict trait occurrence as a preliminary step to field trials, where breeders ultimately confirm the existence of the trait
No previous efforts on characterization/field evaluation are required and the number of entries that are delivered to the breeders to be evaluated is reduced
Resistance/Tolerance = Genotype + Environment + (GxE)
Generating FIGS subcollections (≠ core collections)
Enhancing the
First approach…
Temperature
Salinity score
Elevation
Rainfall
Agro-climatic zone
Disease distribution
FOCUSED IDENTIFICATION OF GERMPLASM STRATEGY (FIGS)
Data layers sieve accessions based on latitude & longitude
Source: Figure from Mackay (1995)
GIS layers / Ecogeographical variables
Germplasm
FILTERED!!!
We use expert knowledge: species experts, breeders, entomologists, phytopathologists
Second approach… modeling
Classification method                   AUC    Kappa   Field validation
Principal Component Regression (PCR)    0.69   0.40    ?
Partial Least Squares (PLS)             0.69   0.41    ?
Random Forest (RF)                      0.70   0.42    ?
Support Vector Machines (SVM)           0.71   0.44    ?
Artificial Neural Networks (ANN)        0.71   0.44    ?
Y = b + X1 + X2 + X3
Y: Resistance/Tolerance; X1, X2, X3: ecogeographical variables
Genebank: ICARDA wheat collection. Trait: stem rust (Puccinia graminis)
Source: Bari et al., 2012. Focused identification of germplasm strategy (FIGS) detects wheat stem rust resistance linked to environmental variables. Genet Resour Crop Evol 59(7):1465-1481
Evaluated/characterized germplasm reveals a pattern, which is used to predict on non-evaluated/non-characterized germplasm
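For readers unfamiliar with the AUC and Kappa columns in the table above, here is a minimal sketch of how both metrics are computed for a binary trait (1 = resistant, 0 = susceptible). The data are invented for illustration and are not taken from Bari et al.; the function names are hypothetical.

```python
def auc(y_true, scores):
    """Probability that a random positive scores higher than a random negative.

    Ties count as half a win. Assumes at least one positive and one negative.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(y_true, y_pred):
    """Agreement between observed and predicted classes, corrected for chance.

    Assumes chance agreement is below 1 (more than one class is present).
    """
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    labels = set(y_true) | set(y_pred)
    expected = sum((y_true.count(l) / n) * (y_pred.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)


# Invented example: four evaluated accessions and a model's predicted scores.
observed_trait = [1, 1, 0, 0]
predicted_score = [0.9, 0.4, 0.6, 0.1]
predicted_class = [1 if s >= 0.5 else 0 for s in predicted_score]

example_auc = auc(observed_trait, predicted_score)
example_kappa = cohen_kappa(observed_trait, predicted_class)
```

An AUC of 0.5 and a Kappa of 0 correspond to chance-level prediction; both reach 1 for a perfect classifier.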
What does FIGS_R offer?
It generates FIGS subsets via filtering
Ecogeographical characterization matrix
Passport data table
Elevation
Average Annual Temperature
Edaphic Organic Carbon
Topsoil pH
….….
Y
X
ECOGEO
FIGS_R characterizes the collection ecogeographically using the selected variables
What does FIGS_R offer?
It uses up to three ecogeographical variables and performs a stepwise selection
Annual Precipitation (primary variable)
Edaphic clay (secondary variable)
Slope (tertiary variable)
40
4
Selection intensity
What does FIGS_R offer?
It selects entries from a range of values for each variable or a proportion of the distribution of values (e.g. lower 30%), in separate processes for each variable.
PROPORTION OF THE DISTRIBUTION: 40% lower, 35% higher
RANGE: lower value, upper value
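A minimal sketch of the two selection modes named above (a range of values, or a proportion of the distribution), chained over three variables. It assumes that "stepwise" means each step works on the accessions kept by the previous one; the function names, record keys and data are hypothetical and do not reproduce FIGS_R's parameters.

```python
def select_by_range(records, variable, lower, upper):
    """Keep accessions whose value lies between lower and upper (inclusive)."""
    return [r for r in records if lower <= r[variable] <= upper]


def select_by_proportion(records, variable, fraction, tail="lower"):
    """Keep the lowest (or highest) `fraction` of the distribution of `variable`."""
    ranked = sorted(records, key=lambda r: r[variable], reverse=(tail == "higher"))
    return ranked[: round(len(ranked) * fraction)]


# Invented ecogeographical characterization of 10 accessions.
records = [
    {"id": i, "precip": 100 * i, "clay": 10 + i, "slope": i % 3}
    for i in range(1, 11)
]

# Primary variable: lower 40% of annual precipitation.
step1 = select_by_proportion(records, "precip", 0.40, tail="lower")
# Secondary variable: edaphic clay within a range of values.
step2 = select_by_range(step1, "clay", 12, 13)
# Tertiary variable: lower 50% of slope.
step3 = select_by_proportion(step2, "slope", 0.50, tail="lower")

subset_ids = [r["id"] for r in step3]
```

Each step narrows the subset, so the order of the variables (primary, secondary, tertiary) affects which accessions survive.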
What does FIGS_R offer?
At the user's option, it can use an ELC map to balance the selection of accessions, taking the chosen fraction of the distribution from each ELC category
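A minimal sketch of what balancing by ELC category could look like: the chosen fraction is taken within each category rather than from the collection as a whole. The function name, record keys and data are hypothetical, and keeping at least one accession per category is an assumption.

```python
from collections import defaultdict


def balanced_selection(records, variable, fraction, tail="lower", category="elc"):
    """Take the lowest (or highest) `fraction` of `variable` within each category."""
    groups = defaultdict(list)
    for r in records:
        groups[r[category]].append(r)

    selected = []
    for cat in sorted(groups):
        ranked = sorted(groups[cat], key=lambda r: r[variable],
                        reverse=(tail == "higher"))
        k = max(1, round(len(ranked) * fraction))  # assumption: never skip a category
        selected.extend(ranked[:k])
    return selected


# Invented data: a dry category (1) and a wet category (2), five accessions each.
records = (
    [{"id": i, "elc": 1, "precip": 100 * i} for i in range(1, 6)]
    + [{"id": 10 + i, "elc": 2, "precip": 1000 + 100 * i} for i in range(1, 6)]
)
balanced = balanced_selection(records, "precip", 0.40, tail="lower")
balanced_ids = [r["id"] for r in balanced]
```

Without balancing, the lower 40% of precipitation would come entirely from the dry category; with balancing, both categories contribute two accessions.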
What does FIGS_R offer?
Like ColNucleo, it can take into account the availability of the germplasm indicated by the curator.
One or more FIGS subsets?