Early generation selection in a recurrent selection breeding program within a synthetic population

– Using genome-wide markers to speed up the process

Seminar on genomic selection 17/10/2014 Tuong-Vi Cao, UMR AGAP, CIRAD-BIOS

Genomic selection, based on genome-wide genotype-phenotype relations, is a promising approach for breeding:

1. to access more selection candidates (higher intensity of selection) and

2. to reduce the duration of selection cycles (maximize genetic gain/unit time)

This is all the more attractive as molecular information is becoming more accessible while phenotypic information is becoming the limiting factor in terms of resource allocation.

The upland rice breeding program of CIAT initiated this approach, and first results (based on cross-validation within the calibration population data) showed that such an approach is feasible but that its overall accuracy is rather low. Some reasons have already been pointed out (evaluation in only one year × location, only additive effects modelled).

My present contribution is about:

1. the way the phenotypic predictor may be defined and modelled to take dominance and epistatic interactions into account, and

2. the way markers can be integrated to further reduce the duration of selection cycles.

What has been done and what is the question?

[Figure: breeding scheme leading from the synthetic populations to the calibration population. Labels recoverable from the diagram:]

• EELL 2008: four synthetic populations segregating for the ms gene (PCT-4A, PCT-4B, PCT-4C, PCT-11), each ½ [ms ms] (MS) + ½ [ms Ms] (MF).

• Extraction of 100 S0:1 progenies per population on MF plants.

• EEP 2010: S0:1 progenies segregating ¼ [Ms Ms] + ½ [Ms ms] + ¼ [ms ms] (plants pl1 to pl4 in the diagram).

• EEP 2010: seed increase through SSD, giving S1:2 progenies.

• DNA extraction of 8 S1:2 plants and genotyping for the ms locus (label "EEP 2011 A" in the diagram): 392 S1:2 progenies segregating for the ms gene (¼ [Ms Ms], ½ [Ms ms], ¼ [ms ms]).

• Choice of one [Ms Ms] plant per S1:2 progeny to constitute the calibration population.

• Bulk seed increase to S2:3: DNA extraction of 15 S2:3 plants per progeny and GBS genotyping to infer the genotype of S2 plants.

• Bulk seed increase to S2:4: phenotyping of S2:4 progenies to calibrate the model.


• The S2 population is an option as the base population structure for calibration because partially fixed material:
– is more homogeneous and easier to phenotype (minimum within-progeny variation and maximum between-progeny variation)
– minimizes the bias due to dominance effects.

• However, it is time- and resource-consuming:
– to produce the material used to calibrate the prediction model (S2 population to be sampled, S2:3 bulks to be genotyped, S2:4 progenies to be phenotyped)
– to advance the breeding material to the S2 generation before it can be predicted in each cycle.

• Hence, is it possible to save time and resources through:
– early phenotyping for calibrating the model?
– early prediction of breeding candidates?

Genetic model

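• For simplicity, let us suppose two biallelic loci, M and N.

• Let $\dfrac{M_i N_k}{M_j N_l}$ be a genotype in the S0 generation.

• The genotypic value is

$$
\begin{aligned}
{}^{S_0}G_{ijkl} ={}& A_i + A_j + A_k + A_l + D_{ij} + D_{kl} \\
&+ AA_{ik} + AA_{jl} + AA_{il} + AA_{jk} \\
&+ AD_{ijk} + AD_{ijl} + AD_{ikl} + AD_{jkl} + DD_{ijkl}
\end{aligned}
$$

where:

– $A_i, A_j, A_k, A_l$: additive effects associated with alleles i or j of the M locus and alleles k or l of the N locus

– $D_{ij}, D_{kl}$: dominance effects associated with the M and N loci respectively

– $AA_{ik}, AA_{jl}, AA_{il}, AA_{jk}$: additive*additive epistasis associated with one allele of the M locus and one allele of the N locus

– $AD_{ijk}, AD_{ijl}, AD_{ikl}, AD_{jkl}$: additive*dominance epistasis associated with the 2 alleles of one locus and 1 allele of the other locus

– $DD_{ijkl}$: dominance*dominance epistasis associated with all alleles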

Genetic model

• At meiosis, the genotype produces four gametes with frequencies depending on the recombination rate r,

• If selfed, the genotype produces ten genotypes in the S1 generation …

Gametes and their respective frequencies:

Gamete     Frequency
M_i N_k    (1-r)/2
M_i N_l    r/2
M_j N_k    r/2
M_j N_l    (1-r)/2

Genotypic value / genotype obtained for each pair of gametes (row × column):

                      M_i N_k    M_i N_l    M_j N_k    M_j N_l
                      (1-r)/2    r/2        r/2        (1-r)/2
M_i N_k   (1-r)/2     Giikk      Giikl      Gijkk      Gijkl
M_i N_l   r/2         Giikl      Giill      Gijkl      Gijll
M_j N_k   r/2         Gijkk      Gijkl      Gjjkk      Gjjkl
M_j N_l   (1-r)/2     Gijkl      Gijll      Gjjkl      Gjjll

Genetic model

• With respective frequencies shown below:

Genotype                   Frequency      Class
Giikk                      ¼ (1-r)²       Non-recombinant double homozygote
Gjjll                      ¼ (1-r)²       Non-recombinant double homozygote
Giill                      ¼ r²           Recombinant double homozygote
Gjjkk                      ¼ r²           Recombinant double homozygote
Gijkl (M_iN_k / M_jN_l)    ½ (1-r)²       Non-recombinant double heterozygote
Gijkl (M_iN_l / M_jN_k)    ½ r²           Recombinant double heterozygote
Giikl                      ½ r (1-r)      Partially recombinant (homozygote for one locus, heterozygote for the other)
Gijkk                      ½ r (1-r)      Partially recombinant
Gijll                      ½ r (1-r)      Partially recombinant
Gjjkl                      ½ r (1-r)      Partially recombinant

• The frequencies form a vector, V1, associated with the S1 generation:

$$
V_1 = \left( \tfrac14(1-r)^2,\ \tfrac14(1-r)^2,\ \tfrac14 r^2,\ \tfrac14 r^2,\ \tfrac12(1-r)^2,\ \tfrac12 r^2,\ \tfrac12 r(1-r),\ \tfrac12 r(1-r),\ \tfrac12 r(1-r),\ \tfrac12 r(1-r) \right)^{\mathsf T}
$$

If V2 is the vector of frequencies of the S2 generation, then one can find the relationship between V1 and V2 …
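The S1 frequencies can be checked by enumerating the 4 × 4 pairs of gametes. The sketch below does this with exact fractions; the function names and the "(non-recombinant)" / "(recombinant)" labels are illustrative choices, not part of the original slides.

```python
from collections import defaultdict
from fractions import Fraction


def gamete_freqs(r):
    """Gametes of the S0 plant M_iN_k / M_jN_l; r is the recombination rate."""
    half = Fraction(1, 2)
    return {
        ("i", "k"): half * (1 - r),  # parental gamete
        ("i", "l"): half * r,        # recombinant gamete
        ("j", "k"): half * r,        # recombinant gamete
        ("j", "l"): half * (1 - r),  # parental gamete
    }


def genotype_label(g1, g2):
    """Name the S1 genotype formed by two gametes (hypothetical labelling)."""
    m = "".join(sorted(g1[0] + g2[0]))
    n = "".join(sorted(g1[1] + g2[1]))
    label = "G" + m + n
    if m == "ij" and n == "kl":  # double heterozygote: keep track of the phase
        parental = {g1, g2} == {("i", "k"), ("j", "l")}
        label += " (non-recombinant)" if parental else " (recombinant)"
    return label


def s1_freqs(r):
    """Frequencies of the ten S1 genotypes obtained by selfing the S0 plant."""
    gam = gamete_freqs(r)
    freqs = defaultdict(Fraction)
    for g1, f1 in gam.items():
        for g2, f2 in gam.items():
            freqs[genotype_label(g1, g2)] += f1 * f2
    return dict(freqs)


if __name__ == "__main__":
    for genotype, freq in sorted(s1_freqs(Fraction(1, 10)).items()):
        print(f"{genotype:28s} {freq}")
```

Working with `Fraction` keeps the frequencies exact, so their sum can be compared with 1 without rounding issues.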

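Genetic components of generation means

• This relation is V2 = M*V1

• It holds for any couple of successive generations (Vn+1=M*Vn).

• M matrix is used to estimate genotypic values and genetic covariances between successive generations.

With the genotypes ordered as in V1 (Giikk, Gjjll, Giill, Gjjkk, non-recombinant Gijkl, recombinant Gijkl, Giikl, Gijkk, Gijll, Gjjkl):

$$
M = \begin{pmatrix}
1 & 0 & 0 & 0 & \tfrac14(1-r)^2 & \tfrac14 r^2 & \tfrac14 & \tfrac14 & 0 & 0 \\
0 & 1 & 0 & 0 & \tfrac14(1-r)^2 & \tfrac14 r^2 & 0 & 0 & \tfrac14 & \tfrac14 \\
0 & 0 & 1 & 0 & \tfrac14 r^2 & \tfrac14(1-r)^2 & \tfrac14 & 0 & \tfrac14 & 0 \\
0 & 0 & 0 & 1 & \tfrac14 r^2 & \tfrac14(1-r)^2 & 0 & \tfrac14 & 0 & \tfrac14 \\
0 & 0 & 0 & 0 & \tfrac12(1-r)^2 & \tfrac12 r^2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \tfrac12 r^2 & \tfrac12(1-r)^2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \tfrac12 r(1-r) & \tfrac12 r(1-r) & \tfrac12 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \tfrac12 r(1-r) & \tfrac12 r(1-r) & 0 & \tfrac12 & 0 & 0 \\
0 & 0 & 0 & 0 & \tfrac12 r(1-r) & \tfrac12 r(1-r) & 0 & 0 & \tfrac12 & 0 \\
0 & 0 & 0 & 0 & \tfrac12 r(1-r) & \tfrac12 r(1-r) & 0 & 0 & 0 & \tfrac12
\end{pmatrix}
$$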
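The recursion Vn+1 = M*Vn can be iterated numerically. The sketch below is an assumption-laden illustration: the matrix entries are written out from the selfing rules of each of the ten genotypes (to be checked against the original slide), and the names `ORDER`, `selfing_matrix` and `generation` are hypothetical.

```python
from fractions import Fraction as F

# Genotype order assumed to follow V1; "_c" / "_r" mark the non-recombinant
# and recombinant double heterozygotes (hypothetical labels).
ORDER = ["iikk", "jjll", "iill", "jjkk", "ijkl_c", "ijkl_r",
         "iikl", "ijkk", "ijll", "jjkl"]


def selfing_matrix(r):
    """Column j holds the offspring frequencies obtained by selfing genotype j."""
    a, b, c = (1 - r) ** 2, r ** 2, r * (1 - r)
    q, h = F(1, 4), F(1, 2)
    return [
        [1, 0, 0, 0, q * a, q * b, q, q, 0, 0],
        [0, 1, 0, 0, q * a, q * b, 0, 0, q, q],
        [0, 0, 1, 0, q * b, q * a, q, 0, q, 0],
        [0, 0, 0, 1, q * b, q * a, 0, q, 0, q],
        [0, 0, 0, 0, h * a, h * b, 0, 0, 0, 0],
        [0, 0, 0, 0, h * b, h * a, 0, 0, 0, 0],
        [0, 0, 0, 0, h * c, h * c, h, 0, 0, 0],
        [0, 0, 0, 0, h * c, h * c, 0, h, 0, 0],
        [0, 0, 0, 0, h * c, h * c, 0, 0, h, 0],
        [0, 0, 0, 0, h * c, h * c, 0, 0, 0, h],
    ]


def step(matrix, vector):
    """One generation of selfing: returns matrix * vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def generation(r, n):
    """Genotype frequencies after n selfings of the S0 plant M_iN_k / M_jN_l."""
    v = [F(0)] * len(ORDER)
    v[ORDER.index("ijkl_c")] = F(1)  # the S0 plant itself
    matrix = selfing_matrix(r)
    for _ in range(n):
        v = step(matrix, v)
    return dict(zip(ORDER, v))


if __name__ == "__main__":
    for n in (1, 2, 10):
        freqs = generation(F(1, 10), n)
        print(n, {g: round(float(x), 4) for g, x in freqs.items()})
```

Each column of the matrix sums to 1, and applying it once to the S0 plant returns the S1 vector, which gives two quick consistency checks on the entries.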

Ongoing questions:
• Is it possible to relate the frequencies of any generation (including RILs) directly to those of the first generation (i.e. S0 plant or F1 cross)?
• If yes, is it also possible to relate any generation mean and genetic covariance to those of the unselfed S0 plant or F1 cross?

Genetic components of generation means

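• If successive generations are allowed to segregate and recombine until complete fixation (i.e. neither selection nor drift), the expected mean value of the RILs derived from the genotype $\dfrac{M_i N_k}{M_j N_l}$ will be (written here for r = ½):

$$
\begin{aligned}
{}^{S_\infty}G_{ijkl} ={}& A_i + A_j + A_k + A_l + \tfrac12\,(D_{ii} + D_{jj} + D_{kk} + D_{ll}) \\
&+ AA_{ik} + AA_{jl} + AA_{il} + AA_{jk} \\
&+ \tfrac12\,(AD_{iik} + AD_{ikk} + AD_{jjl} + AD_{jll} + AD_{iil} + AD_{ill} + AD_{jjk} + AD_{jkk}) \\
&+ \tfrac14\,(DD_{iikk} + DD_{jjll} + DD_{iill} + DD_{jjkk})
\end{aligned}
$$

• Thus if r = ½ (for simplicity), the genotypic mean value of the S1 progeny of an S0 plant/cross is:

$$
\begin{aligned}
{}^{S_1}G_{ijkl} ={}& A_i + A_j + A_k + A_l \\
&+ \tfrac12\,(D_{ij} + D_{kl}) + \tfrac14\,(D_{ii} + D_{jj} + D_{kk} + D_{ll}) \\
&+ AA_{ik} + AA_{jl} + AA_{il} + AA_{jk} \\
&+ \tfrac12\,(AD_{ijk} + AD_{ijl} + AD_{ikl} + AD_{jkl}) \\
&+ \tfrac14\,(AD_{iik} + AD_{ikk} + AD_{jjl} + AD_{jll} + AD_{iil} + AD_{ill} + AD_{jjk} + AD_{jkk}) \\
&+ \tfrac14\,DD_{ijkl} + \tfrac18\,(DD_{iikl} + DD_{ijkk} + DD_{ijll} + DD_{jjkl}) \\
&+ \tfrac1{16}\,(DD_{iikk} + DD_{jjll} + DD_{iill} + DD_{jjkk})
\end{aligned}
$$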

Line value concept : definition and prediction

• Line value (LV) is the mean value of all RILs that a plant or a cross can produce through successive selfings (or haplo-diploidisation).

• LV may be predicted from any pair of successive generations:

$$
2 \cdot {}^{S_{n+1}}G_{ijkl} - {}^{S_n}G_{ijkl}
$$

• If an F1 and its selfed F2 are both phenotyped, then [2*GF2-GF1] predicts the mean value of the RILs derivable from the cross. The genetic components may be written as follows (r = ½):

$$
\begin{aligned}
2\,G_{F_2} - G_{F_1} ={}& A_i + A_j + A_k + A_l + \tfrac12\,(D_{ii} + D_{jj} + D_{kk} + D_{ll}) \\
&+ AA_{ik} + AA_{jl} + AA_{il} + AA_{jk} \\
&+ \tfrac12\,(AD_{iik} + AD_{ikk} + AD_{jjl} + AD_{jll} + AD_{iil} + AD_{ill} + AD_{jjk} + AD_{jkk}) \\
&+ \tfrac18\,(DD_{iikk} + DD_{jjll} + DD_{iill} + DD_{jjkk}) \\
&+ \tfrac14\,(DD_{iikl} + DD_{ijkk} + DD_{ijll} + DD_{jjkl}) - \tfrac12\,DD_{ijkl}
\end{aligned}
$$

• This predictor equals the expected LV (S∞Gijkl) except for the DD terms.

Line value concept : definition and prediction

• The difference in DD terms between the expected line value (S∞Gijkl) and its prediction (2*GF2-GF1):

– The prediction includes the quantity $DD = -\tfrac12\,DD_{ijkl} + \tfrac14\,(DD_{iikl} + DD_{ijkk} + DD_{ijll} + DD_{jjkl})$, which is associated with heterozygote structures.

– While the line value includes the additional quantity $DD' = \tfrac18\,(DD_{iikk} + DD_{jjll} + DD_{iill} + DD_{jjkk})$, associated with homozygote structures (the other ⅛ share of these terms is common to the line value and its prediction).

This means that if DD=DD'=0, then the prediction of LV obtained from early generations will be exactly equal to the expected LV (S∞Gijkl).
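The role of the DD terms can be checked numerically. The sketch below assumes r = ½ (unlinked loci, so the two loci are treated as segregating independently) and draws arbitrary, made-up effect values; all function names are hypothetical. It compares the line value with the predictor built from two successive generations.

```python
import random

M_PAIRS, N_PAIRS = ["ii", "ij", "jj"], ["kk", "kl", "ll"]


def random_effects(seed=0, dd_scale=1.0):
    """Arbitrary (made-up) effect values; dd_scale=0 switches DD epistasis off."""
    rng = random.Random(seed)

    def draw(keys, scale=1.0):
        return {key: scale * rng.gauss(0, 1) for key in keys}

    return {
        "A": draw("ijkl"),
        "D": draw(M_PAIRS + N_PAIRS),
        "AA": draw([x + y for x in "ij" for y in "kl"]),
        "AD": draw([mm + y for mm in M_PAIRS for y in "kl"]
                   + [x + nn for x in "ij" for nn in N_PAIRS]),
        "DD": draw([mm + nn for mm in M_PAIRS for nn in N_PAIRS], dd_scale),
    }


def genotypic_value(mm, nn, eff):
    """Genotypic value of genotype mm (M locus) / nn (N locus), e.g. 'ij', 'kl'."""
    return (sum(eff["A"][x] for x in mm + nn)
            + eff["D"][mm] + eff["D"][nn]
            + sum(eff["AA"][x + y] for x in mm for y in nn)
            + sum(eff["AD"][mm + y] for y in nn)
            + sum(eff["AD"][x + nn] for x in mm)
            + eff["DD"][mm + nn])


def locus_freqs(n, pairs):
    """Single-locus genotype frequencies after n selfings of a heterozygote."""
    het = 0.5 ** n
    hom1, het_key, hom2 = pairs
    return {hom1: (1 - het) / 2, het_key: het, hom2: (1 - het) / 2}


def generation_mean(n, eff):
    """Mean genotypic value of generation Sn (r = 1/2: loci independent)."""
    fm, fn = locus_freqs(n, M_PAIRS), locus_freqs(n, N_PAIRS)
    return sum(pm * pn * genotypic_value(mm, nn, eff)
               for mm, pm in fm.items() for nn, pn in fn.items())


def line_value(eff):
    """Mean of the four double-homozygous RILs, each with frequency 1/4."""
    return sum(genotypic_value(x + x, y + y, eff) for x in "ij" for y in "kl") / 4


def dd_gap_terms(eff):
    """Return (DD, DD'): heterozygote and homozygote DD quantities."""
    dd = eff["DD"]
    het_part = -0.5 * dd["ijkl"] + 0.25 * (dd["iikl"] + dd["ijkk"]
                                           + dd["ijll"] + dd["jjkl"])
    hom_part = 0.125 * (dd["iikk"] + dd["jjll"] + dd["iill"] + dd["jjkk"])
    return het_part, hom_part


if __name__ == "__main__":
    eff = random_effects(seed=1)
    predictor = 2 * generation_mean(1, eff) - generation_mean(0, eff)
    dd, dd_prime = dd_gap_terms(eff)
    print("LV - predictor:", line_value(eff) - predictor)
    print("DD' - DD      :", dd_prime - dd)
```

With `dd_scale=0.0` the two printed quantities both vanish, and the same holds when the predictor is built from S1 and S2 instead of S0 and S1.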

Applying LV concept to RS breeding scheme: advantages & specific aspects

• Efficient and early prediction of the potential of plants or crosses to produce high-performing inbred lines, even for traits with dominance and epistatic interactions.

• In the context of the CIAT rice breeding scheme, individual S0 plants cannot be phenotyped properly, so successive selfed generations can be used to construct the predictor of interest, which is [2 * S2Gijkl - S1Gijkl] or [2 * S3Gijkl - S2Gijkl], depending on the quantity of seed needed for phenotyping (i.e. single-location versus multi-location experimentation).

Applying LV concept to RS breeding scheme: advantages & specific aspects

• Advantages of the LV predictor compared with the S2:4 predictor:
– Gain in the duration of the calibration process (1 or 2 generations)
– Gain in the duration of a selection cycle (prediction of S0:2 progenies instead of S2:4 progenies)
– No bias due to dominance (such as arises with single-generation phenotyping)

• Specific aspects to focus on:
– Bulk multiplication of seeds is mandatory (allelic frequencies must be maintained for the equations to hold)
– The ms locus controlling male sterility is difficult to manage if genotyping for the locus is not available to differentiate S0 plants
– The number of progenies that can be phenotyped is halved for equal resources (as two generations need to be phenotyped)

Further accelerating the process using genome-wide markers

• Line value may be used as the phenotype in a genomic model instead of a single selfed progeny value. The procedure consists of:
– GBS genotyping of S0 plants,
– Phenotyping of S1 and S2 (or S2 and S3) progenies.

• Gain at two levels compared with S2 genotyping and S2:4 phenotyping:
– Calibration takes 2 generations (S1 and S2) or 3 generations (S2 and S3) instead of 4 generations
– Prediction takes place on S0 plants directly, without multiplying until the S2 generation
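As a minimal sketch of this procedure, the line value of each genotyped S0 plant can be computed from its S1 and S2 progeny means and regressed on marker scores. The slides do not name a statistical model, so ridge regression is swapped in here purely for illustration. The marker matrix, progeny means, penalty `lam` and all function names are made-up assumptions.

```python
def line_value_phenotype(s1_mean, s2_mean):
    """LV predictor from two successive selfed progenies: 2 * S2 - S1."""
    return 2.0 * s2_mean - s1_mean


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def ridge_fit(markers, y, lam):
    """Marker effects from (X'X + lam I) beta = X'y on centred data."""
    n, p = len(markers), len(markers[0])
    col_means = [sum(row[j] for row in markers) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[row[j] - col_means[j] for j in range(p)] for row in markers]
    yc = [value - y_mean for value in y]
    xtx = [[sum(row[a] * row[b] for row in xc) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * value for row, value in zip(xc, yc)) for a in range(p)]
    return col_means, y_mean, solve(xtx, xty)


def ridge_predict(model, marker_scores):
    """Predicted line value of a genotyped, unphenotyped S0 candidate."""
    col_means, y_mean, beta = model
    return y_mean + sum(b * (score - mean)
                        for b, score, mean in zip(beta, marker_scores, col_means))


if __name__ == "__main__":
    # Made-up calibration data: marker scores (0/1/2) of six S0 plants and
    # the means of their S1 and S2 progenies.
    markers = [[0, 1, 2], [1, 0, 2], [2, 2, 0], [0, 0, 1], [1, 2, 1], [2, 1, 0]]
    s1_means = [10.8, 11.1, 11.6, 10.9, 10.7, 12.2]
    s2_means = [10.5, 11.4, 11.2, 10.6, 10.4, 11.9]
    lv = [line_value_phenotype(s1, s2) for s1, s2 in zip(s1_means, s2_means)]
    model = ridge_fit(markers, lv, lam=1.0)
    print(ridge_predict(model, [1, 1, 0]))  # a new S0 candidate
```

The penalty `lam` would in practice be tuned or derived from variance components; it is fixed arbitrarily here.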

Further accelerating the process using genome-wide markers

Procedure when genotyping of the ms locus is available:
– Genotyping of S0 plants for the ms locus
– GBS genotyping of only those S0 plants that carry the [Ms Ms] genotype at the ms locus
– Seed increase of [Ms Ms] S0 plants until the S2 or S3 generation
– Phenotyping of S1 + S2 (or S2 + S3)

Conclusion

This procedure optimises the GS scheme in several respects:
• Calibration of the model is based on very early generations.
• Early prediction of the breeding population (S0). This maximizes the genetic gain per unit time.
• Line value predictors are less biased by complex effects, dominance in particular, even though these may be important in early generations.

Thank you !
