Rapidly mapping genes related to soybean seed characters by NGS-based BSA mapping strategy

Yong Guo

Institute of Crop Science,

Chinese Academy of Agricultural Sciences

June 19th, 2018

Outline

Background

Case study I: mapping genes controlling soybean cotyledon color

Case study II: mapping QTLs related to soybean seed weight

Conclusions

Acknowledgements

Soybean is one of the world's most important legume crops. It provides important sources of vegetable oil and plant protein.

1. Background

The duplicated genome restricts gene isolation in soybean

Schmutz et al., Nature, 2010, 463(7278): 178-183.

[Figure: genome duplication events at 13 MYA and 59 MYA]

Stem growth habit: Dt1, Dt2

Flowering and maturity: E1-E4

SCN resistance: Rhg1, Rhg4

Salt tolerance: GmSALT3, GmCHX1

Leaflet shape: Ln

Pod shattering: qPDH1, SHAT1-5

Seed-hardness: GmHs1-1

Paleopolyploid genome

Methods for identifying genes involved in specific traits

(Takeda et al., Nature Reviews Genetics, 2008, 9(6): 444-457)

Classical approaches are low-throughput and time-consuming

Segregating population development

Genome-wide investigation of polymorphic molecular markers

Identification of the most relevant candidate regions

Fine-mapping by increasing marker density in the target region

Development of physical maps

Candidate gene isolation and validation

Jeong et al. Plant Cell 2012;24:4807-4818

Next-generation sequencing has dramatically reduced sequencing costs

Methods for identifying genes involved in specific traits

Lindner et al., Genetics, 2012, 191(4): 1381-1386.

Methods for identifying genes involved in specific traits

Abe et al., Nature biotechnology, 2012, 30(2): 174-178

Mutant resources are limited in soybean

Rice: tens of thousands of mutants available

1536 mutants

Two soybean mutant libraries:

Fast Neutron: Bolon et al., 2011

EMS: Tsuda et al., 2015

High genetic diversity among different soybean accessions

| Approach | Accessions | SNP | SNP missed in Re-seq | Small InDel | Large InDel | CNV-gain | CNV-loss | G. max-specific | G. soja-specific |
|---|---|---|---|---|---|---|---|---|---|
| Resequencing | 1 G. soja + 1 G. max | 250M | ? | 19.6M | ? | ? | ? | 712 | ? |
| Resequencing | 25 G. soja + 30 G. max | 510M | ? | 70M | ? | ? | ? | ? | ? |
| De novo seq | 7 G. soja | 510M | 480M | 85M | 15M | 726 | 1179 | 16 | 338 |

Kim et al., PNAS, 2010; Li et al., BMC Genomics, 2013; Li et al., NB, 2014

The objective of this study

Development of an NGS-based BSA mapping approach in soybean using segregating populations derived from germplasm

Validation of the reliability and efficiency of BSA-seq in fine mapping of genes/QTLs in species with particularly large or complex genomes

Mapping of genes regulating soybean cotyledon color and seed weight using the developed NGS-based mapping method

2. Case study I: Mapping genes controlling cotyledon color

Cotyledon color is an important morphological trait for breeding and germplasm classification

Most cultivated soybeans have yellow cotyledons; only a few have green ones

Qualitative trait with three inheritance patterns: maternal inheritance, two-gene inheritance, and single-gene inheritance

[Figure: yellow and green cotyledons]

A segregating population was developed from two parental lines with distinct cotyledon colors

Development of segregating population

Investigation of seed cotyledon color in 11 different plants

| Population | Total number of seeds | Seeds with yellow cotyledon | Seeds with green cotyledon | Observed ratio | χ2 (15:1) | P-value |
|---|---|---|---|---|---|---|
| 130028-1 | 314 | 295 | 19 | 15.5:1 | 0.0008 | 0.8841503 |
| 130028-3 | 341 | 319 | 22 | 14.5:1 | 0.0018 | 0.8777637 |
| 130028-4 | 247 | 234 | 13 | 18.0:1 | 0.2594 | 0.5217025 |
| 130029-1 | 270 | 251 | 19 | 13.2:1 | 0.1669 | 0.5931627 |
| 130029-2 | 252 | 232 | 20 | 11.6:1 | 0.9524 | 0.2687178 |
| 130030-1 | 248 | 234 | 14 | 16.7:1 | 0.0688 | 0.6939535 |
| 130030-2 | 374 | 347 | 27 | 12.9:1 | 0.4456 | 0.4387141 |
| 130030-3 | 247 | 231 | 16 | 14.4:1 | 0.0003 | 0.8824539 |
| 130030-4 | 258 | 244 | 14 | 17.4:1 | 0.1747 | 0.5846935 |
| 130034-1 | 337 | 316 | 21 | 15.0:1 | 0.0097 | 0.9887781 |
| 130034-2 | 302 | 277 | 25 | 11.1:1 | 1.7881 | 0.1453782 |

The segregation ratio of cotyledon color in F1 fit 15:1 in all 11 plants
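The 15:1 fit in the table can be checked with a chi-square goodness-of-fit test with one degree of freedom. The sketch below is an assumption about how such values can be computed, not the author's script; the function name `chi_square_15_to_1` is illustrative. Judging from the numbers alone, the tabulated χ2 values appear to match the Yates-corrected statistic, while the P-values appear to match the uncorrected one, so the function exposes both.

```python
import math

def chi_square_15_to_1(yellow, green, yates=False):
    """Chi-square goodness-of-fit test of yellow:green counts against 15:1 (df = 1)."""
    total = yellow + green
    expected = (total * 15 / 16, total * 1 / 16)
    correction = 0.5 if yates else 0.0  # Yates continuity correction (optional)
    chi2 = sum(
        max(abs(obs - exp) - correction, 0.0) ** 2 / exp
        for obs, exp in zip((yellow, green), expected)
    )
    # For df = 1 the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Plant 130034-2 from the table: 277 yellow and 25 green seeds.
chi2_yates, _ = chi_square_15_to_1(277, 25, yates=True)
_, p_uncorrected = chi_square_15_to_1(277, 25, yates=False)
print(round(chi2_yates, 4), round(p_uncorrected, 4))
```

A P-value above 0.05 means the observed counts do not differ significantly from the 15:1 expectation, which is the sense in which every plant in the table "fits" the ratio.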

Soybean cotyledon color is controlled by two genes

[Figure: yellow and green cotyledons; progeny classes: all yellow, 3:1, 15:1, all green]

Cotyledon color in this cross was controlled by two genes, and the green cotyledon trait carried by Jiyu102 was recessive.
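The two-gene model can be made concrete by enumerating the genotypes produced by selfing a double heterozygote. This is a minimal sketch assuming two unlinked loci with complete dominance, where only the double recessive is green; the locus labels `A`/`a` and `B`/`b` are placeholders, not the gene names used in the study.

```python
from collections import Counter
from itertools import product

def progeny_class(locus_a, locus_b):
    """Classify a plant by how its selfed progeny segregate for cotyledon color."""
    if locus_a == "AA" or locus_b == "BB":
        return "all yellow"   # one locus fixed for the dominant allele
    if locus_a == "aa" and locus_b == "bb":
        return "all green"    # double recessive
    if locus_a == "Aa" and locus_b == "Bb":
        return "15:1"         # both loci still segregating
    return "3:1"              # one locus segregating, the other fixed recessive

# Four equally likely gametes from an AaBb plant.
gametes = list(product("Aa", "Bb"))

classes = Counter()
green = 0
for (a1, b1), (a2, b2) in product(gametes, repeat=2):
    locus_a = "".join(sorted(a1 + a2))  # 'AA', 'Aa' or 'aa'
    locus_b = "".join(sorted(b1 + b2))  # 'BB', 'Bb' or 'bb'
    classes[progeny_class(locus_a, locus_b)] += 1
    green += locus_a == "aa" and locus_b == "bb"

print(dict(classes))
print(f"yellow:green = {16 - green}:{green}")
```

Under these assumptions one genotype in sixteen is green, giving the 15:1 ratio, and the four progeny classes are expected in a 7:4:4:1 proportion.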

Construction of BSA pools for next-generation sequencing

[Figure: progeny classes (all yellow, 3:1, 15:1, all green); YC-bulk: 30 lines; GC-bulk: 30 lines; parents: ZH30 and JY102]

Four DNA samples were used to construct libraries and were subjected to whole-genome sequencing on the Illumina HiSeq 2500 platform

Summary of Illumina sequencing data

| Sample ID | YC-bulk | GC-bulk | ZH30 | JY102 |
|---|---|---|---|---|
| Clean reads | 486,749,106 | 467,745,622 | 108,602,086 | 84,534,592 |
| Clean bases | 61,327,351,325 | 58,897,998,134 | 13,683,105,387 | 10,650,695,091 |
| Q20 (%) | 91.6 | 91.4 | 91.5 | 92.6 |
| Q30 (%) | 85.1 | 85.0 | 85.1 | 85.7 |
| Mapped ratio (%) | 94.7 | 93.8 | 94.8 | 94.6 |
| Average depth | 59X | 53X | 12X | 9X |
| Coverage ratio 1X (%) | 95.2 | 93.5 | 93.2 | 89.7 |
| Coverage ratio 5X (%) | 90.9 | 85.0 | 76.8 | 69.2 |
| Coverage ratio 10X (%) | 87.9 | 80.0 | 57.4 | 44.1 |

A total of 1,084,921 SNPs and 157,839 small InDels were identified between the parental lines ZH30 and JY102

Calculation of SNP index and Δ (SNP-index)

SNP filtering: quality score >= 100, read depth >= 10

SNP index = Count of alternate base (JY102) / Count of reads aligned

Δ (SNP-index) = SNP index in GC-bulk - SNP index in YC-bulk

The P-value of Fisher's exact test between the GC- and YC-bulks was also calculated for each SNP locus.
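The definitions above translate into a few lines of code. The following is a sketch under stated assumptions rather than the pipeline used in the study: the function names are illustrative, the depth filter is assumed to apply to each bulk, the Fisher test is assumed to be two-sided on alternate versus reference read counts, and the read counts in the example are invented.

```python
from math import comb

MIN_QUALITY = 100  # thresholds stated on the slide
MIN_DEPTH = 10

def passes_filter(quality, depth):
    return quality >= MIN_QUALITY and depth >= MIN_DEPTH

def snp_index(alt_reads, total_reads):
    """Fraction of aligned reads carrying the alternate (JY102) base."""
    return alt_reads / total_reads

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )

# Hypothetical SNP site: (alternate reads, reference reads) in each bulk.
gc_alt, gc_ref = 48, 2
yc_alt, yc_ref = 30, 30
keep = passes_filter(250, min(gc_alt + gc_ref, yc_alt + yc_ref))
delta = snp_index(gc_alt, gc_alt + gc_ref) - snp_index(yc_alt, yc_alt + yc_ref)
p_value = fisher_exact_two_sided(gc_alt, gc_ref, yc_alt, yc_ref)
print(keep, round(delta, 2), p_value)
```

A site linked to the green-cotyledon allele is expected to show a GC-bulk SNP index near 1, a clearly positive Δ (SNP-index), and a small P-value.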

Sliding window analysis of SNP index and Δ (SNP-index)

Average SNP-index, Δ (SNP-index) and P-value were calculated over 2-Mb genomic intervals using a 10-kb sliding step
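A sliding-window average of this kind can be sketched as follows. The sketch assumes that the 2-Mb figure is the window width and the 10-kb figure is the step by which the window advances; the function name and the toy coordinates are illustrative only.

```python
def sliding_window_mean(positions, values, chrom_length,
                        window=2_000_000, step=10_000):
    """Average `values` over windows of `window` bp that advance by `step` bp.

    Returns a list of (window_start, window_end, mean) tuples; windows that
    contain no SNP are skipped.
    """
    results = []
    start = 0
    while start < chrom_length:
        end = start + window
        in_window = [v for pos, v in zip(positions, values) if start <= pos < end]
        if in_window:
            mean = sum(in_window) / len(in_window)
            results.append((start, min(end, chrom_length), mean))
        start += step
    return results

# Toy example with a 20-bp window and a 10-bp step so the output is easy to follow.
toy = sliding_window_mean(
    positions=[5, 15, 25, 35],
    values=[0.25, 0.5, 0.75, 1.0],
    chrom_length=40,
    window=20,
    step=10,
)
for row in toy:
    print(row)
```

The same function can be applied separately to the SNP-index of each bulk, to Δ (SNP-index), and to the per-SNP P-values. A production implementation would index the sorted positions instead of rescanning the whole list for every window.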

Two statistically significant candidate regions were identified

qCC1: 54.15-56.83 Mb on Chromosome 1

qCC2: 0-2.68 Mb on Chromosome 11

Criteria: average SNP-index of GC-bulk > 0.9, average P-value < 0.05
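One way to turn the two thresholds into candidate regions is to keep every window that passes both and merge the overlapping ones. The sketch below shows that idea only; the merging rule is an assumption, the function name is illustrative, and the window values are invented.

```python
def candidate_regions(windows, min_index=0.9, max_p=0.05):
    """Merge overlapping or touching windows that pass both thresholds.

    `windows` is an iterable of (start, end, mean_gc_index, mean_p) tuples
    sorted by start position.
    """
    regions = []
    for start, end, mean_index, mean_p in windows:
        if mean_index > min_index and mean_p < max_p:
            if regions and start <= regions[-1][1]:
                # Extend the previous region instead of opening a new one.
                regions[-1] = (regions[-1][0], max(regions[-1][1], end))
            else:
                regions.append((start, end))
    return regions

# Invented window summaries: (start, end, mean GC-bulk SNP-index, mean P-value).
example = [
    (0, 20, 0.95, 0.01),
    (10, 30, 0.92, 0.03),
    (20, 40, 0.60, 0.40),   # fails both thresholds
    (50, 70, 0.97, 0.001),
]
regions = candidate_regions(example)
print(regions)
```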

SNP analysis of candidate regions

qCC1 locus: 2.68 Mb interval region

2,843 SNPs between parental lines

2,284 SNPs had an index of 1.0 in the GC-bulk

251 SNPs result in changes of the coding sequences

qCC2 locus: 2.68 Mb interval region

1,237 SNPs between parental lines

870 SNPs had an index of 1.0 in the GC-bulk

102 SNPs result in changes of the coding sequences
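Whether a SNP changes the coding sequence depends on whether the altered codon still encodes the same amino acid. The sketch below illustrates that classification with the standard genetic code; it is not the annotation tool used in the study, the function name is illustrative, and stop-codon changes are simply counted as non-synonymous here.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def classify_coding_snp(codon, offset, alt_base):
    """Classify a single-base change at `offset` (0-2) within `codon`."""
    mutated = codon[:offset] + alt_base + codon[offset + 1:]
    same = CODON_TABLE[codon] == CODON_TABLE[mutated]
    return "synonymous" if same else "non-synonymous"

print(classify_coding_snp("GAA", 2, "G"))  # GAA -> GAG, both glutamate
print(classify_coding_snp("GAA", 1, "T"))  # GAA -> GTA, glutamate to valine
```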

Fine mapping of qCC1 to a 30.7-kb region

Four annotated genes

Candidate gene analysis in qCC1 region

39 SNPs: 21 in genes, including two synonymous and one non-synonymous variation.

15 small InDels: one in an exon of Glyma.01g214600 and one in an intron of Glyma.01g214700

Fine mapping of qCC2 to a 67.7-kb region

Nine annotated genes

15 SNPs: none alters the amino acid sequence of the encoded proteins.

One small InDel: the alteration occurs in the dominant parental line Zhonghuang30

Candidate gene analysis in qCC2 region

qCC1/2 are the same as the previously identified D1/D2 genes

Fang et al., Plant J, 2014; Nakano et al., PCP, 2014

3. Case study II: Mapping QTLs related to seed weight

100-seed weight is a key component of soybean yield

[Figure: components of seed yield per plant: pod number, seed number per pod, seed number, seed weight, seed size]

[Figure panels: wild soybean, landrace, modern cultivar]

Genetic inheritance of 100-SW in a RIL population

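[Figure labels: XL, ZH28]

| Year | Zhonghuang28 (parent) | Xiaoling (parent) | RIL mean | RIL S.D. | RIL range | RIL CV% |
|---|---|---|---|---|---|---|
| 2013 | 20.2 | 9.8 | 15.8 | 2.4 | 10.3-23.8 | 15.4 |
| 2014 | 20.8±1.6 | 11.4±0.7 | 15.9 | 2.3 | 8.9-24.1 | 14.6 |

[Figure labels: LS-bulk, HS-bulk]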

Summary of Illumina sequencing data

A total of 1,216,848 clean SNPs were identified between the parental lines ZH28 and XL

| Samples | HS-bulk | LS-bulk | Zhonghuang28 | Xiaoling |
|---|---|---|---|---|
| Clean reads | 314,489,982 | 284,619,824 | 108,190,706 | 81,497,870 |
| Clean bases | 39,622,603,336 | 35,856,859,929 | 13,628,408,688 | 10,267,871,894 |
| Q20 (%) | 91.76 | 91.84 | 95.04 | 91.76 |
| Q30 (%) | 85.45 | 85.36 | 89.00 | 85.42 |
| Mapped ratio (%) | 97.83 | 98.31 | 96.17 | 95.81 |
| Average depth | 35X | 31X | 12X | 9X |
| Coverage ratio 1X (%) | 98.64 | 86.99 | 97.82 | 94.43 |
| Coverage ratio 5X (%) | 93.48 | 77.53 | 90.7 | 76.46 |
| Coverage ratio 10X (%) | 87.96 | 70.56 | 66.48 | 43.05 |

QTL-seq identified major QTLs on chromosome 20

[Figure panels: HSB SNP-index; LSB SNP-index; LSB-HSB SNP-index]

The major QTL is located on the tail of Chr. 20

[Figure axis: Δ (SNP-index)]

Physical position: 34.22-36.75 Mb, 40.95-43.35 Mb

SNP marker development for genotyping of RILs

50 SNPs were selected from the candidate region, and the RIL population was genotyped using the Sequenom MassARRAY iPLEX platform

The QTL was mapped to a 187-kb region

19 genes were annotated in the 187-kb candidate region

11 genes have non-synonymous variations between parental lines

| Year | 2016 | 2017 |
|---|---|---|
| Left marker | SNP22 | SNP22 |
| Right marker | SNP25 | SNP25 |
| LOD | 7.4923 | 6.2975 |
| PVE (%) | 16.8764 | 15.946 |
| Add | 0.9881 | 0.7472 |

4. Conclusions

A genome-wide NGS-based BSA mapping approach was developed in soybean using segregating populations derived from germplasm

Two loci controlling cotyledon color were identified and fine mapped to 30.7-kb and 67.7-kb intervals

Two stay-green genes were located in the fine-mapped regions, and the sequence variant of one gene was directly identified by whole-genome sequencing

A major QTL related to seed weight was identified by the NGS-based mapping method and fine mapped to a 187-kb region containing 19 annotated genes

5. Acknowledgements

Prof. Lijuan Qiu

Jian Song

Fulai Zhou