
Mutations and Epimutations A story of two cultivars and their children. Matteo Pellegrini.

Date post: 15-Dec-2015
Mutations and Epimutations

A story of two cultivars and their children.

Matteo Pellegrini

Nipponbare and 93-11

• Nipponbare:
– Oryza sativa japonica
– Primarily Japan, China, Indonesia
– Agronomic differences:
  • Days to heading

• 93-11:
– Oryza sativa indica
– India, Bangladesh, Nepal, China
– Submerged growth
– Agronomic differences:
  • Seed fertility
  • Long grain
  • Taller (83 cm)

Why Study Crosses?

• Crosses of Indica and Japonica are often sterile

• Show hybrid vigor in agronomic traits

Overview

• Identify SNPs between ecotypes.
– SNP generation
• Identify epimutations between ecotypes.
– Identify methyl-inheritance
• Identify allele-specific expression
• Identify RNA editing

[Figure: crossing scheme with parents (P) NPB and 9311 and their F1 offspring]

• 2 rice ecotypes: Nipponbare (NPB) and 93-11
• Generated BS-seq data for NPB, 93-11, and 2 reciprocal crosses

Detecting Cytosine Methylation

A, C (unmethylated), C (methylated), G, T?

    m     mm
… ACCCGTACCCGATTAG …   (original sequence; m marks a methylated C)
… ATCTGTATCCGATTAG …   (after bisulfite treatment)

Apply sodium bisulfite and amplify: unmethylated C → T; methylated C (and A/G/T) unchanged.

Try to align the new sequence to a known reference; compare.
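The conversion rule above can be simulated in a few lines. This is a minimal sketch for illustration, not part of the original analysis: the function name `bisulfite_convert` and the 0-based `methylated` position set are assumptions, and only one strand is modeled.

```python
def bisulfite_convert(seq, methylated):
    """Simulate bisulfite treatment plus amplification on one strand.

    seq        -- DNA sequence (A/C/G/T), forward strand only
    methylated -- set of 0-based positions of methylated cytosines
                  (hypothetical input format, chosen for this sketch)
    """
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated:
            out.append("T")   # unmethylated C reads as T
        else:
            out.append(base)  # methylated C and A/G/T are unchanged
    return "".join(out)


# The slide's example sequence, with methylated C at positions 2, 8 and 9.
converted = bisulfite_convert("ACCCGTACCCGATTAG", {2, 8, 9})
print(converted)
```

With those three positions marked as methylated, the sketch reproduces the converted sequence shown on the slide.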

Mapping Approach: BS Seeker

Chen et al (2010) BMC Bioinformatics

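BS reads are C/T converted, so normal aligners are not applicable.

Three-letter alignment:

• Input
  BS read:     AATCGTA
  Ref. genome: CTAATCGCAGG
• Convert C to T
  BS read:     AATTGTA
  Ref. genome: TTAATTGTAGG
• Bowtie mapping
  Ref. genome: TTAATTGTAGG
  BS read:       AATTGTA
• Restore to 4 letters
  Ref. genome: CTAATCGCAGG
  BS read:       AATCGTA
• Compare alignments
  Methylation:      m u
  (m: read C at a reference C, methylated; u: read T at a reference C, unmethylated)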
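The three-letter idea can be sketched as follows. This is not BS Seeker's implementation: exact substring search (`str.find`) stands in for the Bowtie step, only the forward strand is handled, and the function name `three_letter_map` is an assumption made for this example.

```python
def three_letter_map(read, ref):
    """Map a bisulfite read to a reference using C->T converted copies.

    Returns (offset, calls), where calls is a list of (ref_position, state)
    with state "m" (methylated) or "u" (unmethylated), or None if unmapped.
    """
    # Step 1: collapse both sequences to a three-letter alphabet (A, G, T).
    read3 = read.replace("C", "T")
    ref3 = ref.replace("C", "T")

    # Step 2: locate the read. Exact matching is a stand-in for Bowtie here.
    offset = ref3.find(read3)
    if offset < 0:
        return None

    # Step 3: restore the four-letter sequences and compare at reference Cs.
    calls = []
    for i, read_base in enumerate(read):
        if ref[offset + i] != "C":
            continue
        if read_base == "C":
            calls.append((offset + i, "m"))  # C survived: methylated
        elif read_base == "T":
            calls.append((offset + i, "u"))  # C became T: unmethylated
    return offset, calls


result = three_letter_map("AATCGTA", "CTAATCGCAGG")
print(result)
```

On the slide's example the read lands at offset 2, and the two reference cytosines it covers are called methylated and unmethylated respectively, matching the "m u" annotation.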

Methylation levels at single-base resolution

Calculate the methylation level at each covered cytosine:

Methylation level = #C / (#C + #T)

Ref. genome: 5'--attgagacatcctagcgcgtggtgacaataata--3'
Reads:                    ttttagcgcgtggtg
                        cattttagtgcgtgg
                             tagtgcgtggtg

First C of "cgcg": 1/(1+2) = 33.3%
Second C of "cgcg": 3/(3+0) = 100%
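The formula #C/(#C+#T) can be checked against the slide's example. In this sketch the read offsets (9, 7 and 12) are inferred from how the reads line up with the reference, and the function name and input format are assumptions for illustration only.

```python
from collections import defaultdict


def methylation_levels(ref, aligned_reads):
    """Return {ref_position: #C / (#C + #T)} for each covered reference C.

    aligned_reads -- list of (offset, read) pairs, forward strand only
                     (hypothetical input format for this sketch)
    """
    counts = defaultdict(lambda: [0, 0])  # position -> [#C, #T]
    ref = ref.upper()
    for offset, read in aligned_reads:
        for i, base in enumerate(read.upper()):
            pos = offset + i
            if ref[pos] != "C":
                continue
            if base == "C":
                counts[pos][0] += 1
            elif base == "T":
                counts[pos][1] += 1
    return {pos: c / (c + t) for pos, (c, t) in sorted(counts.items())}


ref = "attgagacatcctagcgcgtggtgacaataata"
reads = [
    (9, "ttttagcgcgtggtg"),
    (7, "cattttagtgcgtgg"),
    (12, "tagtgcgtggtg"),
]
levels = methylation_levels(ref, reads)
for pos, level in levels.items():
    print(pos, f"{level:.1%}")
```

Positions 15 and 17 are the two cytosines in "cgcg"; they come out at 1/3 and 3/3, the two values shown on the slide.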

Workflow

• Alignments
– BS-Seeker mapping of NPB and 9311 samples to NPB reference genome.
– Maps 9311 genome to NPB coordinates
• Parent genomes
– Each read generates a small implied sequence fragment.
– Use this to generate a parent genome.
• F1 read matching
– Map reads to NPB reference genome to get location.
– Compare each read to NPB and 9311 parent genomes and determine better match.
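The F1 read-matching step might look like the sketch below. The slides do not say how "better match" is scored, so counting mismatches and treating a read T over a parental C as compatible with bisulfite conversion is an assumption, as are the function names and the toy sequences.

```python
def mismatches(read, parent_seq):
    """Count mismatches, ignoring read T over parent C (possible conversion)."""
    n = 0
    for r, p in zip(read, parent_seq):
        if r == p:
            continue
        if p == "C" and r == "T":
            continue  # consistent with an unmethylated, converted C
        n += 1
    return n


def assign_parent(read, pos, npb_genome, genome_9311):
    """Assign an F1 read mapped at `pos` (NPB coordinates) to a parent."""
    end = pos + len(read)
    to_npb = mismatches(read, npb_genome[pos:end])
    to_9311 = mismatches(read, genome_9311[pos:end])
    if to_npb < to_9311:
        return "NPB"
    if to_9311 < to_npb:
        return "9311"
    return "ambiguous"  # read does not cover an informative SNP


# Toy parent genomes that differ by one SNP at position 5 (C in NPB, G in 9311).
npb = "ACGTACGTAA"
p9311 = "ACGTAGGTAA"
print(assign_parent("TACGT", 3, npb, p9311))
print(assign_parent("TAGGT", 3, npb, p9311))
print(assign_parent("ACGT", 0, npb, p9311))
```

Reads that do not overlap a SNP cannot be assigned, which is why the per-allele coverage in a cross is lower than the total coverage.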

Detecting Allele-Specific Methylation

[Figure: BS-seq reads from a parent1/parent2 cross are separated at a SNP into parent1 and parent2, with the methylation level at CG sites shown for each]

Library statistics

Methyl-Seq      Reads   Mapped   % Mapped   Coverage
NPB             298M    134M     45%        17.58
93-11           157M    74M      47%        10.14
NPB x 93-11     594M    279M     47%        20.04
  -NPB                                      6.51
  -93-11                                    6.08
93-11 x NPB     543M    236M     43%        25.77
  -NPB                                      7.45
  -93-11                                    6.59

RNA-Seq         Reads   Mapped   % Mapped
NPB             42M     17M      42%
93-11           42M     13M      31%
NPB x 93-11     48M     12M      26%
  -NPB
  -93-11
93-11 x NPB     43M     11M      25%
  -NPB
  -93-11

Identifying SNPs

• Call a site a SNP if:
– > 3 reads/strand
– > 90% agreement within ecotype
– Strands agree with each other (compensate for Cs).
– Ecotypes (obviously) disagree with each other.

• Will miss indels, duplications, inversions, and other chromosomal rearrangements.

• Will miss long runs of SNPs (> 3 within ~55 bp) (BS-Seeker limit)
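The SNP criteria above could be combined roughly as follows. This is a sketch under stated assumptions, not the authors' code: base counts are given per strand in forward-strand coordinates, "compensate for Cs" is read as treating C/T as indistinguishable on forward-strand reads and G/A as indistinguishable on reverse-strand reads, and all function names are hypothetical.

```python
def strand_candidates(counts, ambiguous_pair, min_reads=3, min_agreement=0.9):
    """Return the set of bases one strand supports, or None if filters fail.

    counts         -- {base: read count} for this strand
    ambiguous_pair -- bases bisulfite conversion makes indistinguishable here
    """
    total = sum(counts.values())
    if total <= min_reads:  # need > 3 reads on the strand
        return None
    src, dst = ambiguous_pair
    pair_n = counts.get(src, 0) + counts.get(dst, 0)
    others = {b: n for b, n in counts.items() if b not in ambiguous_pair}
    top_base, top_n = max(others.items(), key=lambda kv: kv[1], default=(None, 0))
    if pair_n >= top_n:
        return {src, dst} if pair_n / total > min_agreement else None
    return {top_base} if top_n / total > min_agreement else None


def call_base(fwd_counts, rev_counts):
    """Call a base only when both strands pass and agree on exactly one base."""
    fwd = strand_candidates(fwd_counts, ("C", "T"))
    rev = strand_candidates(rev_counts, ("G", "A"))
    if fwd is None or rev is None:
        return None
    agreed = fwd & rev
    return agreed.pop() if len(agreed) == 1 else None


def is_snp(site_a, site_b):
    """site_* is a (fwd_counts, rev_counts) pair for one ecotype."""
    a, b = call_base(*site_a), call_base(*site_b)
    return a is not None and b is not None and a != b


# Unmethylated C in one ecotype (forward reads show T) versus a true T in the other.
npb_site = ({"T": 5}, {"C": 6})
indica_site = ({"T": 4}, {"T": 5})
print(call_base(*npb_site), call_base(*indica_site), is_snp(npb_site, indica_site))
```

The example shows why both strands are needed: forward reads alone show T in both ecotypes, and only the reverse strand separates a converted C from a true T.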

SNPs - NPB vs 93-11

• 1,209,456 mutations / 306,106,830 sites with mutual base calls

• ~ 1/253 bases

• Mostly (73%) C->T (or G->A if C->T on opposite strand), or T->C and A->G if the change is in the other ecotype (93-11)

     A            C            G            T
A    86,677,300   42,553       216,135      42,513
C    43,336       65,771,387   34,146       226,045
G    226,045      34,146       65,771,387   43,336
T    42,513       216,135      42,553       86,677,300
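The summary figures on this slide follow directly from the 4x4 base-call matrix. The short check below uses only the numbers from the slide; the variable names are arbitrary.

```python
BASES = "ACGT"

# Base-call matrix from the slide (rows and columns in A, C, G, T order).
COUNTS = [
    [86_677_300, 42_553, 216_135, 42_513],
    [43_336, 65_771_387, 34_146, 226_045],
    [226_045, 34_146, 65_771_387, 43_336],
    [42_513, 216_135, 42_553, 86_677_300],
]

# Transitions are purine<->purine or pyrimidine<->pyrimidine changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

sites = sum(sum(row) for row in COUNTS)
mutations = 0
transitions = 0
for i, a in enumerate(BASES):
    for j, b in enumerate(BASES):
        if i == j:
            continue  # diagonal entries are sites where the calls agree
        mutations += COUNTS[i][j]
        if (a, b) in TRANSITIONS:
            transitions += COUNTS[i][j]

print(f"sites with mutual base calls: {sites:,}")
print(f"mutations: {mutations:,} (about 1 in {sites / mutations:.0f} bases)")
print(f"transitions: {transitions / mutations:.0%} of mutations")
```

The off-diagonal cells sum to the stated mutation count, and the C<->T and A<->G cells account for the stated 73%.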

SNPs - NPB vs F1 (9N-NPB)

• 12 mutations

• Are these real or false?

• Similar numbers amongst all F1 comparisons

     A           C           G           T
A    3,188,414   -           3           -
C    -           2,695,005   -           3
G    2           -           2,548,205   -
T    -           4           -           3,253,196

Identifying epimutations

• Use the binomial distribution to build min, max, and mean percent methylation at each C.

• Confidence intervals at 5% are min, max

• As the number of reads increases, the interval size decreases.

[Figure: min/max of the interval vs. number of reads]

Identifying epimutations (cont)

• Called different if:
– mean(sample1) < min(sample2) & mean(sample2) > max(sample1)
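The interval-based test can be sketched with the standard library alone. The slides do not state which binomial interval was used, so the exact (Clopper-Pearson style) interval below is an assumption, as is applying the stated rule in both directions; all function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(g, target, iters=60):
    """Bisection on [0, 1] for a function g that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def methylation_interval(n_c, n_t, alpha=0.05):
    """Return (min, mean, max) percent methylation as fractions in [0, 1]."""
    n = n_c + n_t
    mean = n_c / n
    low = 0.0 if n_c == 0 else _solve(
        lambda p: 1 - binom_cdf(n_c - 1, n, p), alpha / 2)
    high = 1.0 if n_c == n else _solve(
        lambda p: 1 - binom_cdf(n_c, n, p), 1 - alpha / 2)
    return low, mean, high


def _separated(lower, upper):
    """The slide's rule: mean(lower) < min(upper) and mean(upper) > max(lower)."""
    return lower[1] < upper[0] and upper[1] > lower[2]


def called_different(counts1, counts2):
    """counts* are (#C, #T) pairs; the rule is tried in both directions."""
    s1 = methylation_interval(*counts1)
    s2 = methylation_interval(*counts2)
    return _separated(s1, s2) or _separated(s2, s1)


print(methylation_interval(5, 5))
print(methylation_interval(50, 50))
print(called_different((0, 10), (10, 0)), called_different((5, 5), (6, 4)))
```

Comparing the intervals for 10 and 100 reads at the same 50% level illustrates the point on the slide: more reads give a narrower interval, so smaller differences can be called.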

Epimutation rate

1 in 300 CG sites spontaneously mutate across one generation

Epimutation clusters

[Figure tracks: 9311 parent, NPB parent, NPB cross, NPB cross, 9311 cross, 9311 cross]

Epimutation clusters II

[Figure tracks: 9311 parent, NPB parent, NPB cross, NPB cross, 9311 cross, 9311 cross]

Epimutations are enriched in regions where parents differ

Half of the epimutations between parents and crosses occur at sites where parents differ

Epimutations (continued)

• Epimutations within genes
– 498 genes were significantly enriched for epimutations
– GO terms across ecotypes indicate ATP-synthesis-related activity (ATP synthesis coupled proton transport, hydrogen transport, ion transmembrane transport, etc.).

Expression

• Many genes (~7800/25640) are differentially expressed between ecotypes.

• GO terms: chloroplast-related terms, response to cadmium ion.

Expression cont.

• Across generations, only 78 genes differentially expressed

• Of these only 2 were differentially expressed in the parents

Allele Specific Expression

• 681 examples of allele specific expression

• Partially explain hybrid vigor?

[Figure tracks: NPB parent, NPB cross, 9311 parent, 9311 cross, NPB cross, 9311 cross]

Allele-Specific Genes Accumulate Mutations

[Figure: SNP density, all genes vs. allele-specific genes]

Allele-specific Expression cont.

And are also enriched for differentially methylated sites

RNA Editing

• Cytidine deamination: C to U

• Adenosine deamination: A to I (read as G)

How Widespread?

• Recent studies indicate that RNA editing may be more widespread than originally thought

• Others have disputed this claim (Schrider et al., PLoS ONE)

• In plants RNA editing is thought to take place in the mitochondria and plastids

• Is there editing in nuclear genes?

Science. 2011 Jul 1;333(6038):53-8.

RNA Editing in Rice

Rows: NPB DNA; columns: NPB RNA

     A         C         G         T
A    5535334   6907      3063      2219
C    4758      4436282   4279      7054
G    3777      2437      4382636   4213
T    2210      3227      6949      5577323
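The DNA-versus-RNA table can be broken down by mismatch type to see how many candidates fall into the two canonical editing classes named earlier (A to I, read as G, and C to U, read as T). The sketch assumes rows are the DNA base and columns the RNA base, which is inferred from the slide layout.

```python
BASES = "ACGT"

# Counts from the slide; assumed orientation: rows = DNA base, columns = RNA base.
DNA_TO_RNA = [
    [5535334, 6907, 3063, 2219],
    [4758, 4436282, 4279, 7054],
    [3777, 2437, 4382636, 4213],
    [2210, 3227, 6949, 5577323],
]

# Off-diagonal cells are positions where RNA and DNA base calls differ.
mismatch_counts = {
    (dna, rna): DNA_TO_RNA[i][j]
    for i, dna in enumerate(BASES)
    for j, rna in enumerate(BASES)
    if i != j
}

total = sum(mismatch_counts.values())
canonical = mismatch_counts[("A", "G")] + mismatch_counts[("C", "T")]

for (dna, rna), n in sorted(mismatch_counts.items(), key=lambda kv: -kv[1]):
    print(f"{dna}->{rna}: {n}")
print(f"canonical (A->G, C->T): {canonical} of {total} mismatches")
```

Under that orientation, the two canonical classes make up 10,117 of the 51,093 mismatching positions in the table.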

Initially we found lots of examples….

On Closer Inspection…

Alignments are often off by one or more bases at splice sites

But a Few Real Ones Remain?

But more Filtering Should be done…

Position of edit site along read

Current Numbers

Conclusions

• Epimutation rates are one in 300 cytosines across one generation
– Clusters of epimutations are present
– Are enriched in sites where parental epigenomes differ

• Allele-specific expression is widespread and associated with
– Increased SNP densities
– Higher differential methylation

• Find some evidence for RNA editing but…

Acknowledgements

– Krishna Chodavarapu (Pellegrini Lab)
– Suhua Feng (Steve Jacobsen Lab)
– Blake Myers, Guo-liang Wang, Yulin Jia

