

Computational Approaches in Epigenomics

Guo-Cheng Yuan

Department of Biostatistics and Computational Biology

Dana-Farber Cancer Institute

Harvard School of Public Health

BIO506, Jan 11th, 2010

Definition

• Epigenetics refers to changes in phenotype (appearance) or gene expression caused by mechanisms other than changes in the underlying DNA sequence.

Wikipedia

Epigenetic mechanisms

• Nucleosome positions

• Histone modification

• DNA methylation

Chromatin

• DNA is packaged into chromatin.

• The nucleosome is the fundamental unit of chromatin. It wraps about 146 bp of DNA.

• The chromatin structure is hierarchical.

Felsenfeld and Groudine 2003

Nucleosome and histone modification

The first layer of chromatin structure looks like “beads on a string”.

A nucleosome is made of core histone proteins.

The amino acids in the N-terminal tails of histones can be covalently modified.

Felsenfeld and Groudine 2003

DNA methylation

Alberts et al. Molecular Biology of the Cell

DNA methylation normally occurs only at CpG dinucleotides and can be inherited through cell division.
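
Because methylation is tied to CpG dinucleotides, a first computational step is often simply locating them in a sequence. The following is a minimal sketch, not part of the original slides: the function names and the example sequence are hypothetical, and the observed/expected ratio is a commonly used CpG-density statistic that is assumed here rather than taken from the talk.

```python
def cpg_sites(seq):
    """Return 0-based positions of the C in every CpG dinucleotide."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    Assumption: this normalisation is a common convention, not a
    statistic named in the slides.
    """
    s = seq.upper()
    n_c, n_g = s.count("C"), s.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return len(cpg_sites(s)) * len(s) / (n_c * n_g)


example = "ttCGatCGCGaacg"  # made-up sequence for illustration only
print(cpg_sites(example))
print(cpg_obs_exp(example))
```

The positions returned are the candidate methylation sites; whether any of them is actually methylated has to come from experimental data.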

Why do we care?

• Epigenetics is an extra layer of transcriptional control.

• Epigenetics plays an important role in development.

• Epigenetic mechanisms can cause cancer and other diseases.

• Epigenetic patterns are reversible and can be influenced by the environment.

Our goals

• Epigenomic data, microarray data, DNA sequence

• Computational model

• Characterize cell-type specific epigenetic states

• Elucidate epigenetic targeting mechanism

• Understand epigenetic regulation in cell differentiation

• Epigenetic signature of diseases

TF binding

Chromatin domains

Intrachromosomal interactions

large-scale histone modification patterns

chromatin loops

A hidden Markov model for prediction of multi-gene chromatin domains

Jessica Larson
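
To make the idea of a hidden Markov model over genes concrete, here is a minimal Viterbi sketch. It is an illustration under stated assumptions, not the model from the talk: the two states ("background" and "domain"), the discretized "low"/"high" histone-modification calls per gene, and all probabilities are hypothetical values chosen only so that neighbouring genes tend to share a state.

```python
import math

STATES = ("background", "domain")

# Illustrative parameters only (assumed, not fitted to any data).
START = {"background": 0.5, "domain": 0.5}
TRANS = {
    "background": {"background": 0.9, "domain": 0.1},
    "domain": {"background": 0.1, "domain": 0.9},
}
EMIT = {
    "background": {"low": 0.8, "high": 0.2},
    "domain": {"low": 0.2, "high": 0.8},
}


def viterbi(observations):
    """Return the most probable state path for a list of 'low'/'high' calls."""
    # Work in log space to avoid underflow on long gene lists.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    back = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][obs])
            )
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Genes in chromosomal order, with a run of four 'high' genes in the middle.
genes = ["low"] * 3 + ["high"] * 4 + ["low"] * 3
print(viterbi(genes))
```

With these assumed parameters, a single isolated "high" gene is absorbed into the background, while a longer run of "high" genes is called as one multi-gene domain. That smoothing across neighbours is the reason to use an HMM rather than thresholding each gene independently.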

Prediction results

Targeting mechanism for epigenetic factors

Nucleosome positions

Histone modification pattern

Wavelet Energy

Figure: the dinucleotide frequency signal is decomposed on a wavelet basis, giving wavelet energies E1, E2, E3.

\[
\log \frac{P(\text{nucleosome})}{P(\text{linker})} = \beta_1 E_1 + \cdots + \beta_k E_k
\]

An N-score model to predict nucleosome positions

Yuan and Liu
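
The pipeline on this slide (dinucleotide frequency signal, wavelet decomposition, wavelet energies, log-odds of nucleosome versus linker) can be sketched as follows. This is a rough illustration, not the published N-score implementation: the choice of AA/TT/TA as the dinucleotides, the Haar wavelet, the function names, and any coefficient values passed in are assumptions made for the example.

```python
import math


def dinucleotide_signal(seq, dinucs=("AA", "TT", "TA")):
    """1.0 where one of the chosen dinucleotides starts, else 0.0.

    Assumption: the dinucleotide set is illustrative only.
    """
    s = seq.upper()
    return [1.0 if s[i:i + 2] in dinucs else 0.0 for i in range(len(s) - 1)]


def haar_energies(signal, levels):
    """Energy (sum of squared Haar detail coefficients) at each level.

    An unpaired trailing sample at any level is dropped.
    """
    approx = list(signal)
    energies = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx) - 1, 2)]
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    return energies


def nucleosome_probability(energies, betas):
    """Logistic model: log(P(nucleosome) / P(linker)) = sum_j beta_j * E_j."""
    log_odds = sum(b * e for b, e in zip(betas, energies))
    return 1.0 / (1.0 + math.exp(-log_odds))


window = "AATTGCAATTGCAATTGCAATTGCAATTGCAATT"  # made-up sequence window
energies = haar_energies(dinucleotide_signal(window), levels=3)
print(energies)
print(nucleosome_probability(energies, betas=[0.5, -0.2, 0.1]))  # hypothetical betas
```

In practice the coefficients would be estimated from sequences with known nucleosome or linker status, and the score would be computed in a sliding window along the genome.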

N-score prediction in two yeast species

Lanterman et al.

Polycomb targets developmental genes in ES cells

Boyer et al. 2006

Polycomb

Oct4, Nanog, Sox2

expressed

repressed

Kim et al. 2008

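Figure: regression trees that split on motif scores (Motif A, Motif B, Motif C). Each node compares a motif score (S_A, S_B, S_C) with a cutoff (c_A, c_B, c_C) and follows the NO or YES branch.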

A computational model: BART

BART (Bayesian additive regression trees) is a Bayesian average of regression trees.

Chipman et al. 2007
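
To show what prediction from an ensemble of regression trees on motif scores looks like, here is a small sketch. It covers only the prediction step and is not BART itself: real BART samples its trees by MCMC under regularizing priors, whereas the trees, cutoffs, leaf values, and the dictionary layout below are all hand-written, hypothetical stand-ins.

```python
def tree_predict(tree, x):
    """Walk one tree: internal nodes are dicts, leaves are floats."""
    node = tree
    while isinstance(node, dict):
        branch = "yes" if x[node["motif"]] >= node["cutoff"] else "no"
        node = node[branch]
    return node


def ensemble_predict(draws, x):
    """Average, over draws, of the summed leaf values of each draw's trees."""
    totals = [sum(tree_predict(t, x) for t in trees) for trees in draws]
    return sum(totals) / len(totals)


# Hypothetical trees over motif scores A, B, C.
tree_a = {
    "motif": "A", "cutoff": 0.5,
    "no": -1.0,
    "yes": {"motif": "B", "cutoff": 0.3, "no": 0.0, "yes": 1.0},
}
tree_c = {"motif": "C", "cutoff": 0.7, "no": -0.5, "yes": 0.5}

# Each inner list stands in for one posterior draw of the sum-of-trees model.
draws = [[tree_a, tree_c], [tree_c]]
gene = {"A": 0.9, "B": 0.1, "C": 0.8}  # made-up motif scores for one gene
print(ensemble_predict(draws, gene))
```

Averaging over many draws is what makes the prediction smoother and less sensitive to any single tree than one fitted tree would be.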

Overall prediction accuracy

AUC = 0.82

Figure: ROC curves on testing data for four models: all factors, 5 factors, CpG, random.
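
The AUC summarizing an ROC curve can be computed directly from predicted scores and true labels. The sketch below uses the rank interpretation (the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as one half); the function name and the toy labels and scores are hypothetical and are not the data behind the figure.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: 1 = targeted gene, 0 = non-targeted gene.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))
```

An uninformative (random) predictor gives an AUC near 0.5 and a perfect one gives 1.0, which is the scale on which a reported AUC should be read.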

Figure: propensity score versus the number of cell types in which the gene is targeted.

Spring Liu; Zhen Shao

TF network

+Polycomb

Hox

Dnmt1

Hox

+

cell-type A cell-type B

An integrated network

Jess Mar

Future directions

• How do genetic and epigenetic factors work together to regulate cell-type specific gene expression?

• How does the integrated regulatory network change across cell-types?

• Are there epigenetic signatures associated with common diseases and, if so, what role do they play?

Acknowledgment

• Jessica Larson

• Yingchun (Spring) Liu

• Zhen Shao

• John Quackenbush Lab

– Jess Mar

• Stuart Orkin Lab

– Xiaohua Shen

– Jongwan Kim

• Steve Altschuler

• Ollie Rando

• Jun Liu

• Claudia Adams Barr Program