
Periodic clusters

Date post: 22-Feb-2016
Page 1: Periodic clusters

Periodic clusters

Page 2: Periodic clusters

Non periodic clusters

Page 3: Periodic clusters

That was only the beginning…

Page 4: Periodic clusters

The human cell cycle

G1-Phase S-Phase

G2-Phase M-Phase

Page 5: Periodic clusters

The proliferation cluster genes are cell cycle periodic

[Figure: gene expression (y axis, -4 to 4) across samples (x axis, 5 to 45); labels: G2/M, G1/S, CHR]

Distribution of cell cycle periodicity

[Figure: proportion (y axis, 0 to 0.8) by CCP score (x axis, 1 to 10); series: all genes, proliferation genes]

Page 6: Periodic clusters

[Figure: motif positions along the promoter (axis: 200, 150, 100, 50, TSS); motifs: NFY, E2F, ELK1, CDE, CHR]

The cell cycle motifs are enriched among the periodic genes

Not in the cluster, mutated in cancer

Tabach et al. Mol Sys Biol 2005

Page 7: Periodic clusters

Potential regulatory motifs in 3’ UTRs

Finding 3’ UTR elements associated with high/low transcript stability (in yeast)

[Figure panels: AAGCTTCC, CCTACAAC, Entire genome]

Page 8: Periodic clusters

[Figure: expression (y axis, -2 to 4) versus time/tissues (x axis, 0 to 15); flow: clustering, then motif finding]

Diagnosing motifs using expression

Reverse the inference flow

Page 9: Periodic clusters

Once we reverse the inference order we can

• Enumerate and score all possible k-mer motifs
• Examine the effect of “mutations” on motifs
• Examine the effect of motif location within the promoter
• Examine the effect of motif combinations and of distances within a combination
• More?

Page 10: Periodic clusters

• …But the correlation between gene cluster and motifs is imprecise in both directions:
• there are genes in the cluster without the motif
• and many genes with the motif do not respond.
• If gene control is multifactorial, groups of genes defined by a common motif will not be mutually disjoint
• partitioning the data into disjoint clusters will cause loss of information.

Page 11: Periodic clusters

A k-mer enumeration method: score every possible k-mer for an association with expression level

A_g is the expression level of gene g
C is a basal expression level (same for all g)
The integer N_μg equals the number of occurrences of motif μ in gene g
M is a set of motifs
F_μ is the increase/decrease in expression level caused by the presence of motif μ (same for all g)
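The definitions on this slide suggest an additive model in which each motif occurrence shifts expression by a fixed amount, roughly A_g ≈ C + Σ_{μ in M} F_μ N_μg. The equation itself did not survive in this transcript, so that form is an assumption. The sketch below fits C and F_μ by ordinary least squares on made-up counts; `fit_motif_effects`, `counts` and `expression` are illustrative names, not part of the original method.

```python
import numpy as np

def fit_motif_effects(counts, expression):
    """Least-squares fit of A_g ~ C + sum_mu F_mu * N_mu_g (assumed form).

    counts: (genes x motifs) array of occurrence counts N_mu_g.
    expression: length-genes array of expression levels A_g.
    Returns (C, F), with one effect F_mu per motif.
    """
    counts = np.asarray(counts, dtype=float)
    expression = np.asarray(expression, dtype=float)
    # Prepend a column of ones so the first coefficient is the basal level C.
    design = np.column_stack([np.ones(len(expression)), counts])
    coef, *_ = np.linalg.lstsq(design, expression, rcond=None)
    return coef[0], coef[1:]

# Toy data (hypothetical): 5 genes, 2 motifs, generated with C=1, F=(0.5, -2).
counts = np.array([[0, 0], [1, 0], [0, 1], [2, 1], [1, 2]])
expression = 1.0 + counts @ np.array([0.5, -2.0])
C, F = fit_motif_effects(counts, expression)
```

Scoring every possible k-mer would amount to repeating such a fit (or a single-motif version of it) for each candidate motif and ranking motifs by how much expression variance they explain.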

Page 12: Periodic clusters

[Figure: two panels of expression level versus time (x axis, 2 to 14), one with EC score = 0.05 and one with EC score = 0.5; ScanACE (Hughes et al.)]

Motif characterization through Expression Coherence (EC)

Page 13: Periodic clusters

[Figure: four panels (1 to 4) of scattered points with EC1 = 0, EC2 = 0.66, EC3 = 0.2, EC4 = 0.2; threshold distance, D]

Expression coherence score, intuition
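One common reading of the EC score is the fraction of gene pairs in a set whose expression profiles lie closer to each other than the threshold distance D. The sketch below follows that reading. Using Euclidean distance on unnormalized profiles is an assumption made here for brevity, and the toy profiles are made up.

```python
import numpy as np
from itertools import combinations

def expression_coherence(profiles, threshold):
    """Fraction of gene pairs whose profiles lie closer than `threshold`."""
    profiles = np.asarray(profiles, dtype=float)
    pairs = list(combinations(range(len(profiles)), 2))
    if not pairs:
        return 0.0
    # Assumption: Euclidean distance between expression profiles.
    close = sum(
        np.linalg.norm(profiles[i] - profiles[j]) < threshold
        for i, j in pairs
    )
    return close / len(pairs)

# Toy profiles (hypothetical): three nearly identical genes and one outlier.
profiles = [[0.0, 1.0, 0.0], [0.1, 1.0, 0.0], [0.0, 0.9, 0.1], [3.0, -2.0, 5.0]]
ec = expression_coherence(profiles, threshold=0.5)
```

Here three of the six pairs fall within the threshold, so the score is 0.5: a tight cluster plus an outlier gives an intermediate value, while a fully scattered set gives 0.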

Page 14: Periodic clusters

Interaction of motifs

[Figure: three panels of expression level versus time (x axis, 5 to 15): only M1 (EC = 0.05), only M2 (EC = 0.05), M1 AND M2 (EC = 0.23); diagram labels: G2, G2, M1, M2]

Page 15: Periodic clusters

Synergistic motifs

A combination of two motifs is called ‘synergistic’ if the expression coherence score of the genes that have the two motifs is significantly higher than the scores of the genes that have either of the motifs

[Figure labels: SFF, Mcm1]
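As a rough illustration of this definition, the sketch below compares the EC score of genes carrying both motifs with the scores of genes carrying only one of them. It computes the gap only; the significance test that the definition calls for is not shown, and reading "either of the motifs" as "exactly one motif" is an assumption. All names and data are hypothetical.

```python
import numpy as np

def ec_score(profiles, threshold):
    """Fraction of profile pairs with Euclidean distance below threshold."""
    x = np.asarray(profiles, dtype=float)
    n = len(x)
    if n < 2:
        return 0.0
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    upper = dist[np.triu_indices(n, k=1)]  # each unordered pair once
    return float(np.mean(upper < threshold))

def synergy_gap(profiles, has_m1, has_m2, threshold):
    """EC of genes with both motifs minus the larger single-motif EC."""
    x = np.asarray(profiles, dtype=float)
    m1 = np.asarray(has_m1, dtype=bool)
    m2 = np.asarray(has_m2, dtype=bool)
    ec_both = ec_score(x[m1 & m2], threshold)
    ec_only_m1 = ec_score(x[m1 & ~m2], threshold)
    ec_only_m2 = ec_score(x[m2 & ~m1], threshold)
    return ec_both - max(ec_only_m1, ec_only_m2)

# Toy data (hypothetical): genes 0-2 carry both motifs and behave alike.
profiles = [[1, 0], [1, 0.1], [0.9, 0], [5, 5], [-5, 2], [0, 8], [7, -3]]
has_m1 = [1, 1, 1, 1, 1, 0, 0]
has_m2 = [1, 1, 1, 0, 0, 1, 1]
gap = synergy_gap(profiles, has_m1, has_m2, threshold=0.5)
```

A large positive gap is only suggestive; in practice it would need to be compared against a null distribution, for example from randomly drawn gene sets of the same sizes.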

Page 16: Periodic clusters

A global map of combinatorial expression control

[Figure: network of motif combinations. Motif nodes: mRPE72, SWI5, SFF', MCM1, SFF, MCM1', ECB, SCB, MCB, PAC, mRRPE, mRRSE3, GCN4, BAS1, LYS14, RAP1, mRPE34, mRPE57, mRPE6, mRPE58, STRE, RPN4, ABF1, PDR, CCA, PHO4, AFT1, STE12, MIG1, CSRE, HAP234, ALPHA1', ALPHA1, ALPHA2, mRPE8, mRPE69. Conditions: heat shock, cell cycle, sporulation, diauxic shift, MAPK signaling, DNA damage]

• High connectivity
• Hubs
• Alternative partners in various conditions

Pilpel et al. Nature Genetics 2001

Page 17: Periodic clusters

Deduced network properties

[Figure: correlation (axis, -1 to 1) and expression coherence (axis, 0.2 to 0.8) panels; labels: G1, G2; transcription factors: Mbp1, Ndt80, Ume6, Fkh1, Swi4; motifs: MCB, MSE, URS1, SCB, MCM1', SFF']

Sufficiency, necessity, TF-TF interaction, hierarchy

Ho et al. Nature. 2002

Page 19: Periodic clusters

Distance and orientation of motifs affect expression profiles

[Figure: expression coherence (axis, 0 to 2) and 1 - correlation (axis, -0.5 to 0) versus distance in bp (-200 to 200); panels: mRRPE is closer, PAC is closer; motif positions drawn relative to the ATG; numbers shown: 36, 19, 8, 14, 20, 2, 3, 7, 1, 2, 0]

Page 20: Periodic clusters

Some typical expression patterns

Page 21: Periodic clusters

A Bayesian approach (conditional probability)

X_i could be “1” to denote:

• The presence of motif m

• Its distance from the TSS is < N

• It is on the coding strand

• It neighbors another motif m’

Or “0” otherwise

e_i = being expressed in pattern i
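To make the conditional-probability idea concrete, the sketch below estimates P(pattern | feature vector) by simple counting over genes. This is a minimal illustration with made-up data, not the network-learning procedure from the talk; the function name and the two toy features are assumptions.

```python
from collections import Counter

def pattern_probabilities(features, patterns):
    """Estimate P(e = pattern | X = feature vector) by counting genes."""
    joint = Counter(zip(map(tuple, features), patterns))
    marginal = Counter(map(tuple, features))
    return {
        (x, e): n / marginal[x]
        for (x, e), n in joint.items()
    }

# Hypothetical toy data: X = (motif present, within N of the TSS).
features = [(1, 1), (1, 1), (1, 1), (1, 0), (1, 0), (0, 0)]
patterns = [4, 4, 2, 2, 2, 1]
probs = pattern_probabilities(features, patterns)
```

In this toy table, two of the three genes with both features set fall in pattern 4, so `probs[((1, 1), 4)]` is 2/3.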

Page 22: Periodic clusters

Example: two rRNA processing motifs

The two motifs work together

The two motifs’ orientation matters

Page 23: Periodic clusters

The procedure

• Given that P(N|D) = P(N) * P(D|N) / P(D):
• Search the space of possible networks N for one that maximizes the above probability
• Impossible to enumerate all possible networks
• Use cross-validation: partition the data into 5 gene sets, learn the rules based on all but one set and test on the left-out set, each time.
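The cross-validation step can be sketched as follows. Only the partitioning into 5 gene sets is shown; the learning and testing steps are left as comments because the transcript does not describe them. The function name and the seed are illustrative.

```python
import random

def five_fold_splits(n_genes, k=5, seed=0):
    """Yield (train, test) index lists; each gene is tested exactly once."""
    order = list(range(n_genes))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [g for j, fold in enumerate(folds) if j != i for g in fold]
        yield train, test

splits = list(five_fold_splits(23))
for train, test in splits:
    # Hypothetical usage: learn the rules on `train`, score them on `test`.
    pass
```

Because every gene lands in exactly one test set, the five test scores together cover the whole data set without ever testing on genes used for learning.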

Page 24: Periodic clusters

For example: what does it take to belong to expression pattern (4)?

• Need to have RRPE and PAC

• If PAC is not within 140 bp of the ATG, but RRPE is within 240 bp, then the probability of pattern 4 is 22%

• If PAC is within 140 bp and RRPE is within 240 bp, then the chance is 100%
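The rules on this slide can be written down as a small lookup. The sketch encodes only the two cases that come with a number and returns `None` for everything else. Treating "within" as inclusive is an assumption, and the function name is illustrative.

```python
def pattern4_probability(pac_dist, rrpe_dist):
    """Probability of pattern 4 under the two rules stated on the slide.

    Distances are in bp from the ATG; None means the motif is absent.
    Returns None for cases the slide gives no number for.
    """
    if pac_dist is None or rrpe_dist is None:
        return None  # both motifs are needed; no probability is stated
    pac_near = pac_dist <= 140    # assumption: "within" is inclusive
    rrpe_near = rrpe_dist <= 240
    if pac_near and rrpe_near:
        return 1.0
    if not pac_near and rrpe_near:
        return 0.22
    return None  # RRPE farther than 240 bp: not stated on the slide

p_both_near = pattern4_probability(100, 200)
p_pac_far = pattern4_probability(300, 200)
```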

Page 25: Periodic clusters

Inferring various logical conditions (“gates”) on motif combinations

Page 26: Periodic clusters

The Bayesian network predicts expression profiles very accurately

Page 27: Periodic clusters

Can make useful predictions in worm

Page 28: Periodic clusters
Page 29: Periodic clusters
Page 30: Periodic clusters

The modern synthetic approach

Page 31: Periodic clusters

Motif discovery from evolutionary conservation data

Page 32: Periodic clusters

Species: S. cerevisiae; S. mikatae, S. kudriavzevii, S. bayanus; S. castellii, S. kluyveri

Intergenic sequences of S. mikatae, S. kudriavzevii and S. bayanus average 59 to 67% identity to their S. cerevisiae orthologs in global alignments. S. castellii and S. kluyveri: ~40% identity to S. cerevisiae.
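For readers unfamiliar with identity figures like these, the sketch below computes percent identity from two already aligned sequences. Skipping gapped columns is an assumption; definitions of identity differ in how they treat gaps, and the transcript does not say which one was used. The sequences are made up.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned (non-gap) columns where the two bases match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: columns with a gap are skipped
        compared += 1
        matches += a == b
    return 100.0 * matches / compared if compared else 0.0

# Hypothetical aligned fragments: 9 comparable columns, 8 of them identical.
identity = percent_identity("ACGT-ACGTA", "ACGTTACCTA")
```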

Page 33: Periodic clusters

Nucleotide conservation in promoters is highest close to the TSS

TATA-containing genes

All genes

Page 34: Periodic clusters

?????

Page 35: Periodic clusters

A set of discovered motifs

Page 36: Periodic clusters

NATURE | VOL 434 | 17 MARCH 2005

Page 37: Periodic clusters

The data
• Examined intergenic regions of human, mouse, rat and dog
• ~18,000 genes
• “Promoters”: 4 kb centered on the TSS
• 3’ UTRs based on RNA annotations
• 64 Mb and 15 Mb in total, respectively
• Negative control: introns, ~120 Mb
• % of alignable sequence: promoters: 51% (44% upstream and 58% downstream of the TSS), 3’ UTR: 73%, introns: 34%, entire genome: 28%

Page 38: Periodic clusters

The phylogenetic trees

Questions:
• How would addition of species affect analyses?
• What if the sequences were not only mammalian?

Page 39: Periodic clusters

An example: a known binding site of Err-α in the GABPA promoter

Questions:
• What is the “meaning” of the other conserved positions?

Page 40: Periodic clusters

Discovery of new motifs: exhaustive enumeration of all 6-mers
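Exhaustive enumeration is feasible because there are only 4^6 = 4096 possible 6-mers. The sketch below lists them all and counts their occurrences in a set of sequences. The conservation-based scoring that would rank them is not described in this transcript and is not attempted here; the function name and sequences are hypothetical.

```python
from itertools import product

def count_kmers(sequences, k=6):
    """Count occurrences of every possible DNA k-mer across sequences."""
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            if window in counts:  # skips windows containing N or other symbols
                counts[window] += 1
    return counts

# Hypothetical toy promoters.
counts = count_kmers(["TTTCGCGCAAA", "GGTTTCGCGCNN"])
```

Overlapping occurrences are counted separately, and k-mers that never occur keep a count of 0, so the result always has one entry per possible motif.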

Page 41: Periodic clusters

Discovery of new motifs: exhaustive enumeration of all 6-mers

Page 42: Periodic clusters

Targets of new motifs showed defined expression patterns

Page 43: Periodic clusters

Motifs often show clear positional bias – close to TSS

Page 44: Periodic clusters

The same methods applied to 3’ UTRs reveal strand-specific motifs
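One simple way to look at strand specificity, assuming it means that a motif is much more frequent on the transcribed strand than its reverse complement, is to count both in the same set of sequences. The sketch below does only that counting; the motif string and UTR fragments are made up for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A<->T, C<->G)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def strand_counts(motif, sequences):
    """Count a motif and its reverse complement on the given strand."""
    motif = motif.upper()
    rc = reverse_complement(motif)

    def occurrences(pattern, seq):
        # Counts overlapping matches by testing every start position.
        return sum(seq.startswith(pattern, i) for i in range(len(seq)))

    forward = sum(occurrences(motif, s.upper()) for s in sequences)
    reverse = sum(occurrences(rc, s.upper()) for s in sequences)
    return forward, reverse

# Hypothetical 3' UTR fragments; TGTAAATA is used only as an example string.
fwd, rev = strand_counts("TGTAAATA", ["CCTGTAAATAGG", "ATGTAAATAC", "GGTATTTACA"])
```

A motif acting at the DNA level would be expected to show similar counts in both orientations, whereas a strong forward bias is consistent with a signal read from the transcript itself.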

