Hidden Markov Models
Ron Shamir, CG'08 (rshamir/algmb/presentations/HMM-1stLec.pdf)
Page 1:

© Ron Shamir, CG’08 1

Hidden Markov Models

Page 2:

© Ron Shamir, CG’08 2

• Dr Richard Durbin is a graduate in mathematics from Cambridge University and one of the founding members of the Sanger Institute. He has also carried out research at the Laboratory of Molecular Biology in Cambridge and at Harvard and Stanford Universities in the USA. He is currently head of the informatics division at the Sanger Centre.

Main source: Durbin et al.,

“Biological Sequence Analysis”

(Cambridge, ‘98)

Page 3:

© Ron Shamir, CG’08 3

The occasionally dishonest casino

Observed rolls: 13652656643662612564

Fair die A:   P_A(1) = P_A(2) = … = P_A(6) = 1/6
Loaded die B: P_B(1) = … = P_B(5) = 0.1,  P_B(6) = 0.5
Switching between states A and B: P_{A→B} = P_{B→A} = 1/2

Can we tell when the loaded die is used?

Page 4:

© Ron Shamir, CG’08 4

Example - CpG islands

• CpG islands:
  – DNA stretches (100~1000 bp) with frequent CG pairs (contiguous on the same strand).
  – Rare overall, but appear in significant parts of the genome.

• Problem (1): Given a short genome sequence, decide if it comes from a CpG island.

Page 5:

Preliminaries: Markov Chains

(S, A, p)
• S: state set
• p: initial state prob. vector {p(x_1 = s)}
• A: transition prob. matrix, a_st = P(x_i = t | x_{i-1} = s)

Assumption: X = x_1…x_n is a random process with memory length 1, i.e. for all s_i ∈ S:
P(x_i = s_i | x_1 = s_1,…,x_{i-1} = s_{i-1}) = P(x_i = s_i | x_{i-1} = s_{i-1}) = a_{s_{i-1}, s_i}

• Sequence probability: P(X) = p(x_1) · ∏_{i=2…L} a_{x_{i-1}, x_i}

Note: one can avoid p by adding a ‘begin’ state 0 with transition probs a_{0,*}.
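As a concrete illustration of the sequence probability formula, here is a minimal Python sketch (the function name and dict-based containers are assumptions for illustration, not from the slides):

```python
import math

def markov_log_prob(seq, init_p, trans):
    """log P(seq) under a first-order Markov chain.

    init_p: dict symbol -> p(x_1 = s)
    trans:  dict (s, t) -> a_st = P(x_i = t | x_{i-1} = s)
    """
    # P(X) = p(x_1) * prod_{i=2..L} a_{x_{i-1}, x_i}, computed in log space
    logp = math.log(init_p[seq[0]])
    for prev, cur in zip(seq, seq[1:]):
        logp += math.log(trans[(prev, cur)])
    return logp
```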

Page 6:

© Ron Shamir, CG’08 7

Sequence probability

Transition matrix a⁻ (rows: previous symbol, columns: next symbol):

a⁻    A      C      G      T
A   0.300  0.205  0.285  0.210
C   0.322  0.298  0.078  0.302
G   0.248  0.246  0.298  0.208
T   0.177  0.239  0.292  0.292

P(X) = p(x_1) · ∏_{i=2…L} a_{x_{i-1}, x_i}

Page 7:

© Ron Shamir, CG’08 8

Markov model - Example

• Markov model over the alphabet {A, C, G, T}
• Adding “begin” (B) and “end” (E) states

[Diagram: fully connected states A, C, G, T, with begin state B and end state E]

Page 8:

© Ron Shamir, CG’08 9

Andrei Andreyevich Markov

• Born: 14 June 1856 in Ryazan, Russia

• Died: 20 July 1922 in Petrograd (now St Petersburg), Russia

• Seminal contributions to:
  – the central limit theorem
  – stochastic processes
  – random walks, …

http://www-groups.dcs.st-and.ac.uk/~history/

Page 9:

© Ron Shamir, CG’08 10

Markov Models

• a⁻: transition probs for non-CpG islands
• a⁺: transition probs for CpG islands

a⁺    A      C      G      T
A   0.180  0.274  0.425  0.120
C   0.171  0.368  0.274  0.188
G   0.161  0.339  0.375  0.125
T   0.079  0.355  0.384  0.182

a⁻    A      C      G      T
A   0.300  0.205  0.285  0.210
C   0.322  0.298  0.078  0.302
G   0.248  0.246  0.298  0.208
T   0.177  0.239  0.292  0.292

Page 10:

© Ron Shamir, CG’08 11

CpG islands: Fixed Window

• Problem (1): Given a short genome sequence X, decide if it comes from a CpG island.

• Solution: Model by a Markov chain. Let

– a⁺_st: transition prob. inside CpG islands,

– a⁻_st: transition prob. outside CpG islands.

Decide by the log-likelihood ratio score:

score(X) = log [ P(X | CpG island) / P(X | non-CpG island) ] = ∑_{i=1}^{n} log ( a⁺_{x_{i-1}, x_i} / a⁻_{x_{i-1}, x_i} )

and its length-normalized version in bits:

bits_score(X) = (1/n) ∑_{i=1}^{n} log₂ ( a⁺_{x_{i-1}, x_i} / a⁻_{x_{i-1}, x_i} )
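A sketch of this score in Python, using the a⁺ and a⁻ tables from the earlier slides (the function and variable names are assumptions for illustration):

```python
import math

SYMS = "ACGT"
A_PLUS = {   # CpG-island transition probs (cols A, C, G, T)
    "A": [0.180, 0.274, 0.425, 0.120],
    "C": [0.171, 0.368, 0.274, 0.188],
    "G": [0.161, 0.339, 0.375, 0.125],
    "T": [0.079, 0.355, 0.384, 0.182],
}
A_MINUS = {  # non-island transition probs
    "A": [0.300, 0.205, 0.285, 0.210],
    "C": [0.322, 0.298, 0.078, 0.302],
    "G": [0.248, 0.246, 0.298, 0.208],
    "T": [0.177, 0.239, 0.292, 0.292],
}

def bits_score(x):
    """Length-normalized log2 likelihood ratio; positive suggests a CpG island."""
    total = 0.0
    for prev, cur in zip(x, x[1:]):
        j = SYMS.index(cur)
        total += math.log2(A_PLUS[prev][j] / A_MINUS[prev][j])
    return total / len(x)

print(bits_score("CGCGCGCG"))   # clearly positive
print(bits_score("ATATATAT"))   # clearly negative
```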

Page 11:

© Ron Shamir, CG’08 12

Discrimination of sequences via Markov Chains

Durbin et al., Fig. 3.2

48 CpG islands, total length ~60K nt; a similar amount of non-CpG sequence.

Page 12:

© Ron Shamir, CG’08 13

CpG islands – the general case

• Problem(2): Detect CpG islands in a long DNA sequence.

• Naive Solution - Sliding windows: for 1 ≤ k ≤ L−l,

  – window: X_k = (x_{k+1},…,x_{k+l})

  – score: score(X_k)

  – positive score ⇒ potential CpG island (a sliding-window sketch follows below)
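A minimal sketch of this naive scan, reusing the bits_score function from the previous sketch (the window length l is exactly the parameter the method cannot resolve):

```python
def scan_windows(x, l=100):
    """Score every length-l window X_k = x[k:k+l];
    a positive score marks a potential CpG island."""
    return [(k, bits_score(x[k:k + l])) for k in range(len(x) - l + 1)]

# Example: report window start positions with positive score
# candidates = [k for k, s in scan_windows(x, l=100) if s > 0]
```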

Disadvantage: what is the length of the islands? How do we identify transitions?

Idea: Use Markov chains as before, with additional (hidden) states

Page 13:

© Ron Shamir, CG’08 14

Hidden Markov Model (HMM)

Path Π = π_1,…,π_n (sequence of states - a simple Markov chain)

Given sequence X = (x_1,…,x_L):
• a_kl = P(π_i = l | π_{i-1} = k)
• e_k(b) = P(x_i = b | π_i = k)

• Σ: alphabet of symbols. Example: {A, C, G, T}
• Q: finite set of states, capable of emitting symbols. Example: Q = {A+, C+, G+, T+, A-, C-, G-, T-}
• Θ = (A, E):
  A: transition probs a_kl, k, l ∈ Q
  E: emission probs e_k(b), k ∈ Q, b ∈ Σ

M = (Σ, Q, Θ)

Joint prob. of observed sequence X and path Π (convention: π_0 = begin, π_{L+1} = end):
P(X, Π) = a_{0,π_1} · ∏_{i=1…L} e_{π_i}(x_i) · a_{π_i, π_{i+1}}

Goal: find the path Π* maximizing P(X, Π)
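A sketch of the joint probability P(X, Π) in Python (the dict-based containers and the use of 0 for both begin and end states are assumptions made for brevity):

```python
import math

def joint_log_prob(x, path, a, e):
    """log P(X, path) = log [ a_{0,pi_1} * prod_i e_{pi_i}(x_i) * a_{pi_i, pi_{i+1}} ].

    a[(k, l)]: transition probs, with 0 as the begin/end state
    e[(k, b)]: emission probs
    """
    logp = math.log(a[(0, path[0])])
    for i, (k, b) in enumerate(zip(path, x)):
        nxt = path[i + 1] if i + 1 < len(path) else 0   # pi_{L+1} = end
        logp += math.log(e[(k, b)]) + math.log(a[(k, nxt)])
    return logp
```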

Page 14:

© Ron Shamir, CG’08 15

Viterbi’s Decoding Algorithm (finding most probable state path)

Want: a path Π maximizing P(X, Π).

v_k(i) = prob. of the most probable path emitting x_1…x_i and ending in state k.

Init: v_0(0) = 1; v_k(0) = 0 for k > 0
Step: v_l(i+1) = e_l(x_{i+1}) · max_k {v_k(i) · a_kl}
End: P(X, Π*) = max_k {v_k(L) · a_k0}

Time complexity: O(Ln²) for n states, m symbols, L steps.

Π* can be recovered using back pointers.
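The recurrence translates directly into code; below is a log-space sketch (to avoid numerical underflow on long sequences). It assumes every referenced a[(k, l)] and e[(k, b)] entry exists and is positive, with state 0 as the silent begin/end state:

```python
import math

def viterbi(x, states, a, e):
    """Return (log P(X, pi*), pi*), the most probable state path."""
    v = {0: 0.0}                                    # Init: v_0(0) = 1
    back = []                                       # back pointers, one dict per step
    for sym in x:
        nv, bp = {}, {}
        for l in states:
            # Step: v_l(i+1) = e_l(x_{i+1}) * max_k v_k(i) * a_kl
            k = max(v, key=lambda s: v[s] + math.log(a[(s, l)]))
            nv[l] = math.log(e[(l, sym)]) + v[k] + math.log(a[(k, l)])
            bp[l] = k
        v = nv
        back.append(bp)
    # End: P(X, pi*) = max_k v_k(L) * a_k0
    last = max(states, key=lambda s: v[s] + math.log(a[(s, 0)]))
    logp = v[last] + math.log(a[(last, 0)])
    path = [last]
    for bp in reversed(back[1:]):                   # trace the back pointers
        path.append(bp[path[-1]])
    return logp, path[::-1]
```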

Page 15:

© Ron Shamir, CG’08 16

The occasionally dishonest casino (2)

[Figure: the rolls 13652656643662612564 shown with the underlying die (A = fair, B = loaded) and the emission probabilities of each state.]

Page 16:

© Ron Shamir, CG ‘08 17

The occasionally dishonest casino (2)

Page 17:

© Ron Shamir, CG’08 18

HMM for CpG Islands

• States: A+ C+ G+ T+ A- C- G- T-
• Symbols: A C G T (each state emits its own letter)
• Path Π = π_1,…,π_n: sequence of states

Transition probs inside CpG islands (a⁺) and outside them (a⁻), rows: previous symbol, columns: next symbol:

a⁺    A      C      G      T
A   0.180  0.274  0.425  0.120
C   0.171  0.368  0.274  0.188
G   0.161  0.339  0.375  0.125
T   0.079  0.355  0.384  0.182

a⁻    A      C      G      T
A   0.300  0.205  0.285  0.210
C   0.322  0.298  0.078  0.302
G   0.248  0.246  0.298  0.208
T   0.177  0.239  0.292  0.292

http://www.cs.huji.ac.il/~cbio/handouts/class4.ppt

Page 18:

© Ron Shamir, CG’08 19

HMM for CpG Islands

[Diagram: eight states A+, C+, G+, T+ (island) and A-, C-, G-, T- (non-island), with transitions within and between the two groups.]

Page 19:

© Ron Shamir, CG’08 20

Posterior State Probabilities

Goal: calculate P(π_i = k | X)

Our strategy:
• P(X, π_i = k) = P(x_1,…,x_i, π_i = k) · P(x_{i+1},…,x_L | x_1,…,x_i, π_i = k)
               = P(x_1,…,x_i, π_i = k) · P(x_{i+1},…,x_L | π_i = k)
• P(π_i = k | X) = P(π_i = k, X) / P(X)

Need to compute these two terms - and P(X).

Page 20:

© Ron Shamir, CG’08 21

Forward Algorithm

Goal: calculate P(X) = ∑_Π P(X, Π)

Approximation: take the max path Π* from the Viterbi algorithm. Not justified when several near-maximal paths exist.

Exact algorithm (the “Forward Algorithm”): f_k(i) = P(x_1,…,x_i, π_i = k)
• Init: f_0(0) = 1; f_k(0) = 0 for k > 0
• Step: f_j(i+1) = e_j(x_{i+1}) · ∑_k f_k(i) · a_kj
• End: P(X) = ∑_k f_k(L) · a_k0
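A log-space sketch of the forward algorithm under the same assumptions as the Viterbi sketch (all referenced a and e entries present and positive); logsumexp keeps the sums numerically stable:

```python
import math

def logsumexp(xs):
    """log(sum(exp(v) for v in xs)), computed stably."""
    m = max(xs)
    return m + math.log(sum(math.exp(v - m) for v in xs))

def forward(x, states, a, e):
    """Return (log P(X), list of log f tables, one per position)."""
    f = {0: 0.0}                                    # Init: f_0(0) = 1
    tables = []
    for sym in x:
        # Step: f_j(i+1) = e_j(x_{i+1}) * sum_k f_k(i) * a_kj
        f = {j: math.log(e[(j, sym)]) +
                logsumexp([f[k] + math.log(a[(k, j)]) for k in f])
             for j in states}
        tables.append(f)
    # End: P(X) = sum_k f_k(L) * a_k0
    return logsumexp([f[k] + math.log(a[(k, 0)]) for k in states]), tables
```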

Page 21:

© Ron Shamir, CG’08 22

Backward Algorithm

• b_k(i) = P(x_{i+1},…,x_L | π_i = k)

• Init: b_k(L) = a_k0 for all k

• Step: b_k(i) = ∑_l a_kl · e_l(x_{i+1}) · b_l(i+1)

• End: P(X) = ∑_k a_0k · e_k(x_1) · b_k(1)
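The mirror-image sketch for the backward algorithm, reusing logsumexp from the forward sketch (same assumptions on a and e):

```python
import math

def backward(x, states, a, e):
    """Return (log P(X), list of log b tables; tables[i-1] holds b_.(i))."""
    b = {k: math.log(a[(k, 0)]) for k in states}    # Init: b_k(L) = a_k0
    tables = [b]
    for sym in reversed(x[1:]):                     # sym = x_{i+1} for i = L-1 .. 1
        # Step: b_k(i) = sum_l a_kl * e_l(x_{i+1}) * b_l(i+1)
        b = {k: logsumexp([math.log(a[(k, l)]) + math.log(e[(l, sym)]) + b[l]
                           for l in states])
             for k in states}
        tables.append(b)
    tables.reverse()
    # End: P(X) = sum_k a_0k * e_k(x_1) * b_k(1)
    logp = logsumexp([math.log(a[(0, k)]) + math.log(e[(k, x[0])]) + tables[0][k]
                      for k in states])
    return logp, tables
```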

Page 22:

© Ron Shamir, CG’08 23

Posterior State Probabilities (2)

Goal: calculate P(π_i = k | X)

• Recall:
  – f_k(i) = P(x_1,…,x_i, π_i = k)
  – b_k(i) = P(x_{i+1},…,x_L | π_i = k)
  – Each can be used to compute P(X)

• P(X, π_i = k) = P(x_1,…,x_i, π_i = k) · P(x_{i+1},…,x_L | π_i = k) = f_k(i) · b_k(i)
• P(π_i = k | X) = P(π_i = k, X) / P(X)

Page 23:

© Ron Shamir, CG’08 24

Durbin et al., p. 60

Dishonest Casino (3)

Page 24:

© Ron Shamir, CG’08 25

Posterior Decoding

• Now we have P(π_i = k | X). How do we decode?

1. π̂_i = argmax_k P(π_i = k | X)
   – Good when we are interested in the state at a particular point.
   – The path of states π̂_1,…,π̂_L may not be legal.

2. Define a function of interest g(k) on the states and compute G(i|X) = ∑_k P(π_i = k | X) · g(k).
   E.g., for CpG islands take S = {A+, C+, G+, T+} and g(k) = 1 for states in S, 0 on the rest: G(i|X) is then the posterior prob. of symbol i coming from S.
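Putting the pieces together: a sketch of posterior decoding on top of the forward and backward sketches above (g is the user-supplied function of interest, e.g. the indicator of S = {A+, C+, G+, T+}):

```python
import math

def posterior(x, states, a, e):
    """P(pi_i = k | X) = f_k(i) * b_k(i) / P(X), one dict per position i."""
    logp, f_tabs = forward(x, states, a, e)
    _, b_tabs = backward(x, states, a, e)
    return [{k: math.exp(f_tabs[i][k] + b_tabs[i][k] - logp) for k in states}
            for i in range(len(x))]

def G(post, g):
    """G(i|X) = sum_k P(pi_i = k | X) * g(k) for every position i."""
    return [sum(p[k] * g(k) for k in p) for p in post]

# With g(k) = 1 for k in {"A+", "C+", "G+", "T+"} and 0 otherwise, G(post, g)
# gives the posterior probability that each symbol comes from a CpG island.
```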

Page 25:

© Ron Shamir, CG’08 26

Andrew Viterbi

• Dr. Andrew J. Viterbi is a pioneer in the field of wireless communications. He received his Bachelor's and Master's degrees from MIT, and his Ph.D. in digital communications from the University of Southern California (USC). Immediately after obtaining his Ph.D. he taught at UCLA and consulted for the Jet Propulsion Laboratory (JPL). He co-founded Linkabit, a small military contractor, in 1968, and co-founded QualComm with Irwin Jacobs in 1985. He created the Viterbi Algorithm for interference suppression and efficient decoding of digital transmission sequences, used by all four international standards for digital cellular telephony. QualComm is the recognized pioneer of Code Division Multiple Access (CDMA) digital wireless technology, which allows many users to share the same radio frequencies and thereby increases system capacity many times over that of analog systems. He is a Life Fellow of the IEEE, and was inducted into the National Academy of Engineering in 1978 and the National Academy of Sciences in 1996.

http://www.ieee.org/organizations/history_center/comsoc/viterbi.html

