
Lecture 13: Hidden Markov Model - Shuai Li

Page 1: Lecture 13: Hidden Markov Model - Shuai Li

Lecture 13: Hidden Markov Model

Shuai Li

John Hopcroft Center, Shanghai Jiao Tong University

https://shuaili8.github.io

https://shuaili8.github.io/Teaching/VE445/index.html

1

Page 2: Lecture 13: Hidden Markov Model - Shuai Li

A Markov system

• There are 𝑁 states 𝑆1, 𝑆2, … , 𝑆𝑁, and the time steps are discrete, 𝑡 = 0, 1, 2, …

• On the t-th time step the system is in exactly one of the available states. Call it 𝑞𝑡

• At each time step, the next state is chosen based only on the information provided by the current state 𝑞𝑡

• The current state determines the probability distribution for the next state

2
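
To make this concrete, here is a minimal Python sketch of such a system; the 3-state transition matrix and the start in S3 below are illustrative assumptions, not values from the slides.

    import numpy as np

    # Rows are current states, columns are next states; each row sums to 1.
    A = np.array([[0.7, 0.2, 0.1],
                  [0.3, 0.5, 0.2],
                  [0.2, 0.3, 0.5]])

    rng = np.random.default_rng(0)

    def step(q):
        # The next state is sampled using only the current state's row (Markov property).
        return rng.choice(3, p=A[q])

    q = 2                      # start in S3 (states are 0-indexed here)
    path = [q]
    for _ in range(10):
        q = step(q)
        path.append(q)
    print(path)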

Page 3: Lecture 13: Hidden Markov Model - Shuai Li

Example

• Three states

• Current state: 𝑆3

3

Page 4: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

• Three states

• Current state: 𝑆2

4

Page 5: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

• Three states

• The transition matrix

5

Page 6: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

Markovian property

• 𝑞𝑡+1 is independent of 𝑞𝑡−1, 𝑞𝑡−2, … , 𝑞0 given 𝑞𝑡

• In other words: 𝑃(𝑞𝑡+1 = 𝑆𝑗 | 𝑞𝑡 = 𝑆𝑖, 𝑞𝑡−1, … , 𝑞0) = 𝑃(𝑞𝑡+1 = 𝑆𝑗 | 𝑞𝑡 = 𝑆𝑖)

6

Page 7: Lecture 13: Hidden Markov Model - Shuai Li

Example 2

7

Page 8: Lecture 13: Hidden Markov Model - Shuai Li

Markovian property

8

Page 9: Lecture 13: Hidden Markov Model - Shuai Li

Example

• A human and a robot wander around randomly on a grid

9

Note: 𝑁 (number of states) = 18 × 18 = 324

Page 10: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

• At each time step, the human and the robot each move randomly to an adjacent cell

• Typical questions:
• “What’s the expected time until the human is crushed like a bug?”

• “What’s the probability that the robot will hit the left wall before it hits the human?”

• “What’s the probability the robot crushes the human on the next time step?” (a Monte Carlo sketch of this last question follows below)

10
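
For intuition, that last question can be estimated with a small Monte Carlo sketch in Python; the grid size, the start cells, and the rule "crushed = both land on the same cell" are illustrative assumptions, not from the slides.

    import random

    SIZE = 18   # assumed grid width/height

    def neighbors(cell):
        x, y = cell
        cand = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        return [c for c in cand if 0 <= c[0] < SIZE and 0 <= c[1] < SIZE]

    def crush_next_step_prob(human, robot, trials=100_000):
        # Estimate P(both land on the same cell at t+1 | current positions).
        hits = 0
        for _ in range(trials):
            h = random.choice(neighbors(human))
            r = random.choice(neighbors(robot))
            hits += (h == r)
        return hits / trials

    print(crush_next_step_prob(human=(3, 3), robot=(3, 5)))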

Page 11: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

• The current time is 𝑡, and the human remains uncrushed. What’s the probability of a crush occurring at time 𝑡 + 1?

• If robot is blind:
• We can compute this in advance

• If robot is omniscient (i.e., if robot knows the current state):
• We can compute this directly

• If robot has some sensors, but incomplete state information:
• Hidden Markov Models are applicable

11

Page 12: Lecture 13: Hidden Markov Model - Shuai Li

𝑃(𝑞𝑡 = 𝑠) -- A clumsy solution

• Step 1: Work out how to compute 𝑃(𝑄) for any path 𝑄 = 𝑞1𝑞2⋯𝑞𝑡

• Step 2: Use this knowledge to get 𝑃(𝑞𝑡 = 𝑠)

12

Page 13: Lecture 13: Hidden Markov Model - Shuai Li

𝑃(𝑞𝑡 = 𝑠) -- A cleverer solution

• For each state 𝑆𝑖, define 𝑝𝑡(𝑖) = 𝑃(𝑞𝑡 = 𝑆𝑖) to be the probability of being in state 𝑆𝑖 at time 𝑡

• Easy to do inductive computation: 𝑝𝑡+1(𝑗) = Σ𝑖 𝑝𝑡(𝑖) 𝑎𝑖𝑗

13

𝑎𝑖𝑗 = 𝑃(𝑞𝑡+1 = 𝑆𝑗 | 𝑞𝑡 = 𝑆𝑖)

Page 14: Lecture 13: Hidden Markov Model - Shuai Li

𝑃(𝑞𝑡 = 𝑠) -- A cleverer solution

• For each state 𝑆𝑖, define 𝑝𝑡(𝑖) = 𝑃(𝑞𝑡 = 𝑆𝑖) to be the probability of being in state 𝑆𝑖 at time 𝑡

• Easy to do inductive computation: 𝑝𝑡+1(𝑗) = Σ𝑖 𝑝𝑡(𝑖) 𝑎𝑖𝑗 (see the code sketch below)

14
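
A minimal Python sketch of the inductive computation above, reusing the illustrative 3-state transition matrix from the earlier sketch and assuming the system starts in S3.

    import numpy as np

    # Same illustrative transition matrix as in the earlier sketch (not from the slides).
    A = np.array([[0.7, 0.2, 0.1],
                  [0.3, 0.5, 0.2],
                  [0.2, 0.3, 0.5]])

    p = np.array([0.0, 0.0, 1.0])   # assumed start in S3, so p_0 = (0, 0, 1)

    # One O(N^2) update per time step: p_{t+1}(j) = sum_i p_t(i) * a_ij
    for t in range(5):
        p = p @ A
        print(t + 1, p)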

Page 15: Lecture 13: Hidden Markov Model - Shuai Li

Complexity comparison

• Cost of computing 𝑝𝑡(𝑖) for all states 𝑆𝑖 is now 𝑂(𝑡𝑁²)

• Why?

• The first method costs 𝑂(𝑁ᵗ)

• Why?

• This is the power of dynamic programming, which is widely used for HMMs

15

Page 16: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

• It’s currently time 𝑡, and the human remains uncrushed. What’s the probability of a crush occurring at time 𝑡 + 1?

• If robot is blind:
• We can compute this in advance

• If robot is omniscient (i.e., if robot knows the state at time 𝑡):
• We can compute this directly

• If robot has some sensors, but incomplete state information:
• Hidden Markov Models are applicable

16

Page 17: Lecture 13: Hidden Markov Model - Shuai Li

Hidden state

• The previous example tries to estimate 𝑃(𝑞𝑡 = 𝑆𝑖) unconditionally (no other information)

• Suppose we can observe something that’s affected by the true state

17

What the robot sees (uncorrupted data)

What the robot sees (corrupted data)

Page 18: Lecture 13: Hidden Markov Model - Shuai Li

Noisy observation of hidden state

• Let’s denote the observation at time 𝑡 by 𝑂𝑡

• 𝑂𝑡 is noisily determined by the current state

• Assume that 𝑂𝑡 is conditionally independent of 𝑞𝑡−1, 𝑞𝑡−2, … , 𝑞0, 𝑂𝑡−1, 𝑂𝑡−2, … , 𝑂1, 𝑂0 given 𝑞𝑡

• In other words: 𝑃(𝑂𝑡 = 𝑋 | 𝑞𝑡 = 𝑆𝑖, 𝑞𝑡−1, … , 𝑞0, 𝑂𝑡−1, … , 𝑂1) = 𝑃(𝑂𝑡 = 𝑋 | 𝑞𝑡 = 𝑆𝑖)

18

Page 19: Lecture 13: Hidden Markov Model - Shuai Li

Example

19

Page 20: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

20

Page 21: Lecture 13: Hidden Markov Model - Shuai Li

Hidden Markov models

• The robot with noisy sensors is a good example

• Question 1: (Evaluation) State estimation:
• What is 𝑃(𝑞𝑡 = 𝑆𝑖 | 𝑂1, … , 𝑂𝑡)?

• Question 2: (Inference) Most probable path:
• Given 𝑂1, … , 𝑂𝑡, what is the most probable path of states? And what is the probability?

• Question 3: (Learning) Learning HMMs:
• Given 𝑂1, … , 𝑂𝑡, what is the maximum likelihood HMM that could have produced this string of observations?
• MLE

21

Page 22: Lecture 13: Hidden Markov Model - Shuai Li

Application of HMM

• Robot planning + sensing when there’s uncertainty

• Speech recognition/understanding
• Phones → words, signal → phones

• Human genome project

• Consumer decision modeling

• Economics and finance

• …

22

Page 23: Lecture 13: Hidden Markov Model - Shuai Li

Basic operations in HMMs

• For an observation sequence 𝑂 = 𝑂1, … , 𝑂𝑇, three basic HMM operations are: evaluating 𝑃(𝑂) (forward algorithm), finding the most probable path of states (Viterbi algorithm), and learning the HMM parameters (EM)

23

T = # timesteps, N = # states

Page 24: Lecture 13: Hidden Markov Model - Shuai Li

Formal definition of HMM

• The states are labeled 𝑆1, 𝑆2, … , 𝑆𝑁
• For a particular trial, let

• 𝑇 be the number of observations

• 𝑁 be the number of states

• 𝑀 be the number of possible observations

• 𝜋1, 𝜋2, … , 𝜋𝑁 are the starting state probabilities

• 𝑂 = 𝑂1…𝑂𝑇 is a sequence of observations

• 𝑄 = 𝑞1𝑞2⋯𝑞𝑡 is a path of states

• Then 𝜆 = ⟨𝑁, 𝑀, {𝜋𝑖}, {𝑎𝑖𝑗}, {𝑏𝑖(𝑗)}⟩ is the specification of an HMM
➢ The definitions of 𝑎𝑖𝑗 and 𝑏𝑖(𝑗) are introduced on the next page

24

Page 25: Lecture 13: Hidden Markov Model - Shuai Li

Formal definition of HMM (cont.)

• The definition of 𝑎𝑖𝑗 and 𝑏𝑖(𝑗):
• 𝑎𝑖𝑗 = 𝑃(𝑞𝑡+1 = 𝑆𝑗 | 𝑞𝑡 = 𝑆𝑖), the probability of moving from state 𝑆𝑖 to state 𝑆𝑗
• 𝑏𝑖(𝑗) = 𝑃(𝑂𝑡 = 𝑗-th possible observation | 𝑞𝑡 = 𝑆𝑖), the probability of emitting the 𝑗-th possible observation while in state 𝑆𝑖

25
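
A sketch of how the specification 𝜆 might be packaged in code; the field names and the concrete numbers below are my own illustrative choices, not from the slides.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class HMM:
        pi: np.ndarray   # shape (N,):   pi[i]   = starting probability of state S_{i+1}
        A:  np.ndarray   # shape (N, N): A[i, j] = a_ij = P(q_{t+1} = S_j | q_t = S_i)
        B:  np.ndarray   # shape (N, M): B[i, k] = b_i(k) = P(O_t = k-th symbol | q_t = S_i)

    # Illustrative numbers only (the slide's actual values are in its figure):
    model = HMM(pi=np.array([0.5, 0.5, 0.0]),
                A=np.array([[0.0, 0.5, 0.5],
                            [0.5, 0.0, 0.5],
                            [0.5, 0.5, 0.0]]),
                B=np.array([[0.5, 0.5, 0.0],
                            [0.0, 0.5, 0.5],
                            [0.5, 0.0, 0.5]]))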

Page 26: Lecture 13: Hidden Markov Model - Shuai Li

Example

• Start randomly in state 1 or 2

• Choose one of the output symbols in each state at random

26
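
A sketch of generating an observation sequence from an HMM like this one; since the slide's actual probabilities are in its figure, the 𝜋, 𝐴, 𝐵 values below are illustrative placeholders.

    import numpy as np

    rng = np.random.default_rng(1)

    pi = np.array([0.5, 0.5, 0.0])                 # start randomly in state 1 or 2
    A  = np.array([[0.0, 0.5, 0.5],                # illustrative transition probabilities
                   [0.5, 0.0, 0.5],
                   [0.5, 0.5, 0.0]])
    B  = np.array([[0.5, 0.5, 0.0],                # rows: states; columns: symbols X, Y, Z
                   [0.0, 0.5, 0.5],
                   [0.5, 0.0, 0.5]])
    symbols = "XYZ"

    def sample(T):
        q = rng.choice(3, p=pi)                    # draw the starting state from pi
        out = []
        for _ in range(T):
            out.append(symbols[rng.choice(3, p=B[q])])   # emit a symbol from the current state
            q = rng.choice(3, p=A[q])                    # then transition to the next state
        return "".join(out)

    print(sample(3))    # e.g. "XXZ"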

Page 27: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

27

• Start randomly in state 1 or 2

• Choose one of the output symbols in each state at random.

• Let’s generate a sequence of observations:

Page 28: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

28

• Start randomly in state 1 or 2

• Choose one of the output symbols in each state at random.

• Let’s generate a sequence of observations:

Page 29: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

29

• Start randomly in state 1 or 2

• Choose one of the output symbols in each state at random.

• Let’s generate a sequence of observations:

Page 30: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

30

• Start randomly in state 1 or 2

• Choose one of the output symbols in each state at random.

• Let’s generate a sequence of observations:

Page 31: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

31

• Start randomly in state 1 or 2

• Choose one of the output symbols in each state at random.

• Let’s generate a sequence of observations:

Page 32: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

32

• Start randomly in state 1 or 2

• Choose one of the output symbols in each state at random.

• Let’s generate a sequence of observations:

Page 33: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

33

• Start randomly in state 1 or 2

• Choose one of the output symbols in each state at random.

• Let’s generate a sequence of observations:

Page 34: Lecture 13: Hidden Markov Model - Shuai Li

Example (cont.)

34

• Start randomly in state 1 or 2

• Choose one of the output symbols in each state at random.

Page 35: Lecture 13: Hidden Markov Model - Shuai Li

Probability of a series of observations

• What is 𝑃(𝑂) = 𝑃(𝑂1𝑂2𝑂3) = 𝑃(𝑂1 = 𝑋 ∧ 𝑂2 = 𝑋 ∧ 𝑂3 = 𝑍)?

• Slow, stupid way: sum over every possible path 𝑄 of length 3, 𝑃(𝑂) = Σ𝑄 𝑃(𝑂 ∧ 𝑄) = Σ𝑄 𝑃(𝑂|𝑄)𝑃(𝑄) (a brute-force sketch follows below)

• How do we compute 𝑃(𝑄) for an arbitrary path 𝑄?

• How do we compute 𝑃(𝑂|𝑄) for an arbitrary path 𝑄?

35
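
A sketch of the slow, stupid way: enumerate every length-3 path 𝑄 and sum 𝑃(𝑄)·𝑃(𝑂|𝑄); the 𝜋, 𝐴, 𝐵 numbers are the same illustrative placeholders used in the sampling sketch earlier.

    from itertools import product
    import numpy as np

    pi = np.array([0.5, 0.5, 0.0])
    A  = np.array([[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])
    B  = np.array([[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]])
    obs = [0, 0, 2]                                     # X X Z encoded as symbol indices

    total = 0.0
    for Q in product(range(3), repeat=len(obs)):        # all 3^3 = 27 paths
        p_Q = pi[Q[0]] * np.prod([A[Q[t], Q[t + 1]] for t in range(len(obs) - 1)])
        p_O_given_Q = np.prod([B[Q[t], obs[t]] for t in range(len(obs))])
        total += p_Q * p_O_given_Q
    print("P(O) =", total)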

Page 36: Lecture 13: Hidden Markov Model - Shuai Li

Probability of a series of observations (cont.)

• 𝑃(𝑄) for an arbitrary path 𝑄: by the Markov property, 𝑃(𝑄) = 𝑃(𝑞1) 𝑃(𝑞2|𝑞1) ⋯ 𝑃(𝑞𝑇|𝑞𝑇−1) = 𝜋𝑞1 𝑎𝑞1𝑞2 ⋯ 𝑎𝑞𝑇−1𝑞𝑇

36

Page 37: Lecture 13: Hidden Markov Model - Shuai Li

Probability of a series of observations (cont.)

• 𝑃(𝑂|𝑄) for an arbitrary path 𝑄: since each 𝑂𝑡 depends only on 𝑞𝑡, 𝑃(𝑂|𝑄) = 𝑃(𝑂1|𝑞1) 𝑃(𝑂2|𝑞2) ⋯ 𝑃(𝑂𝑇|𝑞𝑇) = 𝑏𝑞1(𝑂1) 𝑏𝑞2(𝑂2) ⋯ 𝑏𝑞𝑇(𝑂𝑇)

37

Page 38: Lecture 13: Hidden Markov Model - Shuai Li

Probability of a series of observations (cont.)

• Computational complexity of the slow, stupid answer:
• 𝑃(𝑂) would require computing 27 𝑃(𝑄) terms and 27 𝑃(𝑂|𝑄) terms

• A sequence of 20 observations would need 3²⁰ ≈ 3.5 billion 𝑃(𝑄) terms and 3.5 billion 𝑃(𝑂|𝑄) terms

• So we have to find a smarter answer

38

Page 39: Lecture 13: Hidden Markov Model - Shuai Li

Probability of a series of observations (cont.)

• Smart answer (based on dynamic programming)

• Given observations 𝑂1𝑂2…𝑂𝑇
• Define: 𝛼𝑡(𝑖) = 𝑃(𝑂1𝑂2⋯𝑂𝑡 ∧ 𝑞𝑡 = 𝑆𝑖 | 𝜆), the probability of seeing the first 𝑡 observations and being in state 𝑆𝑖 at time 𝑡

• In the example, what is 𝛼2(3)?

39

Page 40: Lecture 13: Hidden Markov Model - Shuai Li

𝛼𝑡(𝑖): easy to define recursively
• 𝛼1(𝑖) = 𝜋𝑖 𝑏𝑖(𝑂1)
• 𝛼𝑡+1(𝑗) = 𝑏𝑗(𝑂𝑡+1) Σ𝑖 𝛼𝑡(𝑖) 𝑎𝑖𝑗

40
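
A minimal Python sketch of the forward recursion above, using the same illustrative 𝜋, 𝐴, 𝐵 placeholders as before.

    import numpy as np

    pi = np.array([0.5, 0.5, 0.0])
    A  = np.array([[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])
    B  = np.array([[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]])
    obs = [0, 0, 2]                                 # X X Z encoded as symbol indices

    # alpha[t, i] is the forward probability for the (t+1)-th observation and state S_{i+1}
    # (arrays are 0-indexed): P(O_1 .. O_{t+1} and q_{t+1} = S_{i+1}).
    alpha = np.zeros((len(obs), 3))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, len(obs)):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]

    print(alpha)
    print("P(O) =", alpha[-1].sum())                # should match the brute-force sum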

Page 41: Lecture 13: Hidden Markov Model - Shuai Li

𝛼𝑡(𝑖) in the example

• We see 𝑂1𝑂2𝑂3 = 𝑋𝑋𝑍

41

Page 42: Lecture 13: Hidden Markov Model - Shuai Li

Easy question

• We can cheaply compute

• (How) can we cheaply compute

• (How) can we cheaply compute

42

Page 43: Lecture 13: Hidden Markov Model - Shuai Li

Easy question (cont.)

• We can cheaply compute

• (How) can we cheaply compute

• (How) can we cheaply compute

43
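
A small helper sketch showing how both quantities fall out of the 𝛼 table computed by the forward sketch above; `alpha` here is assumed to be that array.

    import numpy as np

    def filtering(alpha):
        # alpha[t, i] = P(O_1..O_t and q_t = S_i), e.g. the array from the forward sketch above.
        evidence = alpha.sum(axis=1)             # P(O_1..O_t) for each prefix length t
        posterior = alpha / evidence[:, None]    # P(q_t = S_i | O_1..O_t)
        return evidence, posterior

    # usage: evidence, posterior = filtering(alpha)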

Page 44: Lecture 13: Hidden Markov Model - Shuai Li

Recall: Hidden Markov models

• The robot with noisy sensors is a good example

• Question 1: (Evaluation) State estimation:
• What is 𝑃(𝑞𝑡 = 𝑆𝑖 | 𝑂1, … , 𝑂𝑡)?

• Question 2: (Inference) Most probable path:
• Given 𝑂1, … , 𝑂𝑡, what is the most probable path of states? And what is the probability?

• Question 3: (Learning) Learning HMMs:
• Given 𝑂1, … , 𝑂𝑡, what is the maximum likelihood HMM that could have produced this string of observations?
• MLE

44

Page 45: Lecture 13: Hidden Markov Model - Shuai Li

Most probable path (MPP) given observations

45

Page 46: Lecture 13: Hidden Markov Model - Shuai Li

Efficient MPP computation

• We’re going to compute the following variable:

• 𝛿𝑡(𝑖) = max over 𝑞1, … , 𝑞𝑡−1 of 𝑃(𝑞1⋯𝑞𝑡−1 ∧ 𝑞𝑡 = 𝑆𝑖 ∧ 𝑂1⋯𝑂𝑡)

• It's the probability of the length-(𝑡 − 1) path with the maximum chance of occurring, ending up in state 𝑆𝑖, and producing output 𝑂1⋯𝑂𝑡

• Define: mpp𝑡(𝑖) = that path

• So: 𝛿𝑡(𝑖) = Prob(mpp𝑡(𝑖))

46

Page 47: Lecture 13: Hidden Markov Model - Shuai Li

The Viterbi algorithm

47

Page 48: Lecture 13: Hidden Markov Model - Shuai Li

The Viterbi algorithm (cont.)

48

Page 49: Lecture 13: Hidden Markov Model - Shuai Li

The Viterbi algorithm (cont.)

49

Page 50: Lecture 13: Hidden Markov Model - Shuai Li

The Viterbi algorithm (cont.)

• Summary:
• 𝛿1(𝑖) = 𝜋𝑖 𝑏𝑖(𝑂1)
• 𝛿𝑡+1(𝑗) = 𝑏𝑗(𝑂𝑡+1) max𝑖 𝛿𝑡(𝑖) 𝑎𝑖𝑗
• Record the argmax at each step and backtrack from max𝑖 𝛿𝑇(𝑖) to recover the most probable path

50
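
A minimal Viterbi sketch consistent with the 𝛿 recursion summarized above, using the same illustrative 𝜋, 𝐴, 𝐵 placeholders; ties in the argmax are broken arbitrarily.

    import numpy as np

    pi = np.array([0.5, 0.5, 0.0])
    A  = np.array([[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])
    B  = np.array([[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]])
    obs = [0, 0, 2]                                   # X X Z

    delta = pi * B[:, obs[0]]                         # delta_1(i)
    back = []
    for t in range(1, len(obs)):
        scores = delta[:, None] * A                   # scores[i, j] = delta_t(i) * a_ij
        back.append(scores.argmax(axis=0))            # best predecessor i for each next state j
        delta = scores.max(axis=0) * B[:, obs[t]]     # delta_{t+1}(j)

    # Backtrack the recorded argmax choices to recover the most probable path.
    path = [int(delta.argmax())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    path.reverse()
    print("MPP:", [f"S{i + 1}" for i in path], "prob:", delta.max())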

Page 51: Lecture 13: Hidden Markov Model - Shuai Li

Recall: Hidden Markov models

• The robot with noisy sensors is a good example

• Question 1: (Evaluation) State estimation:
• What is 𝑃(𝑞𝑡 = 𝑆𝑖 | 𝑂1, … , 𝑂𝑡)?

• Question 2: (Inference) Most probable path:
• Given 𝑂1, … , 𝑂𝑡, what is the most probable path of states? And what is the probability?

• Question 3: (Learning) Learning HMMs:
• Given 𝑂1, … , 𝑂𝑡, what is the maximum likelihood HMM that could have produced this string of observations?
• MLE

51

Page 52: Lecture 13: Hidden Markov Model - Shuai Li

Inferring an HMM

• Remember, we’ve been doing things like computing 𝑃(𝑂1⋯𝑂𝑡 | 𝜆) and 𝑃(𝑞𝑡 = 𝑆𝑖 | 𝑂1⋯𝑂𝑡, 𝜆)

• That “𝜆” is the notation for our HMM parameters

• Now we want to estimate 𝜆 from the observations

• As usual, we could use maximum likelihood estimation: choose 𝜆 to maximize 𝑃(𝑂1⋯𝑂𝑇 | 𝜆)

52

Page 53: Lecture 13: Hidden Markov Model - Shuai Li

Max likelihood HMM estimation

• Define:

53

Page 54: Lecture 13: Hidden Markov Model - Shuai Li

Max likelihood HMM estimation

54

Page 55: Lecture 13: Hidden Markov Model - Shuai Li

Max likelihood HMM estimation

55

Page 56: Lecture 13: Hidden Markov Model - Shuai Li

Max likelihood HMM estimation

56

Page 57: Lecture 13: Hidden Markov Model - Shuai Li

EM for HMMs

• If we knew 𝜆, we could estimate expectations of quantities such as:
• Expected number of times in state 𝑖

• Expected number of transitions 𝑖 → 𝑗

• If we knew quantities such as:
• Expected number of times in state 𝑖

• Expected number of transitions 𝑖 → 𝑗

• We could compute the maximum likelihood estimate of 𝜆 = ⟨{𝜋𝑖}, {𝑎𝑖𝑗}, {𝑏𝑖(𝑗)}⟩ (a sketch of this re-estimation step follows below)

57
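
A sketch of the re-estimation step the slide describes: turning expected counts into maximum likelihood estimates of 𝐴 and 𝐵. The counts below are made-up illustrative numbers; in a real E-step they would be computed from 𝛼 and the analogous backward variable.

    import numpy as np

    # Illustrative expected counts from an E-step (not real data):
    exp_transitions = np.array([[2.0, 5.0, 3.0],    # expected # of i -> j transitions
                                [4.0, 1.0, 5.0],
                                [3.0, 6.0, 1.0]])
    exp_emissions = np.array([[6.0, 3.0, 1.0],      # expected # of times symbol k is emitted in state i
                              [2.0, 5.0, 3.0],
                              [1.0, 4.0, 5.0]])

    # MLE re-estimates: a_ij = (expected i -> j transitions) / (expected transitions out of i),
    # b_i(k) = (expected emissions of symbol k in state i) / (expected emissions in state i),
    # i.e. normalize each count matrix row by row.
    A_new = exp_transitions / exp_transitions.sum(axis=1, keepdims=True)
    B_new = exp_emissions / exp_emissions.sum(axis=1, keepdims=True)
    print(A_new)
    print(B_new)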

Page 58: Lecture 13: Hidden Markov Model - Shuai Li

EM for HMMs

58

Page 59: Lecture 13: Hidden Markov Model - Shuai Li

EM for HMMs

• Bad news:
• There are lots of local minima

• Good news:
• The local minima are usually adequate models of the data

• Notice:
• EM does not estimate the number of states; that must be given

• Often, HMMs are forced to have some links with zero probability. This is done by setting 𝑎𝑖𝑗 = 0 in the initial estimate 𝜆(0)

• Easy extension of everything seen today:
• HMMs with real-valued outputs

59

