Hidden Markov Models and Sequential Data
Page 1: Hidden Markov Models and Sequential Data

Page 2: Sequential Data

• Often arise through measurement of time series
  • Snowfall measurements on successive days in Buffalo
  • Rainfall measurements in Cherrapunji
  • Daily values of a currency exchange rate
  • Acoustic features at successive time frames in speech recognition
• Non-time series
  • Nucleotide base pairs in a strand of DNA
  • Sequence of characters in an English sentence
  • Parts of speech of successive words

Page 3: Sound Spectrogram of Spoken Words

• The spoken words "Bayes Theorem"
• Plot of the intensity of the spectral coefficients versus time index
• Successive observations of the speech spectrum are highly correlated

Page 4: Task of Making a Sequence of Decisions

• Processes in time: states at time t are influenced by a state at time t-1
• In many time series applications, e.g. financial forecasting, we wish to predict the next value from previous values
• Impractical to consider general dependence of the future on all previous observations
  • Complexity would grow without limit as the number of observations increases
• Markov models assume dependence on only the most recent observations

Page 5: Model Assuming Independence

• Simplest model:
  • Treat observations as independent
  • Graph without links

Page 6: Markov Model

• Most general Markov model for observations {x_n}
• Product rule expresses the joint distribution of a sequence of observations:

  p(x_1, …, x_N) = ∏_{n=1}^{N} p(x_n | x_1, …, x_{n-1})

Page 7: First Order Markov Model

• Chain of observations {x_n}
• Distribution p(x_n | x_{n-1}) is conditioned on the previous observation
• Joint distribution for a sequence of N variables:

  p(x_1, …, x_N) = p(x_1) ∏_{n=2}^{N} p(x_n | x_{n-1})

• It can be verified (using the product rule from above) that

  p(x_n | x_1, …, x_{n-1}) = p(x_n | x_{n-1})

• If the model is used to predict the next observation, the distribution of the prediction depends only on the preceding observation and is independent of earlier observations

Page 8: Second Order Markov Model

• Conditional distribution of observation x_n depends on the values of the two previous observations x_{n-1} and x_{n-2}

  p(x_1, …, x_N) = p(x_1) p(x_2 | x_1) ∏_{n=3}^{N} p(x_n | x_{n-1}, x_{n-2})

• Each observation is influenced by the previous two observations

Page 9: Introducing Latent Variables

• For each observation x_n, introduce a latent variable z_n
• z_n may be of a different type or dimensionality to the observed variable
• The latent variables form the Markov chain
• Gives the "state-space model"

[Figure: chain of latent variables, each emitting an observation]

• Joint distribution for this model:

  p(x_1, …, x_N, z_1, …, z_N) = p(z_1) [∏_{n=2}^{N} p(z_n | z_{n-1})] ∏_{n=1}^{N} p(x_n | z_n)
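
A minimal sketch (again with made-up numbers) of evaluating this joint distribution for given latent and observed sequences; the π, A and B values below are illustrative, not from the slides.

```python
import numpy as np

# Illustrative state-space model: 2 latent states, 3 observation symbols.
pi = np.array([0.6, 0.4])                 # p(z_1)
A = np.array([[0.8, 0.2],                 # A[i, j] = p(z_n = j | z_{n-1} = i)
              [0.3, 0.7]])
B = np.array([[0.5, 0.4, 0.1],            # B[j, k] = p(x_n = k | z_n = j)
              [0.1, 0.2, 0.7]])

def joint_probability(z, x):
    """p(x_1..x_N, z_1..z_N) = p(z_1) * prod_n p(z_n|z_{n-1}) * prod_n p(x_n|z_n)."""
    prob = pi[z[0]] * B[z[0], x[0]]
    for n in range(1, len(z)):
        prob *= A[z[n - 1], z[n]] * B[z[n], x[n]]
    return prob

print(joint_probability(z=[0, 0, 1], x=[0, 1, 2]))   # 0.6*0.5 * 0.8*0.4 * 0.2*0.7
```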

Page 10: Two Models Described by this Graph

[Figure: chain of latent variables with their observations]

1. If the latent variables are discrete: Hidden Markov Model. The observed variables in an HMM may be discrete or continuous.
2. If both latent and observed variables are Gaussian, we obtain a linear dynamical system.

Page 11: Latent Variable with Three Discrete States

• Transition probabilities aij are represented by a matrix
• Not a graphical model, since the nodes are not separate variables but states of a single variable
• This can be unfolded over time to get a trellis diagram

Page 12: Markov Model for the Production of Spoken Words

• States represent phonemes
• Production of the word "cat":
  • Represented by states /k/ /a/ /t/
  • Transitions from /k/ to /a/, /a/ to /t/, and /t/ to a silent state
• Although only the correct "cat" sound is represented by the model, perhaps other transitions can be introduced, e.g. /k/ followed by /t/

[Figure: Markov model for the word "cat" with states /k/, /a/, /t/]

Page 13: First-order Markov Models (MMs)

• State at time t: ω(t)
• Particular sequence of length T: ω^T = {ω(1), ω(2), ω(3), …, ω(T)}
  e.g., ω^6 = {ω1, ω4, ω2, ω2, ω1, ω4}
• Note: the system can revisit a state at different steps, and not every state needs to be visited
• The model for the production of any sequence is described by the transition probabilities

  P(ωj(t+1) | ωi(t)) = aij

  which is a time-independent probability of being in state ωj at step t+1 given that the state at step t was ωi
• No requirement that the transition probabilities be symmetric
• Particular model: θ = {aij}. Given model θ, the probability that the model generated the sequence ω^6 = {ω1, ω4, ω2, ω2, ω1, ω4} is

  P(ω^6 | θ) = a14 · a42 · a22 · a21 · a14

• Can include the a priori probability of the first state as P(ω(1) = ωi)
• Discrete states = nodes, transition probabilities = links. In a first-order discrete-time HMM, at step t the system is in state ω(t); the state at step t+1 is a random function that depends on the state at step t and on the transition probabilities.
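
As a quick illustration of evaluating P(ω^6 | θ), a small sketch that multiplies transition probabilities along a state sequence; the numeric aij values here are made up, only the bookkeeping is the point.

```python
def sequence_probability(a, states):
    """P(omega^T | theta) = product of a[i, j] over consecutive state pairs (i, j)."""
    prob = 1.0
    for i, j in zip(states[:-1], states[1:]):
        prob *= a[(i, j)]
    return prob

# Illustrative (made-up) transition probabilities, keyed by (i, j).
a = {(1, 4): 0.3, (4, 2): 0.2, (2, 2): 0.5, (2, 1): 0.1}

# omega^6 = {w1, w4, w2, w2, w1, w4}  ->  a14 * a42 * a22 * a21 * a14
print(sequence_probability(a, [1, 4, 2, 2, 1, 4]))   # 0.3*0.2*0.5*0.1*0.3 = 0.0009
```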

Page 14: First Order Hidden Markov Models

• The perceiver does not have access to the states ω(t)
• Instead we measure properties of the emitted sound
• Need to augment the Markov model to allow for visible states (symbols)

Page 15: First Order Hidden Markov Models

• Visible states (symbols): V^T = {v(1), v(2), v(3), …, v(T)}
  For instance V^6 = {v5, v1, v1, v5, v2, v3}
• In any state ωj(t), the probability of emitting symbol vk(t) is bjk

[Figure: three hidden units of an HMM; visible states and their emission probabilities shown in red]

Page 16: Hidden Markov Model Computation

• Finite state machines with transition probabilities are called Markov networks
• Strictly causal: probabilities depend only on previous states
• A Markov model is ergodic if every state has a non-zero probability of occurring given some starting state
• A final or absorbing state is one which, if entered, is never left

Page 17: Hidden Markov Model Computation

Page 18: Three Basic Problems for HMMs

Given an HMM with transition and symbol probabilities:

• Problem 1: The evaluation problem
  • Determine the probability that a particular sequence of symbols V^T was generated by that model
• Problem 2: The decoding problem
  • Given a sequence of symbols V^T, determine the most likely sequence of hidden states ω^T that led to the observations
• Problem 3: The learning problem
  • Given the coarse structure of the model (number of states and number of symbols) but not the probabilities aij and bjk, determine these parameters

Page 19: Problem 1: The Evaluation Problem

• Probability that the model produces a sequence V^T of visible states:

  P(V^T) = Σ_{r=1}^{r_max} P(V^T | ω_r^T) P(ω_r^T)

  where each r indexes a particular sequence of T hidden states

  ω_r^T = {ω(1), ω(2), …, ω(T)}

• In the general case of c hidden states there will be r_max = c^T possible terms

Page 20: Evaluation Problem Formula

Probability that the model produces a sequence V^T of visible states:

  P(V^T) = Σ_{r=1}^{r_max} P(V^T | ω_r^T) P(ω_r^T)

Because (1) the output probabilities depend only upon the hidden states and (2) this is a first-order hidden Markov process:

  (1)  P(V^T | ω_r^T) = ∏_{t=1}^{T} P(v(t) | ω(t))
  (2)  P(ω_r^T) = ∏_{t=1}^{T} P(ω(t) | ω(t-1))

Substituting,

  P(V^T) = Σ_{r=1}^{r_max} ∏_{t=1}^{T} P(v(t) | ω(t)) P(ω(t) | ω(t-1))

Interpretation: the probability of sequence V^T is equal to the sum, over all r_max possible sequences of hidden states, of the conditional probability that the system made a particular transition multiplied by the probability that it then emitted the visible symbol in the target sequence.
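
A hedged sketch of this brute-force sum: it enumerates all r_max = c^T hidden sequences, so it is exponential in T and only useful for checking the faster algorithms on tiny problems. The toy model below is made up; a prior pi over the first state stands in for P(ω(1) | ω(0)).

```python
from itertools import product
import numpy as np

def evaluate_brute_force(A, B, pi, V):
    """P(V^T) as the explicit sum over all c^T hidden-state sequences."""
    c = A.shape[0]
    total = 0.0
    for path in product(range(c), repeat=len(V)):
        p = pi[path[0]] * B[path[0], V[0]]                   # first step: prior * emission
        for t in range(1, len(V)):
            p *= A[path[t - 1], path[t]] * B[path[t], V[t]]  # transition * emission
        total += p
    return total

# Tiny illustrative model (2 hidden states, 3 symbols).
A = np.array([[0.8, 0.2], [0.3, 0.7]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.2, 0.7]])
pi = np.array([0.6, 0.4])
print(evaluate_brute_force(A, B, pi, V=[0, 2, 1]))
```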

Page 21: Computationally Simpler Evaluation Algorithm

• Calculate

  P(V^T) = Σ_{r=1}^{r_max} ∏_{t=1}^{T} P(v(t) | ω(t)) P(ω(t) | ω(t-1))

  recursively, because each term involves only v(t), ω(t) and ω(t-1)

• Define α_j(t), the probability that the model is in state ωj(t) and has generated the target sequence up to step t:

  α_j(t) = 0                                 if t = 0 and j ≠ initial state
         = 1                                 if t = 0 and j = initial state
         = [ Σ_i α_i(t-1) aij ] bjk v(t)     otherwise

  where bjk v(t) means the symbol probability bjk corresponding to v(t)

• Therefore P(V^T) can be accumulated step by step through t

Page 22: HMM Forward Algorithm

• bjk v(t) means the symbol probability bjk corresponding to v(t)
• Computational complexity: O(c²T)
• The time-reversed version of the Forward algorithm is the Backward algorithm
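
A minimal sketch of the forward recursion α_j(t) = [Σ_i α_i(t-1) aij] · bj(v(t)); the vectorized update gives the stated O(c²T) cost. Here the initial α is taken from a prior distribution pi rather than a fixed start state, which is one common convention.

```python
import numpy as np

def forward(A, B, pi, V):
    """Forward algorithm.
    alpha[t, j] = P(v(1..t), state omega_j at step t).
    Each step costs O(c^2), so the whole pass is O(c^2 T)."""
    c, T = A.shape[0], len(V)
    alpha = np.zeros((T, c))
    alpha[0] = pi * B[:, V[0]]                      # initialize with prior * emission
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, V[t]]  # sum over predecessors, then emit
    return alpha                                    # P(V^T) = alpha[-1].sum()
```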

Page 23: Computation of Probabilities by the Forward Algorithm

[Figure: trellis — unfolding of the HMM through time]

Page 24: Computation of Probabilities by the Forward Algorithm

• In the evaluation trellis we only accumulate (sum) values
• In the decoding trellis we only keep maximum values

Page 25: Example of a Hidden Markov Model

Four states with an explicit absorber state ω0; five symbols with a unique null symbol v0.

Transition probabilities aij:

        ω0    ω1    ω2    ω3
  ω0    1     0     0     0
  ω1    0.2   0.3   0.1   0.4
  ω2    0.2   0.5   0.2   0.1
  ω3    0.8   0.1   0     0.1

Symbol probabilities bjk:

        v0    v1    v2    v3    v4
  ω0    1     0     0     0     0
  ω1    0     0.3   0.4   0.1   0.2
  ω2    0     0.1   0.1   0.7   0.1
  ω3    0     0.5   0.2   0.1   0.2

Page 26: Example Problem

Compute the probability of generating the sequence V^4 = {v1, v3, v2, v0}. Assume ω1 is the start state. (The aij and bjk tables are as on the previous slide; the visible symbol at each step selects which bjk is used.)

• At t = 0 the state is ω1, thus α_1(0) = 1 and α_j(0) = 0 for j ≠ 1
• Arrows in the trellis show the calculation of α_j(1) from α_i(0) aij bjk
• Top arrow: α_0(1) = α_1(0) a_10 b_0 v(1) = 1 × (0.2 · 0) = 0
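
Running a forward pass over these exact tables (a re-implementation sketch, with the start state ω1 encoded as an indicator vector rather than a prior) reproduces the trellis computation step by step:

```python
import numpy as np

# a_ij and b_jk tables from the example slides.
A = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.2, 0.3, 0.1, 0.4],
              [0.2, 0.5, 0.2, 0.1],
              [0.8, 0.1, 0.0, 0.1]])
B = np.array([[1.0, 0.0, 0.0, 0.0, 0.0],
              [0.0, 0.3, 0.4, 0.1, 0.2],
              [0.0, 0.1, 0.1, 0.7, 0.1],
              [0.0, 0.5, 0.2, 0.1, 0.2]])

V = [1, 3, 2, 0]                              # V^4 = {v1, v3, v2, v0}

alpha = np.array([0.0, 1.0, 0.0, 0.0])        # alpha_1(0) = 1, all others 0
for v in V:
    alpha = (alpha @ A) * B[:, v]             # alpha_j(t) = (sum_i alpha_i(t-1) a_ij) * b_j(v(t))
    print(alpha)

print("P(V^4) =", alpha[0])                   # ends in the absorbing state w0; about 0.0011
```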

Page 27: Problem 2: The Decoding Problem

• Given a sequence of visible states V^T, the decoding problem is to find the most probable sequence of hidden states
• Expressed mathematically: find the single "best" hidden state sequence ω̂(1), ω̂(2), …, ω̂(T) such that

  ω̂(1), ω̂(2), …, ω̂(T) = argmax_{ω(1), ω(2), …, ω(T)} P(ω(1), ω(2), …, ω(T), v(1), v(2), …, v(T) | θ)

• Note that the summation is changed to an argmax, since we want to find the best case

Page 28: Viterbi Algorithm

Notes:
1. If aij and bjk are replaced by log probabilities, we add terms rather than multiply them
2. The best path is maintained for each node
3. Complexity is O(c²T)
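
A minimal sketch of the standard max-product (Viterbi) recursion implied by these notes: like the forward pass but with max in place of sum, keeping one back-pointer per node (note 2); switching to log probabilities (note 1) would turn the products into sums.

```python
import numpy as np

def viterbi(A, B, pi, V):
    """Most probable hidden-state sequence for observations V.
    delta[t, j]: best score of any path ending in state j at step t.
    psi[t, j]:   best predecessor of state j at step t (back-pointer)."""
    c, T = A.shape[0], len(V)
    delta = np.zeros((T, c))
    psi = np.zeros((T, c), dtype=int)
    delta[0] = pi * B[:, V[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A          # scores[i, j] = delta_i(t-1) * a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, V[t]]
    path = [int(delta[-1].argmax())]                # best final state, then backtrack
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1], delta[-1].max()
```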

Page 29: Viterbi Decoding Trellis

Page 30: Example of Decoding

What is the most likely state sequence that generated the particular symbol sequence V^4 = {v1, v3, v2, v0}? Assume ω1 is the start state.

Four states with an explicit absorber state ω0; five symbols with a unique null symbol v0. The aij and bjk tables are as on Page 25.

Page 31: Example 4: HMM Decoding

Note: the transition from ω3 to ω2 is forbidden by the model (a_32 = 0), yet the decoding algorithm gives it a non-zero probability.

Page 32: Problem 3: The Learning Problem

• Goal: determine the model parameters aij and bjk from an ensemble of training samples
• No known method for obtaining an optimal or most likely set of parameters
• A good solution is straightforward: the Forward-Backward algorithm

Page 33: Forward-Backward Algorithm

• An instance of the generalized expectation maximization algorithm
• We do not know the states that hold when the symbols are generated
• The approach is to iteratively update the weights in order to better explain the observed training sequences

Page 34: Backward Probabilities

• α_i(t) is the probability that the model is in state ωi(t) and has generated the target sequence up to step t
• Analogously, β_i(t) is the probability that the model is in state ωi(t) and will generate the remainder of the given target sequence, from t+1 to T
• Computation proceeds backward through the trellis

Page 35: Backward Evaluation Algorithm

• This is used in learning: parameter estimation
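
A minimal sketch of the backward recursion referred to here, β_i(t) = Σ_j aij bj(v(t+1)) β_j(t+1) with β_i(T) = 1; it mirrors the forward pass but runs from the end of the sequence toward the start.

```python
import numpy as np

def backward(A, B, V):
    """beta[t, i] = P(v(t+1..T) | state omega_i at step t).
    Filled in from the last step backward through the trellis."""
    c, T = A.shape[0], len(V)
    beta = np.zeros((T, c))
    beta[-1] = 1.0                                    # nothing left to generate after step T
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, V[t + 1]] * beta[t + 1])  # sum over successor states j
    return beta
```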

Page 36: Estimating the aij and bjk

• α_i(t) and β_i(t) are merely estimates of their true values, since we don't know the actual values of aij and bjk
• The probability of a transition between ωi(t-1) and ωj(t), given that the model generated the entire training sequence V^T by any path, is:

  γ_ij(t) = [ α_i(t-1) aij bjk v(t) β_j(t) ] / P(V^T | θ)

Page 37: Calculating an Improved Estimate of aij

  â_ij = [ Σ_{t=1}^{T} γ_ij(t) ] / [ Σ_{t=1}^{T} Σ_k γ_ik(t) ]

• The numerator is the expected number of transitions between state ωi(t-1) and ωj(t)
• The denominator is the total expected number of transitions from ωi

Page 38: Calculating an Improved Estimate of bjk

• b̂_jk is the ratio between the expected frequency with which the particular symbol vk is emitted from state ωj and the expected frequency with which any symbol is emitted from ωj
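
Putting the pieces together, here is a hedged sketch of one re-estimation pass: γ[t, i, j] follows the transition-probability expression above, and the improved â_ij and b̂_jk are the ratios of expected counts described on these slides. It reuses the forward() and backward() sketches given earlier and assumes every state is visited (no guard against division by zero).

```python
import numpy as np

def baum_welch_step(A, B, pi, V):
    """One Forward-Backward (Baum-Welch) re-estimation pass on a single sequence V."""
    c, T = A.shape[0], len(V)
    V = np.asarray(V)
    alpha = forward(A, B, pi, V)                 # sketched earlier
    beta = backward(A, B, V)                     # sketched earlier
    pV = alpha[-1].sum()                         # P(V^T | current model)

    # gamma[t, i, j]: probability of an i -> j transition at step t, given V.
    gamma = np.zeros((T - 1, c, c))
    for t in range(T - 1):
        gamma[t] = alpha[t][:, None] * A * B[:, V[t + 1]][None, :] * beta[t + 1][None, :] / pV

    # a_hat: expected i -> j transitions / expected transitions out of i.
    a_hat = gamma.sum(axis=0) / gamma.sum(axis=(0, 2))[:, None]

    # b_hat: expected emissions of v_k from state j / expected visits to state j.
    occupancy = alpha * beta / pV                # occupancy[t, j] = P(state j at t | V)
    b_hat = np.zeros_like(B)
    for k in range(B.shape[1]):
        b_hat[:, k] = occupancy[V == k].sum(axis=0)
    b_hat /= occupancy.sum(axis=0)[:, None]
    return a_hat, b_hat
```

Iterating baum_welch_step until the parameters change very little is the learning loop summarized on the next slide.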

Page 39: Learning Algorithm

• Start with rough estimates of aij and bjk
• Calculate improved estimates using the formulas above
• Repeat until there is a sufficiently small change in the estimated values of the parameters

Page 40: Baum-Welch or Forward-Backward Algorithm

Page 41: Convergence

• Requires several presentations of each training sequence (fewer than 5 is common in speech)
• Another stopping criterion: the overall probability that the learned model could have generated the training data

Page 42: HMM Word Recognition

Two approaches:

1. A single HMM models all possible words
   • Each state corresponds to a letter of the alphabet (letters a-z)
   • Letter transition probabilities are calculated for each pair of letters
   • Letter confusion probabilities are the symbol probabilities
   • The decoding problem gives the most likely word

2. Separate HMMs are used to model each word
   • The evaluation problem gives the probability of the observation, which is used as a class-conditional probability

Page 43: HMM Spoken Word Recognition

• Each word, e.g. cat, dog, etc., has an associated HMM
• For a test utterance, determine which model has the highest probability
• HMMs for speech are left-to-right models
• The HMM produces a class-conditional probability; thus it is useful to compute the probability of the model given the sequence using Bayes rule:

  P(model | V^T) = P(V^T | model) P(model) / P(V^T)

  where P(V^T | model) is computed by the Forward algorithm and P(V^T) is the prior probability of the sequence
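
A small hedged sketch of this recognition rule: one HMM per word, the forward pass supplies the class-conditional P(V^T | word), and Bayes rule combines it with a word prior (the denominator P(V^T) is the same for all words, so it can be ignored when ranking). The function and dictionary names are hypothetical, and forward() is the sketch from earlier.

```python
def recognize(word_models, word_priors, V):
    """Return the word maximizing P(word | V), proportional to P(V | word) * P(word).

    word_models: dict mapping word -> (A, B, pi) of its left-to-right HMM (hypothetical layout)
    word_priors: dict mapping word -> prior probability P(word)"""
    scores = {}
    for word, (A, B, pi) in word_models.items():
        likelihood = forward(A, B, pi, V)[-1].sum()   # class-conditional P(V | word)
        scores[word] = likelihood * word_priors[word]
    return max(scores, key=scores.get)
```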

Page 44: Cursive Word Recognition (not HMM)

• The unit is a pre-segment (cusp at the bottom); pre-segmentation points are numbered 1 to 8
• Each image segment between two points is assigned letter hypotheses with confidences, e.g.:
  • Image segment from 1 to 3 is u with 0.5 confidence
  • Image segment from 1 to 4 is w with 0.7 confidence
  • Image segment from 1 to 5 is w with 0.6 confidence and m with 0.3 confidence
• Best path in the graph from segment 1 to 8: "w o r d"

[Figure: segmentation graph over points 1-8 with letter hypotheses such as i[.8], l[.8], u[.5], v[.2], w[.6], m[.3], w[.7], i[.7], u[.3], m[.2], m[.1], r[.4], d[.8], o[.5]]
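
The "best path" here is a highest-confidence-product path through an ordered segmentation graph, which a short dynamic program can find. In the sketch below the edges stated on the slide (u[.5], v[.2], w[.7], w[.6]/m[.3]) are kept, while the remaining edge placements and values are invented so that the graph is complete; with them, the best path from point 1 to point 8 reads "word".

```python
def best_path(edges, start=1, end=8):
    """Highest-confidence-product reading over an ordered segmentation graph."""
    best = {start: (1.0, "")}                        # node -> (score, letters read so far)
    for node in range(start, end + 1):               # nodes visited in increasing order
        if node not in best:
            continue
        score, word = best[node]
        for (a, b), hypotheses in edges.items():
            if a != node:
                continue
            for letter, conf in hypotheses:
                if b not in best or score * conf > best[b][0]:
                    best[b] = (score * conf, word + letter)
    return best.get(end)

edges = {
    (1, 3): [("u", 0.5), ("v", 0.2)],   # from the slide
    (1, 4): [("w", 0.7)],               # from the slide
    (1, 5): [("w", 0.6), ("m", 0.3)],   # from the slide
    (3, 5): [("i", 0.7)],               # placement assumed
    (4, 6): [("o", 0.5)],               # placement assumed
    (5, 8): [("m", 0.1)],               # placement assumed
    (6, 7): [("r", 0.4)],               # placement assumed
    (7, 8): [("d", 0.8)],               # placement assumed
}
print(best_path(edges))                 # best reading: 'word' with score 0.112
```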

Page 45: Summary: Key Algorithms for HMMs

• Problem 1 (evaluation): HMM Forward algorithm
• Problem 2 (decoding): Viterbi algorithm
  • Computes the optimal (most likely) state sequence in an HMM given a sequence of observed outputs
• Problem 3 (learning): Baum-Welch algorithm
  • Finds the HMM parameters A, B, and Π with the maximum likelihood of generating the given symbol sequence in the observation vector

