Machine Learning 4: Hidden Markov Models
The Problem to Be Solved
More Specifically
Given a sequence of acoustic observations
Most probable sequence of words corresponding to the speaker's intent
Two Items
Sequence
The signal is observable, the output is not.
Framed a Different Way: The Ice Cream Task
A climatologist in 2799 wants to reconstruct the weather in Baltimore during 2012
Baltimore is now under water. Jason Eisner, who lived in Baltimore in the early 21st century, kept a diary.
His diary, through much historical drama, became the property of the Missouri Historical Society, a short walk from Washington University where the climatologist works.
This diary, besides containing lots of dreary stuff about emotional states, contains a record of how many ice cream cones Jason ate each day that summer.
What was the sequence of hot and cold days during the eventful summer of 2012?
Note two items:
Sequence: ice cream cones
Observation: sequence of ice cream cones
Hidden: sequence of hot and cold days
We presume: there is a probabilistic relationship between the sequence of ice cream cones and the sequence of hot and cold days
Dr. Eisner 2012 (not eating ice cream)
Dr. Markov circa 1900 (not eating ice cream either)
Model of Newspaper Vending Machine as FSA
Markov Chains
• Each a_ij is an index into a table
• Gives transition probabilities
Weather Model from Luger (p. 375)
s1 = sunny, s2 = cloudy, s3 = foggy, s4 = rainy
Invented Gender/Handedness data
            Male (M)   Female (F)   Total
Left (L)    5          8            13
Right (R)   3          4            7
Total       8          12           20
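The priors and likelihoods used in the gender task can be read straight off this count table. The following is a minimal sketch of that arithmetic; the function names `prior` and `likelihood` are illustrative choices, not names from the slides.

```python
# Counts copied from the gender/handedness table above.
counts = {
    ("L", "M"): 5, ("L", "F"): 8,
    ("R", "M"): 3, ("R", "F"): 4,
}

total = sum(counts.values())  # 20 people in all


def column_total(gender):
    """Number of people of the given gender (a column total)."""
    return sum(n for (_, g), n in counts.items() if g == gender)


def prior(gender):
    """P(gender): column total divided by the grand total."""
    return column_total(gender) / total


def likelihood(hand, gender):
    """P(hand | gender): cell count divided by its column total."""
    return counts[(hand, gender)] / column_total(gender)


for g in ("F", "M"):
    print(g, "prior:", prior(g), "P(L | g):", round(likelihood("L", g), 3))
```

The rounded results (.6 and .4 for the priors, .667 and .625 for left-handedness) are the numbers that reappear in the gender task matrices.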
As a Hidden Markov Model
P(LLL)
(!) There must be a better way
Summing over all eight gender sequences (MMM, MMF, MFM, MFF, FMM, FMF, FFM, FFF):
P(LLL) = (.625 * .625 * .625 * .4 * .4 * .4)
       + (.625 * .625 * .667 * .4 * .4 * .6)
       + (.625 * .667 * .625 * .4 * .6 * .4)
       + (.625 * .667 * .667 * .4 * .6 * .6)
       + (.667 * .625 * .625 * .6 * .4 * .4)
       + (.667 * .625 * .667 * .6 * .4 * .6)
       + (.667 * .667 * .625 * .6 * .6 * .4)
       + (.667 * .667 * .667 * .6 * .6 * .6)
       = .015625 + .0250125 + .0250125 + .04004001 + .0250125 + .04004001 + .04004001 + .064096048
       = .274878578
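The sum over gender sequences above can be checked mechanically. The sketch below enumerates every gender sequence with `itertools.product` and adds up the terms. It uses the rounded values from the slide, and the function name `brute_force_all_left` is a hypothetical choice. It also assumes, as the slide's arithmetic does, that each person's gender is drawn independently.

```python
from itertools import product

# Rounded values as used on the slide: P(gender) and P(L | gender).
prior = {"M": 0.4, "F": 0.6}
p_left = {"M": 0.625, "F": 0.667}


def brute_force_all_left(n):
    """Sum P(L...L, genders) over every gender sequence of length n."""
    total = 0.0
    for genders in product("MF", repeat=n):  # 2 ** n sequences
        term = 1.0
        for g in genders:
            term *= p_left[g] * prior[g]
        total += term
    return total


p_lll = brute_force_all_left(3)

# Under the independence assumption, the same value factors as P(L) ** 3,
# which avoids enumerating 2 ** n sequences.
p_l = sum(p_left[g] * prior[g] for g in prior)
print(p_lll, p_l ** 3)
```

The enumeration doubles in size with every extra observation, which is the cost the "better way" is meant to avoid.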
The Ice Cream HMM
A: priors matrix
        Start   Hot   Cold
Start   0.0     .8    .2
Hot     0.0     .7    .3
Cold    0.0     .4    .6
B: likelihoods matrix
        1 Cone   2 Cones   3 Cones
Hot     .2       .4        .4
Cold    .5       .4        .1
Ice Cream Task
Rows labeled by prior state/conditioning event
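To make the row-labeling convention concrete, here is a minimal sketch that stores the two ice cream matrices as nested dictionaries keyed by the conditioning event, then multiplies one transition and one emission per day to get the joint probability of a weather sequence and a cone sequence. The function name `joint_probability` and the example sequence are hypothetical, chosen only for illustration.

```python
# Ice cream HMM parameters copied from the A and B tables above.
# Outer key = row label (the prior state / conditioning event).
A = {
    "Start": {"Hot": 0.8, "Cold": 0.2},
    "Hot":   {"Hot": 0.7, "Cold": 0.3},
    "Cold":  {"Hot": 0.4, "Cold": 0.6},
}
B = {
    "Hot":  {1: 0.2, 2: 0.4, 3: 0.4},
    "Cold": {1: 0.5, 2: 0.4, 3: 0.1},
}


def joint_probability(states, cones):
    """P(states, cones): one transition and one emission factor per day."""
    p = 1.0
    previous = "Start"
    for state, n_cones in zip(states, cones):
        p *= A[previous][state] * B[state][n_cones]
        previous = state
    return p


# Hypothetical example: Hot, Hot, Cold while eating 3, 1, 3 cones.
example = joint_probability(["Hot", "Hot", "Cold"], [3, 1, 3])
print(example)
```

Each row of `A` and `B` sums to 1 because it is a distribution conditioned on the row label.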
A: priors matrix
         start   female   male
start    0.0     .6       .4
female   0.0     .6       .4
male     0.0     .6       .4
B: likelihoods matrix
         left    right
female   .67     .33
male     .625    .375
Gender Task
Rows labeled by prior state/conditioning event
Forward Algorithm
Forward Trellis
[Figure: forward trellis. Recoverable labels: the emission term b_j(o_t) and the cell value .0464]
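The forward trellis can be sketched in a few lines of Python. This is a minimal illustration using the ice cream matrices given earlier, not a reconstruction of the slide's figure. The observation sequence 3, 1, 3 cones is an assumption: the slide text does not state it, but with that sequence the Hot cell in the second column comes out to .0464, the value that survives from the trellis. The function name `forward` and the list-of-dicts trellis layout are illustrative choices.

```python
# Ice cream HMM parameters from the A and B tables earlier in the deck.
A = {
    "Start": {"Hot": 0.8, "Cold": 0.2},
    "Hot":   {"Hot": 0.7, "Cold": 0.3},
    "Cold":  {"Hot": 0.4, "Cold": 0.6},
}
B = {
    "Hot":  {1: 0.2, 2: 0.4, 3: 0.4},
    "Cold": {1: 0.5, 2: 0.4, 3: 0.1},
}
STATES = ("Hot", "Cold")


def forward(cones):
    """Return (trellis, P(cones)) for a non-empty observation sequence.

    trellis[t][j] is the forward probability of being in state j on day t
    after seeing the first t + 1 observations.
    """
    trellis = []
    for t, n_cones in enumerate(cones):
        column = {}
        for j in STATES:
            if t == 0:
                incoming = A["Start"][j]
            else:
                # Sum over every state i the chain could have come from.
                incoming = sum(trellis[-1][i] * A[i][j] for i in STATES)
            column[j] = incoming * B[j][n_cones]  # multiply by b_j(o_t)
        trellis.append(column)
    return trellis, sum(trellis[-1].values())


# Assumed observation sequence (not stated in the slide text).
trellis, likelihood = forward([3, 1, 3])
for t, column in enumerate(trellis, start=1):
    print(t, column)
print("P(observations) =", likelihood)
```

Each column reuses the previous one, so the work grows linearly with the number of observations instead of doubling with each one as the brute-force enumeration does.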