
Post on 04-Oct-2020


Applied Bayesian Nonparametrics

3. Infinite Hidden Markov Models

Tutorial at CVPR 2012
Erik Sudderth, Brown University

Work by E. Fox, E. Sudderth, M. Jordan, & A. Willsky:
• AOAS 2011: A Sticky HDP-HMM with Application to Speaker Diarization
• IEEE TSP 2011 & NIPS 2008: Bayesian Nonparametric Inference of Switching Dynamic Linear Models
• NIPS 2009: Sharing Features among Dynamical Systems with Beta Processes

Temporal Segmentation

• Markov switching models for time series data
• Cluster based on underlying mode dynamics

[Figure: observations and true mode sequence]

[Figure: Hidden Markov Model, with modes and observations]

Outline

Temporal Segmentation
  - How many dynamical modes?
  - Mode persistence
  - Complex local dynamics
  - Multiple time series

Spatial Segmentation
  - Ising and Potts MRFs
  - Gaussian processes

Hidden Markov Models

[Figure: mode sequence over time (axes: Time, Mode), with modes and observations]
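The generative process behind the HMM slide can be sketched as follows. This is a minimal illustration, not code from the tutorial: the function name `sample_hmm`, the two-mode transition matrix, and the scalar Gaussian emissions are all assumptions chosen to keep the example small.

```python
import numpy as np

def sample_hmm(pi0, P, means, sigma, T, rng):
    """Draw a mode sequence z and observations y from a finite HMM.

    pi0   : initial mode distribution, shape (K,)
    P     : transition matrix, P[j, k] = Pr(z_t = k | z_{t-1} = j)
    means : emission mean of each mode (assumed scalar Gaussian emissions)
    sigma : shared emission standard deviation (an assumption)
    """
    K = len(pi0)
    z = np.empty(T, dtype=int)
    z[0] = rng.choice(K, p=pi0)
    for t in range(1, T):
        # The next mode depends only on the current mode (Markov property).
        z[t] = rng.choice(K, p=P[z[t - 1]])
    # Each observation depends only on the mode active at that time step.
    y = means[z] + sigma * rng.standard_normal(T)
    return z, y

rng = np.random.default_rng(0)
pi0 = np.array([0.5, 0.5])
P = np.array([[0.95, 0.05],
              [0.10, 0.90]])
means = np.array([-2.0, 2.0])
z, y = sample_hmm(pi0, P, means, sigma=0.5, T=200, rng=rng)
```

Large diagonal entries in `P` give long runs of the same mode, which is the kind of temporal segmentation the slides are after.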


Issue 1: How many modes?

• Dirichlet process (DP):
  - Mode space of unbounded size
  - Model complexity adapts to observations
• Hierarchical:
  - Ties mode transition distributions
  - Shared sparsity

[Figure: mode sequence over time (axes: Time, Mode)]

Infinite HMM: Beal et al., NIPS 2002
HDP-HMM: Teh et al., JASA 2006

Hierarchical Dirichlet Process HMM

• Global transition distribution:
• Mode-specific transition distributions:
  - sparsity of β is shared

[Figure: HDP-HMM]
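In the usual notation of the HDP-HMM (Teh et al., JASA 2006), the two distributions listed on this slide can be written as below. The symbols γ, α, β, π_j, z_t, y_t, θ_k and F are the conventional names, assumed here rather than taken from the slide.

```latex
\begin{align*}
\beta \mid \gamma &\sim \mathrm{GEM}(\gamma) \\
\pi_j \mid \alpha, \beta &\sim \mathrm{DP}(\alpha, \beta), \qquad j = 1, 2, \ldots \\
z_t \mid z_{t-1} &\sim \pi_{z_{t-1}} \\
y_t \mid z_t &\sim F(\theta_{z_t})
\end{align*}
```

Because every π_j is centered on the same β, a mode that has little mass under β is rare in every row of the transition matrix.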

Issue 2: Temporal Persistence

[Figure: Hidden Markov Model; true mode sequence vs. HDP-HMM inferred mode sequence (axes: Time, Mode)]

Sticky HDP-HMM

• Mode-specific base measure
• Increased probability of self-transition

[Figure: original vs. sticky]

Infinite HMM: Beal et al., NIPS 2002
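The mode-specific base measure of the sticky HDP-HMM, as defined in the AOAS 2011 paper listed on the title slide, adds extra mass κ to the self-transition of each mode. The symbols α, κ, β and δ_j follow that paper's notation and are assumed here, since the formula itself does not appear in the slide text.

```latex
\[
\pi_j \mid \alpha, \kappa, \beta \;\sim\;
\mathrm{DP}\!\left(\alpha + \kappa,\;
\frac{\alpha \beta + \kappa\, \delta_j}{\alpha + \kappa}\right)
\]
```

Setting κ = 0 recovers the original HDP-HMM; larger κ raises the prior probability of staying in mode j.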

Direct Assignment Sampler

• Marginalize:
  - Transition densities
  - Emission parameters
• Sequentially sample:

[Equation: Chinese restaurant prior × likelihood; conjugate base measure ⇒ closed form]

Collapsed Gibbs Sampler

[Figure: splits true mode, hard to merge]

Blocked Resampling

HDP-HMM weak limit approximation

• Approximate HDP:
  - Average transition density
  - Transition densities
• Sample:
• Compute backwards messages:
• Block sample as:

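The blocked resampling steps can be sketched as follows, assuming the degree-L weak limit approximation from the AOAS 2011 paper: β ~ Dir(γ/L, ..., γ/L) and π_j ~ Dir(αβ + κδ_j). All function names are hypothetical, and for brevity the transitions are drawn from the prior and the emissions are fixed Gaussians; a full sampler would also condition on transition counts and resample the emission parameters.

```python
import numpy as np

def sample_weak_limit_transitions(L, gamma, alpha, kappa, rng):
    """Draw the average (beta) and mode-specific (pi) transition densities."""
    beta = rng.dirichlet(np.full(L, gamma / L))
    pi = np.empty((L, L))
    for j in range(L):
        conc = alpha * beta
        conc[j] += kappa              # sticky self-transition bias
        pi[j] = rng.dirichlet(conc)
    return beta, pi

def backward_messages(pi, loglik):
    """logm[t, k] = log p(y_{t+1:T} | z_t = k), computed in log space."""
    T, L = loglik.shape
    logm = np.zeros((T, L))
    for t in range(T - 2, -1, -1):
        a = loglik[t + 1] + logm[t + 1]
        amax = a.max()
        # The tiny constant guards against log(0) when a row of pi underflows.
        logm[t] = amax + np.log(pi @ np.exp(a - amax) + 1e-300)
    return logm

def block_sample_modes(pi0, pi, loglik, rng):
    """Sample the whole mode sequence jointly, working forward in time."""
    T, L = loglik.shape
    logm = backward_messages(pi, loglik)
    z = np.empty(T, dtype=int)
    prior = pi0
    for t in range(T):
        logp = np.log(prior + 1e-300) + loglik[t] + logm[t]
        p = np.exp(logp - logp.max())
        z[t] = rng.choice(L, p=p / p.sum())
        prior = pi[z[t]]
    return z

rng = np.random.default_rng(1)
L = 10
beta, pi = sample_weak_limit_transitions(L, gamma=5.0, alpha=5.0, kappa=20.0, rng=rng)

# Toy data: two segments with different means (assumed, for illustration only).
y = np.concatenate([np.full(50, -4.0), np.full(50, 4.0)])
y = y + 0.3 * rng.standard_normal(100)
means = np.linspace(-4.0, 4.0, L)     # fixed emission means, one per mode
loglik = -0.5 * ((y[:, None] - means[None, :]) / 0.3) ** 2
z = block_sample_modes(np.full(L, 1.0 / L), pi, loglik, rng)
```

Sampling the full sequence in one block is what lets this sampler avoid the slow mixing of one-at-a-time updates, where a mode that has been split is hard to merge again.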
Results: Gaussian Emissions

[Figure: blocked sampler vs. sequential sampler, each shown for the HDP-HMM and the Sticky HDP-HMM]

Results: Fast Switching

[Figure: observations and true mode sequence]

Hyperparameters

• Place priors on hyperparameters and infer them from data
• Weakly informative priors
• All results use the same settings

Hyperparameters can be set using the data

Related self-transition parameter: Beal et al., NIPS 2002

HDP-HMM: Multimodal Emissions

• Approximate multimodal emissions with DP mixture
• Temporal mode persistence disambiguates model

[Figure: modes, mixture components, and observations]

Speaker Diarization

[Figure: observation sequence segmented by speaker (John, Jane, Bob, Jill); x-axis 0 to 10 ×10^4, y-axis -30 to 40]

Results: 21 meetings

                      Overall DER   Best DER   Worst DER
Sticky HDP-HMM        17.84%        1.26%      34.29%
Non-Sticky HDP-HMM    23.91%        6.26%      46.95%
ICSI                  18.37%        4.39%      32.23%

[Figure: per-meeting scatter plots, ICSI DERs vs. Sticky DERs and Non-Sticky DERs vs. Sticky DERs; both axes 0 to 50]

Results: Meeting 1

Sticky DER = 1.26%

ICSI DER = 7.56%

Results: Meeting 18

Sticky DER = 20.48%

ICSI DER = 22.00%

4.81%

= set of dynamic parameters

Issue 3: Complex Local Dynamics

• Discrete clusters may not accurately capture high-dimensional data
• Autoregressive HMM: Discrete-mode switching of smooth observation dynamics

Switching Dynamical Processes

[Figure: modes and observations]

Linear Dynamical Systems

• State space LTI model:
• Vector autoregressive (VAR) process:

[Figure: state space models and VAR processes]
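In standard textbook form, the two models named above are usually written as below. The symbols are assumed here: x_t is the latent state, y_t the observation, A, C and A_i are fixed matrices, and e_t, w_t are zero-mean Gaussian noise terms.

```latex
\begin{align*}
\text{State space LTI:}\quad x_t &= A\, x_{t-1} + e_t, & y_t &= C\, x_t + w_t \\
\text{VAR}(r):\quad y_t &= \sum_{i=1}^{r} A_i\, y_{t-i} + e_t
\end{align*}
```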

Switching Dynamical Systems

• Switching linear dynamical system (SLDS):
• Switching VAR process:

HDP-AR-HMM and HDP-SLDS

[Figure: HDP-AR-HMM and HDP-SLDS]
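A switching VAR process lets the autoregressive matrix depend on the current mode. The sketch below simulates a two-mode switching VAR(1) process; the function name `simulate_switching_var1`, the rotation-based dynamics, and every parameter value are illustrative assumptions rather than settings from the tutorial.

```python
import numpy as np

def simulate_switching_var1(P, A, noise_std, T, rng):
    """Simulate y_t = A[z_t] @ y_{t-1} + noise, with z_t following a Markov chain.

    P         : mode transition matrix, shape (K, K)
    A         : per-mode dynamics matrices, shape (K, d, d)
    noise_std : per-mode noise standard deviation, shape (K,)
    """
    K, d, _ = A.shape
    z = np.empty(T, dtype=int)
    y = np.zeros((T, d))
    z[0] = rng.integers(K)
    y[0] = rng.standard_normal(d)
    for t in range(1, T):
        z[t] = rng.choice(K, p=P[z[t - 1]])
        y[t] = A[z[t]] @ y[t - 1] + noise_std[z[t]] * rng.standard_normal(d)
    return z, y

def rotation(theta, scale=0.99):
    """A slightly contracting 2-D rotation, so the process stays bounded."""
    c, s = np.cos(theta), np.sin(theta)
    return scale * np.array([[c, -s], [s, c]])

rng = np.random.default_rng(0)
A = np.stack([rotation(0.05), rotation(-0.6)])   # slow turn vs. fast turn
P = np.array([[0.98, 0.02],
              [0.05, 0.95]])
z, y = simulate_switching_var1(P, A, noise_std=np.array([0.1, 0.1]), T=300, rng=rng)
```

Within a mode the observations evolve smoothly, and the discrete switch only changes which dynamics matrix is applied.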

Dancing Honey Bees

Honey Bee Results: HDP-AR(1)-HMM

              Sequence 1   Sequence 2   Sequence 3
HDP-AR-HMM    88.1%        92.5%        88.2%
SLDS [Oh]     93.4%        90.2%        90.4%

Issue 4: Multiple Time Series

• Goal:
  - Transfer knowledge between related time series
  - Allow each system to switch between an arbitrarily large set of dynamical modes
• Method:
  - Beta process prior
  - Predictive distribution: Indian buffet process

IBP-AR-HMM

• Latent features determine which dynamical modes are used
• Beta process prior:
  - Encourages sharing
  - Unbounded features

[Figure: binary feature matrix (axes: Features/Modes, Sequences)]
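The Indian buffet process named above as the predictive distribution can be simulated directly. The sketch below draws a binary sequences-by-features matrix; the function name `sample_ibp` and the mass parameter value are assumptions for illustration.

```python
import numpy as np

def sample_ibp(num_sequences, alpha, rng):
    """Draw a binary feature matrix F from the Indian buffet process.

    F[i, k] is True when sequence i uses feature (dynamical mode) k.
    """
    counts = []   # counts[k] = number of earlier sequences using feature k
    rows = []
    for i in range(1, num_sequences + 1):
        # Reuse an existing feature with probability proportional to its popularity.
        row = [rng.random() < m / i for m in counts]
        # Then add a Poisson(alpha / i) number of brand-new features.
        new = rng.poisson(alpha / i)
        row.extend([True] * new)
        counts = [m + int(f) for m, f in zip(counts, row)] + [1] * new
        rows.append(row)
    F = np.zeros((num_sequences, len(counts)), dtype=bool)
    for i, row in enumerate(rows):
        F[i, :len(row)] = row
    return F

rng = np.random.default_rng(0)
F = sample_ibp(num_sequences=6, alpha=3.0, rng=rng)
```

Popular features are reused by later sequences, which encourages sharing, while the Poisson draws keep the total number of features unbounded.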

Motion Capture

CMU MoCap: http://mocap.cs.cmu.edu/
6 videos of exercise routines:

Library of MoCap Behaviors

CMU Kitchen Dataset (Torre et al. TR 08)

10 videos from each recipe category. Category labels not provided to BP-HMM. Results from Hughes & Sudderth, POCV Workshop, CVPR 2012.

Discovered Kitchen Behaviors

Locations of select behaviors across all videos

• Light Switch: Protocol requires switching light on/off at start and finish
• Open Fridge: Pizza needs cheese, Sandwich needs Jelly to begin preparation
• Set Oven: Both Pizzas and Brownies need to be baked to conclude preparation
• Grate Cheese: Only in Pizza Videos
• Stir Bowl: Unique to Brownies, but multiple styles exist: Stabbing vs. swirling, etc.