Type-Based MCMC

NAACL 2010 – Los Angeles

Percy Liang, Michael I. Jordan, Dan Klein


Learning latent-variable models

· · · DT  NNS VBD · · ·    · · · some stocks rose · · ·
· · · NN  NNS VBD · · ·    · · · tech stocks fell · · ·
· · · DT  NNS VBD · · ·    · · · the stocks soared · · ·
· · · WRB NNS VBD · · ·    · · · how stocks dipped · · ·
· · · DT  NNS VBD · · ·    · · · many stocks created · · ·

Variables are highly dependent

Simple: token-based (e.g., Gibbs sampler) ⇒ local optima, slow mixing

Classic: sentence-based (e.g., EM)
    Structural dependencies: dynamic programming

New: type-based
    Parameter dependencies: exchangeability


Structural dependencies

Dependencies between adjacent variables

Sentence-based: update all variables in a sentence

[Figure: probability (vertical axis) of three taggings of "worker demands meeting resistance"]

NN VBZ NN  NN    worker demands meeting resistance
NN NNS VBG NN    worker demands meeting resistance
NN VBZ VBG NN    worker demands meeting resistance

Token-based: update only one variable at a time
Problem: need to go downhill before uphill
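
To make the "downhill before uphill" problem concrete, the following sketch (an illustration, not code from the talk) lists the taggings a token-based sampler has to pass through when moving between the first two taggings above. The function name `single_token_paths` is hypothetical.

```python
from itertools import permutations

def single_token_paths(start, goal):
    """Return every shortest path from start to goal that changes one tag per step."""
    # Positions where the two taggings disagree.
    diff = [i for i, (a, b) in enumerate(zip(start, goal)) if a != b]
    paths = []
    for order in permutations(diff):
        state = list(start)
        path = [tuple(state)]
        for i in order:
            state[i] = goal[i]          # flip exactly one token
            path.append(tuple(state))
        paths.append(path)
    return paths

start = ("NN", "VBZ", "NN", "NN")    # worker demands meeting resistance
goal = ("NN", "NNS", "VBG", "NN")

paths = single_token_paths(start, goal)
# Taggings strictly between the two endpoints.
intermediates = sorted({state for path in paths for state in path[1:-1]})
for state in intermediates:
    print(" ".join(state))
```

Either route visits an intermediate tagging. One of them is NN VBZ VBG NN, the tagging the slide shows as the step a token-based sampler must take before reaching the other mode.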


Parameter dependencies

Dependencies between variables with shared parameters

Type-based: update all variables of one type

[Figure: probability (vertical axis) of five joint taggings of four occurrences of "stocks"]

...stocks rose...   ...stocks fell...   ...stocks took...   ...stocks shot...
VBZ VBD             VBZ VBD             VBZ VBD             VBZ VBD
NNS VBD             NNS VBD             NNS VBD             NNS VBD
NNS VBD             VBZ VBD             VBZ VBD             VBZ VBD
NNS VBD             NNS VBD             VBZ VBD             VBZ VBD
NNS VBD             NNS VBD             NNS VBD             VBZ VBD

Token-based: need to go downhill a lot before going uphill

1. Parameter dependencies create deeper valleys

2. Sentence-based cannot handle these dependencies
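
A toy calculation can illustrate why shared parameters deepen the valley. The sketch below is a simplification built on stated assumptions, not the model from the talk: it treats the four "stocks" tags as draws from a single two-outcome distribution with a symmetric Beta prior (hypothetical concentration `alpha = 0.1`), integrates the parameter out, and ignores transitions entirely.

```python
from math import lgamma, exp

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def seq_prob(m, n, alpha):
    """Marginal probability of one specific sequence with m NNS tags out of n,
    under a symmetric Beta(alpha, alpha) prior on P(NNS)."""
    return exp(log_beta(alpha + m, alpha + n - m) - log_beta(alpha, alpha))

# alpha is a hypothetical choice; values below 1 favor sparse parameters.
n, alpha = 4, 0.1
probs = [seq_prob(m, n, alpha) for m in range(n + 1)]
for m, p in enumerate(probs):
    print(f"{m} of {n} tagged NNS: p = {p:.4f}")
```

Under these assumptions the mixed assignments are less probable than either uniform one, so a sampler that flips one token at a time has to cross them to get from all-VBZ to all-NNS.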


What exactly is a type?

How can we update all variables of a type efficiently?


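Formal setup

Parameters θ: vector of conditional probabilities

    $\theta_{T:V,N}$: from state V, probability of transitioning to state N

    $p(\theta)$ is product of Dirichlets [prior]

Choices $\mathbf{z}$: specifies values of latent and observed variables

    · · · V N · · ·    (positions 8 and 9; word: stocks)

    $z \in \mathbf{z}$: a single choice, e.g. [T:V,N at 8]

    $j(z)$: parameter used by choice $z$, e.g. T:V,N

    $p(\mathbf{z} \mid \theta) = \prod_{z \in \mathbf{z}} \theta_{j(z)}$ [likelihood]

    $p(\mathbf{z}) = \int p(\mathbf{z} \mid \theta)\, p(\theta)\, d\theta$ [marginal likelihood]

Observations $\mathbf{x}$: observed part of $\mathbf{z}$ (e.g., the words)

Goal: sample from $p(\mathbf{z} \mid \mathbf{x})$ [not $p(\mathbf{z} \mid \mathbf{x}, \theta)$]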
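
The likelihood as a product over choices can be sketched in code. Everything concrete here is assumed for illustration: the `choices` helper, the tuple encoding of parameter indices (`("T", prev, next)` for transitions, `("E", tag, word)` for emissions), and the numeric values in `theta`. The fragment also omits any initial-state choice.

```python
from math import log

def choices(tags, words):
    """Yield j(z) for each choice z in a tagged fragment (transitions, then emissions)."""
    for prev, nxt in zip(tags, tags[1:]):
        yield ("T", prev, nxt)
    for tag, word in zip(tags, words):
        yield ("E", tag, word)

def log_likelihood(tags, words, theta):
    """log p(z | theta) = sum over choices z of log theta[j(z)]."""
    return sum(log(theta[j]) for j in choices(tags, words))

# Hypothetical parameter values, for illustration only.
theta = {
    ("T", "D", "N"): 0.5, ("T", "N", "V"): 0.4,
    ("E", "D", "many"): 0.1, ("E", "N", "stocks"): 0.05, ("E", "V", "rose"): 0.02,
}

ll = log_likelihood(["D", "N", "V"], ["many", "stocks", "rose"], theta)
print(ll)
```

This evaluates the likelihood for a fixed θ only. The stated goal integrates θ out, which is why the sufficient statistics of the choices become the quantity of interest.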


Exchangeability

Sufficient statistics $\mathbf{n}$: # times parameters were used in $\mathbf{z}$

    $n_{T:V,N}$: # times that state V transitioned to state N

Rewrite likelihood:

    $p(\mathbf{z} \mid \theta) = \prod_j \theta_j^{n_j}$ = simple function of $\mathbf{n}$ and θ

    $p(\mathbf{z})$ = simple function of $\mathbf{n}$ (key: exchangeability)

z1:
    · · · D V V · · ·    many stocks rose
    · · · D N V · · ·    tech stocks were

z2:
    · · · D N V · · ·    many stocks rose
    · · · D V V · · ·    tech stocks were

z1 and z2 have same sufficient statistics ⇒ p(z1) = p(z2)
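
The claim p(z1) = p(z2) can be checked numerically. The sketch below assumes symmetric Dirichlet priors with a hypothetical concentration `alpha` and hypothetical outcome-space sizes `num_tags` and `vocab_size`; the closed form is the standard Dirichlet-multinomial marginal. The helper names are illustrative, and the surrounding context (the "· · ·") is ignored.

```python
from collections import Counter
from math import lgamma

def counts(corpus):
    """Sufficient statistics n: how often each transition/emission parameter is used."""
    n = Counter()
    for tags, words in corpus:
        n.update(("T", a, b) for a, b in zip(tags, tags[1:]))
        n.update(("E", t, w) for t, w in zip(tags, words))
    return n

def log_marginal(n, alpha, num_tags, vocab_size):
    """log p(z) with theta integrated out under symmetric Dirichlet(alpha) priors."""
    # Group counts by distribution: (kind, conditioning tag).
    by_dist = {}
    for (kind, ctx, _), c in n.items():
        by_dist.setdefault((kind, ctx), []).append(c)
    total = 0.0
    for (kind, _), cs in by_dist.items():
        k = num_tags if kind == "T" else vocab_size  # number of outcomes
        total += lgamma(k * alpha) - lgamma(k * alpha + sum(cs))
        total += sum(lgamma(alpha + c) - lgamma(alpha) for c in cs)
    return total

z1 = [(["D", "V", "V"], ["many", "stocks", "rose"]),
      (["D", "N", "V"], ["tech", "stocks", "were"])]
z2 = [(["D", "N", "V"], ["many", "stocks", "rose"]),
      (["D", "V", "V"], ["tech", "stocks", "were"])]

n1, n2 = counts(z1), counts(z2)
print(n1 == n2)
print(log_marginal(n1, 0.1, 3, 5), log_marginal(n2, 0.1, 3, 5))
```

Because `log_marginal` reads only the counts, any two assignments with equal counts receive the same value, whatever prior settings are chosen.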


Types

Type of a variable: its dependent parameter components

For HMM, type = assignment to Markov blanket

type( ) = (D, stocks, V)

· · · D N V · · ·many stocks rose

· · · D V V · · ·tech stocks were

· · · D V V · · ·the stocks have

· · · D N V · · ·how stocks from

Assignments[VVNN][VNVN][VNNV][NVVN]

8

Page 66: Type-Based MCMCpliang/papers/type-naacl2010-talk.pdf · Type-Based MCMC NAACL 2010 { Los Angeles Percy Liang Michael I. Jordan Dan Klein. Type-Based MCMC NAACL 2010 { Los Angeles

Types

Type of a variable: its dependent parameter components

For HMM, type = assignment to Markov blanket

type( ) = (D, stocks, V)

· · · D N V · · ·many stocks rose

· · · D V V · · ·tech stocks were

· · · D N V · · ·the stocks have

· · · D V V · · ·how stocks from

Assignments[VVNN][VNVN][VNNV][NVVN][NVNV]

8

Page 67: Type-Based MCMCpliang/papers/type-naacl2010-talk.pdf · Type-Based MCMC NAACL 2010 { Los Angeles Percy Liang Michael I. Jordan Dan Klein. Type-Based MCMC NAACL 2010 { Los Angeles

Types

Type of a variable: its dependent parameter components

For HMM, type = assignment to Markov blanket

type( ) = (D, stocks, V)

· · · D N V · · ·many stocks rose

· · · D N V · · ·tech stocks were

· · · D V V · · ·the stocks have

· · · D V V · · ·how stocks from

Assignments[VVNN][VNVN][VNNV][NVVN][NVNV][NNVV]

8

Types

Type of a variable: its dependent parameter components

For HMM, type = assignment to Markov blanket

type(tag of "stocks") = (D, stocks, V)

· · · D N V · · ·   many stocks rose

· · · D N V · · ·   tech stocks were

· · · D V V · · ·   the stocks have

· · · D V V · · ·   how stocks from

Assignments: [VVNN] [VNVN] [VNNV] [NVVN] [NVNV] [NNVV]

p(assignment) only depends on number of Vs and Ns
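
The count-only property can be checked on a toy model. The sketch below is not the collapsed HMM from the talk; it assumes a two-outcome Pólya urn (a collapsed Beta-Bernoulli model with hypothetical pseudo-counts `alpha_v`, `alpha_n`) as a stand-in, and confirms that all six orderings of two Vs and two Ns get the same probability.

```python
from fractions import Fraction
from itertools import permutations


def polya_sequence_prob(labels, alpha_v=1, alpha_n=1):
    """Probability of seeing `labels` in this order under a two-outcome
    Polya urn (toy stand-in; alpha_v and alpha_n are assumed pseudo-counts)."""
    prior = {"V": Fraction(alpha_v), "N": Fraction(alpha_n)}
    counts = {"V": 0, "N": 0}
    prob = Fraction(1)
    for i, lab in enumerate(labels):
        total = prior["V"] + prior["N"] + i  # pseudo-counts plus draws so far
        prob *= (prior[lab] + counts[lab]) / total
        counts[lab] += 1
    return prob


# All distinct orderings of two Vs and two Ns (the six assignments listed above).
assignments = sorted({"".join(p) for p in permutations("VVNN")})
probs = {a: polya_sequence_prob(a) for a in assignments}
```

Because every ordering with the same counts has the same probability, a sampler only needs one number per count, which is what the next slide exploits.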

Sampling same-type variables

Goal: sample an assignment of a set of same-type variables

Algorithm:

1. Choose m ∈ {0, . . . , B} with prob. ∝ $\binom{B}{m}\, p_m$

2. Choose assignment uniformly from column m

m = 0 (p0): [VVVV]

m = 1 (p1): [VVVN] [VVNV] [VNVV] [NVVV]

m = 2 (p2): [VVNN] [VNVN] [VNNV] [NVVN] [NVNV] [NNVV]

m = 3 (p3): [VNNN] [NVNN] [NNVN] [NNNV]

m = 4 (p4): [NNNN]
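
The two steps can be sketched as follows. This is a minimal illustration, assuming the per-assignment probabilities p_0, ..., p_B are already available as a list `p` (the function name and the example values are hypothetical; how p_m is computed for a real model is not shown here).

```python
import math
import random


def sample_same_type_assignment(p, rng):
    """p[m] is the (unnormalised) probability of any one assignment with
    m variables set to N; len(p) == B + 1. Returns a string over {V, N}."""
    B = len(p) - 1
    # Step 1: choose m with probability proportional to C(B, m) * p[m].
    weights = [math.comb(B, m) * p[m] for m in range(B + 1)]
    m = rng.choices(range(B + 1), weights=weights, k=1)[0]
    # Step 2: choose uniformly among the C(B, m) assignments in column m
    # by picking which m positions receive N.
    n_positions = set(rng.sample(range(B), m))
    return "".join("N" if i in n_positions else "V" for i in range(B))


rng = random.Random(0)
p = [0.05, 0.02, 0.01, 0.02, 0.05]  # hypothetical p_0..p_4 for B = 4
draws = [sample_same_type_assignment(p, rng) for _ in range(1000)]
```

The cost is one weight per value of m rather than one per assignment, so B + 1 terms instead of 2^B.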

Full algorithm

Iterate:

1. Choose a position
   e.g., 2

2. Add variables with same type
   e.g., (D, stocks, V)

3. Sample m
   e.g., 1 ⇒ {V, V, V, N}

4. Sample assignment
   e.g., [VNVV]

· · · D V V · · ·   many stocks rose

· · · D N V · · ·   tech stocks were

· · · D V V · · ·   the stocks have

· · · D V V · · ·   how stocks from
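
Step 2 can be sketched on the four example contexts. The helper names below are hypothetical, the corpus is just the slide's three-word windows, and the sketch ignores any restriction on which same-type positions may be updated together.

```python
from collections import defaultdict


def hmm_type(tags, words, i):
    """Type of the tag at position i: (previous tag, word, next tag)."""
    return (tags[i - 1], words[i], tags[i + 1])


def group_by_type(corpus):
    """Map each type to the list of (sentence index, position) pairs having it."""
    groups = defaultdict(list)
    for s, (tags, words) in enumerate(corpus):
        # Only interior positions have both a left and a right neighbour here.
        for i in range(1, len(words) - 1):
            groups[hmm_type(tags, words, i)].append((s, i))
    return groups


corpus = [
    (["D", "V", "V"], ["many", "stocks", "rose"]),
    (["D", "N", "V"], ["tech", "stocks", "were"]),
    (["D", "N", "V"], ["the", "stocks", "have"]),
    (["D", "V", "V"], ["how", "stocks", "from"]),
]
same_type = group_by_type(corpus)[("D", "stocks", "V")]
```

All four "stocks" positions share the type (D, stocks, V) regardless of their current tag, so they are resampled as one block.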

Experimental setup: sampling algorithms

Example corpus (number before each word: its current tag):

2 tech   4 stocks   3 rose   1 in   8 heavy   2 trading

4 worker   6 demands   8 meeting   1 resistance

2 many   3 stocks   3 shot   8 up   6 today

5 investors   2 await   4 stocks   3 news

Token-based sampler (Token)
1. Choose token
2. Update conditioned on rest

Sentence-based sampler (Sentence)
1. Choose sentence
2. Update conditioned on rest

Type-based sampler (Type)
1. Choose type
2. Update conditioned on rest

Experimental setup: models/tasks/datasets

Hidden Markov model (HMM)

NN NNS VBG NN
worker demands meeting resistance

Part-of-speech induction

Dataset: WSJ (49K sentences, 45 tags)

Unigram segmentation model (USM) [Goldwater et al., 2006]

l o o k a t t h e b o o k

Word segmentation

Dataset: CHILDES (9.7K sentences)

Probabilistic tree-substitution grammar (PTSG) [Cohn et al., 2009]

(S (NP (DT the) (NN sun)) (VP (VBD has) (VBN risen)))

Dataset: WSJ (49K sentences, 45 tags)

Token versus Type

[Figure: Token and Type curves for HMM, USM, and PTSG. Panels: log-likelihood vs. time for HMM (hr.), USM (min.), and PTSG (hr.); tag accuracy vs. time (hr.) for HMM; word token F1 vs. time (min.) for USM.]

Sensitivity to initialization

η = 0 (use few params.)

V V V V
worker demands meeting resistance

l o o k a t t h e b o o k

η = 1 (use many params.)

P N D V
worker demands meeting resistance

l o o k a t t h e b o o k

Sensitivity to initialization

[Figure: Token and Type curves plotted against η (0.2 to 1.0). Panels: log-likelihood for HMM, USM, and PTSG; tag accuracy for HMM; word token F1 for USM.]

Type less sensitive than Token

Alternatives

Can we get the gains of Type via simpler means?

• Annealing (Tokenanneal)

– Use $p(z \mid x)^{1/T}$, temperature T from 5 to 1
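
The tempering step can be sketched as below. The slide gives only the endpoints of the temperature range; the linear schedule and both function names are assumptions made for illustration.

```python
def anneal(probs, T):
    """Raise each probability to the power 1/T and renormalise.
    T = 1 leaves the distribution unchanged; larger T flattens it."""
    powered = [p ** (1.0 / T) for p in probs]
    z = sum(powered)
    return [p / z for p in powered]


def linear_schedule(step, num_steps, t_start=5.0, t_end=1.0):
    """Assumed linear cooling from t_start at step 0 to t_end at the last step."""
    frac = step / max(num_steps - 1, 1)
    return t_start + frac * (t_end - t_start)


# Example: a sharply peaked conditional becomes flatter at T = 5.
hot = anneal([0.9, 0.1], linear_schedule(0, 10))
cold = anneal([0.9, 0.1], linear_schedule(9, 10))
```

At high temperature the sampler accepts unlikely values more often, which is the extra random mobility that annealing provides.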

Annealing

[Figure: Token, Tokenanneal, and Type curves for HMM, USM, and PTSG. Panels: log-likelihood vs. time for HMM (hr.), USM (min.), and PTSG (hr.); tag accuracy vs. time (hr.) for HMM; word token F1 vs. time (min.) for USM.]

HMM: [anneal. hurts]   USM: [anneal. helps]   PTSG: [no effect]

Alternatives

Can we get the gains of Type via simpler means?

• Annealing (Tokenanneal)

– Use $p(z \mid x)^{1/T}$, temperature T from 5 to 1

– More (random) mobility through space, but insufficient

• Sentence-based (Sentence)

Sentence

USM:

[Figure: Token, Type, and Sentence curves. Panels: log-likelihood vs. time (min.); word token F1 vs. time (min.).]

Sentence performs comparably to Token, but worse than Type

Sentence requires dynamic programming, computationally more expensive than Token and Type
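
The slide does not spell out the dynamic program. One standard choice for resampling a whole sentence in an HMM is forward filtering followed by backward sampling; the sketch below assumes explicit, fixed parameters (`init`, `trans`, `emit` are hypothetical names), which is a simplification and may differ from the sampler used in the experiments.

```python
import random


def ffbs(obs, init, trans, emit, rng):
    """Sample a full state sequence for one sentence given fixed HMM parameters.
    init[k], trans[j][k], emit[k][o] are probabilities; obs is a list of symbol ids."""
    K = len(init)
    # Forward pass: alpha[t][k] is proportional to p(z_t = k, obs[0..t]).
    prev = [init[k] * emit[k][obs[0]] for k in range(K)]
    total = sum(prev)
    prev = [a / total for a in prev]
    alpha = [prev]
    for o in obs[1:]:
        cur = [emit[k][o] * sum(prev[j] * trans[j][k] for j in range(K))
               for k in range(K)]
        total = sum(cur)
        cur = [a / total for a in cur]
        alpha.append(cur)
        prev = cur
    # Backward pass: sample the last state, then each earlier state given its successor.
    z = [0] * len(obs)
    z[-1] = rng.choices(range(K), weights=alpha[-1], k=1)[0]
    for t in range(len(obs) - 2, -1, -1):
        w = [alpha[t][j] * trans[j][z[t + 1]] for j in range(K)]
        z[t] = rng.choices(range(K), weights=w, k=1)[0]
    return z


# Toy check: emissions that identify the state exactly force the sample.
states = ffbs([0, 1, 1, 0], [0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]],
              [[1.0, 0.0], [0.0, 1.0]], random.Random(0))
```

Each sentence update costs on the order of K² operations per position, compared with a single conditional for a token update.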

Alternatives

Can we get the gains of Type via simpler means?

• Annealing (Tokenanneal)

– Use $p(z \mid x)^{1/T}$, temperature T from 5 to 1

– More (random) mobility through space, but insufficient

• Sentence-based (Sentence)

– Sentence handles structural dependencies

– Type handles parameter dependencies

– Intuition: parameter dependencies more important in unsupervised learning

Page 138: Type-Based MCMCpliang/papers/type-naacl2010-talk.pdf · Type-Based MCMC NAACL 2010 { Los Angeles Percy Liang Michael I. Jordan Dan Klein. Type-Based MCMC NAACL 2010 { Los Angeles

Summary and outlook

General strategy: update many dependent variables tractably

Variables sharing parameters very dependent

Type-based sampler updates exactly these variables

Techniques for tractability:

Old: exploit dynamic programming structureNew: think about exchangeability

Tokens versus types:

• Older work operate on types (e.g., model merging)Larger updates, but greedy and brittle

• Recent methods operate on sentences or tokensSmaller updates, but softer and more robust

24

Page 139: Type-Based MCMCpliang/papers/type-naacl2010-talk.pdf · Type-Based MCMC NAACL 2010 { Los Angeles Percy Liang Michael I. Jordan Dan Klein. Type-Based MCMC NAACL 2010 { Los Angeles

Summary and outlook

General strategy: update many dependent variables tractably

Variables sharing parameters very dependent

Type-based sampler updates exactly these variables

Techniques for tractability:

Old: exploit dynamic programming structure

New: think about exchangeability
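To make the exchangeability idea concrete, here is a minimal sketch on a toy model that is an assumption for illustration, not the model from the talk: n binary variables share one Bernoulli parameter with a Beta(a, b) prior, and the parameter is integrated out. The joint probability of an assignment then depends only on m, the number of variables set to 1, so all n variables can be resampled jointly by first drawing m and then choosing which m sites get the value 1 uniformly at random. The names `count_distribution` and `sample_block` are hypothetical.

```python
import math
import random


def log_beta(x, y):
    """Log of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def count_distribution(n, a=1.0, b=1.0):
    """Distribution over m = number of sites set to 1 (toy Beta-Bernoulli model).

    By exchangeability, every assignment with m ones has the same probability
    B(a + m, b + n - m) / B(a, b), and there are C(n, m) such assignments.
    """
    log_w = [
        math.log(math.comb(n, m)) + log_beta(a + m, b + n - m) - log_beta(a, b)
        for m in range(n + 1)
    ]
    top = max(log_w)  # subtract the max before exponentiating, for stability
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def sample_block(n, a=1.0, b=1.0, rng=random):
    """Jointly resample all n variables: draw m, then pick m sites uniformly."""
    probs = count_distribution(n, a, b)
    m = rng.choices(range(n + 1), weights=probs)[0]
    ones = set(rng.sample(range(n), m))
    return [1 if i in ones else 0 for i in range(n)]


if __name__ == "__main__":
    print(count_distribution(4))
    print(sample_block(10, rng=random.Random(0)))
```

The cost is linear in n for the count distribution rather than exponential in n for enumerating assignments, which is the sense in which exchangeability gives tractability in this sketch.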

Tokens versus types:

• Older work operates on types (e.g., model merging)

– Larger updates, but greedy and brittle

• Recent methods operate on sentences or tokens

– Smaller updates, but softer and more robust

Type-based sampling combines advantages of both

Thank you!
