Source: verbs.colorado.edu/~mpalmer/Ling7800/topic_modeling_slides.pdf (4/24/2013)

An Introduction to Topic Modeling

Daniel W. Peterson

Department of Computer Science

University of Colorado at Boulder

[email protected]

April 24, 2013

Daniel Peterson (University of Colorado) Introduction to the HDP April 24, 2013 1 / 20

Latent Semantic Analysis

Documents x Terms matrix: large and sparse

Use SVD to decompose it into three matrices

Keep only the “important” dimensions

Assumptions:

Word order doesn’t matter

Words are orthogonal dimensions in a high-dimensional space
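The decomposition described on this slide can be sketched as follows. This is a minimal illustration assuming NumPy; the count matrix `X` and the choice `k = 2` are made up for the example and do not come from the slides.

```python
import numpy as np

# Toy documents-by-terms count matrix (rows: documents, columns: terms).
# The numbers are invented for illustration only.
X = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
], dtype=float)

# SVD decomposes X into three matrices: U, diag(s), and Vt.
# Singular values in s come back sorted from largest to smallest.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep only the k "important" dimensions (the largest singular values).
k = 2
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Low-dimensional representation of each document.
doc_vectors = U[:, :k] * s[:k]
```

Real document-term matrices are large and sparse, so a truncated sparse solver would usually replace the dense `np.linalg.svd` call used here.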

Probabilistic Latent Semantic Analysis

Documents are generated by a probabilistic process

Structure based on topics

Different topics make different words more likely

Assumptions:

Word order doesn’t matter

Each word is chosen as the result of exactly one topic

Probabilistic Latent Semantic Analysis

N documents

A document is L words long

Each entry has an assignment to one of K topics

Probabilistic Latent Semantic Analysis

How do we choose a topic?

We sample from a distribution over topics.

How do we choose a word?

We sample from a distribution over words.

Multinomial Distribution

Select one of several possible outcomes

Outcomes may be equally likely (like dice)

OR: some outcomes may be more likely than others (load the dice)

Looks like: a 1 × n vector of probabilities

$[x_1, x_2, \ldots, x_n]$

$x_1 + x_2 + \cdots + x_n = 1$

every $x_i > 0$

A sample looks like: a number

The outcome of rolling the dice

Probability we get $i$ is given by $x_i$
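The dice analogy can be tried directly. The sketch below assumes NumPy; the probabilities in `loaded` are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two 1 x n probability vectors over the six faces of a die.
fair = np.full(6, 1 / 6)                              # equally likely outcomes
loaded = np.array([0.05, 0.05, 0.1, 0.1, 0.2, 0.5])   # loaded toward the last face

# A single sample is one number: the index of the outcome (0..5 here).
roll = rng.choice(6, p=loaded)

# Over many rolls, the observed frequency of outcome i approaches x_i.
rolls = rng.choice(6, size=20_000, p=loaded)
freqs = np.bincount(rolls, minlength=6) / rolls.size
```

Printing `freqs` next to `loaded` shows how closely the frequencies track the probability vector.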

Probabilistic Latent Semantic Analysis

θ is a distribution over topics in a document

One θ for each document

θ is a 1 × K vector

Sum of θ is 1

φ is a distribution over words in a topic

One φ for each topic

φ is a 1 × W vector

Sum of φ is 1
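Given θ for one document and φ for each topic, the generative process from the earlier slides (choose a topic, then choose a word) might look like the sketch below. The values of `K`, `W`, `L`, `theta`, and `phi` are assumptions picked for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

K, W, L = 2, 5, 8   # topics, vocabulary size, document length (all assumed)

# One theta per document: a 1 x K distribution over topics.
theta = np.array([0.7, 0.3])

# One phi per topic: row k is a 1 x W distribution over words.
phi = np.array([
    [0.40, 0.40, 0.10, 0.05, 0.05],
    [0.05, 0.05, 0.10, 0.40, 0.40],
])

# For each of the L positions, sample a topic from theta,
# then sample a word from that topic's phi.
topics = rng.choice(K, size=L, p=theta)
words = np.array([rng.choice(W, p=phi[k]) for k in topics])
```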

Probabilistic Latent Semantic Analysis

Fold θ into graphical model

Where do θ and φ come from?

Topic Modeling

Sample θ and φ from an appropriate distribution

Dirichlet: a distribution over distributions

Incorporating a Dirichlet prior provides smoothing

Dirichlet Distribution

Takes n parameters $\alpha_1, \alpha_2, \ldots, \alpha_n$

Distribution over 1 × n vectors with sum of 1

$\alpha_i$ are called concentration parameters
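To get a feel for what the concentration parameters do, the sketch below draws samples from symmetric Dirichlet distributions (every $\alpha_i$ equal) using NumPy. The three α values and the summary statistic `mean_largest` are choices made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 3
mean_largest = {}
for alpha in (0.2, 1.0, 10.0):
    # Symmetric Dirichlet: every alpha_i takes the same value.
    samples = rng.dirichlet(np.full(n, alpha), size=2000)

    # Each sample (row) is a 1 x n vector with sum of 1.
    assert np.allclose(samples.sum(axis=1), 1.0)

    # Average size of the largest entry: close to 1 means most of the
    # mass sits on one outcome, close to 1/n means nearly uniform vectors.
    mean_largest[alpha] = samples.max(axis=1).mean()
```

Small α tends to give vectors dominated by a single outcome, while large α pulls samples toward the uniform vector.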

Dirichlet Distribution with 2 Parameters

Figure: Image source: Wikipedia

Dirichlet Distribution with 3 Parameters

Figure: Image source: Yee Whye Teh

A Sample from a Dirichlet

A particular 1 × n vector with sum of 1

$[x_1, x_2, \ldots, x_n]$ such that $x_1 + x_2 + \cdots + x_n = 1$

every $x_i > 0$

A multinomial distribution

Topic Modeling

Sample θ and φ from a Dirichlet distribution

This is important for when we turn the model around:

Dirichlet distribution is conjugate prior of multinomial:

Given a Dirichlet prior, and counts of topic assignments, the posterior is also Dirichlet

β and γ are smoothing parameters
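Conjugacy makes the posterior update a matter of adding counts to the prior parameters. The sketch below assumes a symmetric prior and reads β as the prior on θ, which matches the Kβ term in the collapsed sampling formula later in the deck. The counts and the value of β are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

K = 3
beta = 0.5                            # symmetric smoothing parameter (assumed value)
topic_counts = np.array([8, 1, 0])    # hypothetical topic assignments in one document

# Dirichlet prior + multinomial counts -> Dirichlet posterior,
# with parameters equal to prior + counts.
posterior_alpha = beta + topic_counts

# Posterior mean of theta: smoothed relative frequencies.
# The unseen topic still gets non-zero probability.
theta_mean = posterior_alpha / posterior_alpha.sum()

# One posterior draw of theta, as a sampler would use it.
theta_sample = rng.dirichlet(posterior_alpha)
```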

Inference

Generative model explains how the data was created

Inference: trying to guess model parameters

Gibbs Sampling

Hard to determine most likely model parameters

Hard for even relatively likely parameters

Can’t sample from overall distribution: sample instead a single variable

Take a walk through distribution

One step (parameter) at a time

Spend more time walking around more likely areas

We can get to likely areas from anywhere

It doesn’t matter where we start!
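The walk can be illustrated on a toy problem with two discrete variables. In the sketch below the joint distribution is known (and invented for the example), so Gibbs sampling is not actually needed; the point is only to show the mechanics of resampling one variable at a time.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical joint distribution over a (rows) and b (columns).
joint = np.array([
    [0.30, 0.10, 0.05],
    [0.05, 0.20, 0.30],
])

a, b = 0, 0                       # arbitrary starting point
visits = np.zeros_like(joint)

for step in range(20_000):
    # Resample a from P(a | b): one column of the joint, renormalized.
    p_a = joint[:, b] / joint[:, b].sum()
    a = rng.choice(2, p=p_a)

    # Resample b from P(b | a): one row of the joint, renormalized.
    p_b = joint[a, :] / joint[a, :].sum()
    b = rng.choice(3, p=p_b)

    visits[a, b] += 1

# Fraction of time spent in each state: more likely areas are visited more.
empirical = visits / visits.sum()
```

In practice the first steps are often discarded as burn-in; this sketch keeps every step for simplicity.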

Gibbs Sampling in a Topic Model

Start with random assignment of topics

For each <word, document> pair:

Sample θ based on counts and prior

Sample φ based on counts and prior

Choose k based on θ, φ, and w

Repeat the above many times

Smoothing (β and γ) very important

Bayes Rule

$P(k \mid \beta, X) \propto P(k \mid \beta)\, P(X \mid k)$

Sampling from a conditional distribution can be broken down into sampling based on the parent nodes (prior, β) and the children (likelihood, X)
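As a worked example of prior times likelihood, the sketch below computes the distribution over topics for a single word. It assumes the prior term is the document's θ and the likelihood term is φ evaluated at the observed word, as in the "Choose k based on θ, φ, and w" step; all numbers are invented.

```python
import numpy as np

# Prior over K = 3 topics in this document (hypothetical values).
theta_d = np.array([0.6, 0.3, 0.1])

# phi[k, w]: probability of word w under topic k (hypothetical values).
phi = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.1, 0.8],
    [0.3, 0.4, 0.3],
])

w = 2   # index of the observed word

# Prior times likelihood, then normalize so the result sums to 1.
unnormalized = theta_d * phi[:, w]
posterior = unnormalized / unnormalized.sum()
```

Here topic 1 ends up most probable even though topic 0 has the larger prior, because the observed word is much more likely under topic 1.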

Blocked Gibbs Sampling in a Topic Model

Start with random assignment of topics

Repeat many times:

Sample all θ and φ from counts and prior

Choose k for a number of <word, document> pairs

More sampling, less counting
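One way to read these steps in code is sketched below, assuming NumPy, symmetric priors, and that every <word, document> pair is resampled in each sweep. The corpus `docs`, the values of `beta` and `gamma`, and the number of sweeps are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical corpus: each document is a list of word ids in 0..W-1.
docs = [[0, 1, 0, 1, 2], [3, 4, 3, 4, 2], [0, 1, 1, 0], [4, 3, 3, 4]]
K, W = 2, 5
beta, gamma = 0.5, 0.5    # smoothing for theta and phi (assumed values)

# Start with a random assignment of topics.
z = [rng.integers(K, size=len(doc)) for doc in docs]

for sweep in range(200):
    # Count topic use per document and word use per topic.
    n_dk = np.zeros((len(docs), K))
    n_kw = np.zeros((K, W))
    for d, doc in enumerate(docs):
        for w, k in zip(doc, z[d]):
            n_dk[d, k] += 1
            n_kw[k, w] += 1

    # Sample all theta and phi from counts and prior (Dirichlet posteriors).
    theta = np.array([rng.dirichlet(beta + n_dk[d]) for d in range(len(docs))])
    phi = np.array([rng.dirichlet(gamma + n_kw[k]) for k in range(K)])

    # Choose k for every <word, document> pair given theta and phi.
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            p = theta[d] * phi[:, w]
            z[d][i] = rng.choice(K, p=p / p.sum())
```

Because θ and φ are held fixed while topics are chosen, the counts only need to be rebuilt once per sweep rather than after every single assignment.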

Collapsed Gibbs Sampling in a Topic Model

Integrate out θ and φ

Start with random assignment of topics

For each <word, document> pair:

Sample k directly from counts

Repeat many times

$$P(z_i = k \mid z_{-i}, w) \propto \frac{n^{(w_i)}_{-i,k} + \gamma}{n^{(\cdot)}_{-i,k} + W\gamma} \cdot \frac{n^{(d_i)}_{-i,k} + \beta}{n^{(d_i)}_{-i,\cdot} + K\beta}$$
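A compact sketch of the collapsed sampler follows, assuming NumPy and symmetric priors. The corpus, the values of `beta` and `gamma`, and the sweep count are hypothetical. The document-side denominator of the formula does not depend on k, so the code drops it before normalizing.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical corpus: each document is a list of word ids in 0..W-1.
docs = [[0, 1, 0, 1, 2], [3, 4, 3, 4, 2], [0, 1, 1, 0], [4, 3, 3, 4]]
K, W = 2, 5
beta, gamma = 0.5, 0.5    # assumed smoothing values

# Random initial assignment, plus the count tables it implies.
z = [rng.integers(K, size=len(doc)) for doc in docs]
n_dk = np.zeros((len(docs), K))   # topic counts per document
n_kw = np.zeros((K, W))           # word counts per topic
n_k = np.zeros(K)                 # total words assigned to each topic
for d, doc in enumerate(docs):
    for w, k in zip(doc, z[d]):
        n_dk[d, k] += 1
        n_kw[k, w] += 1
        n_k[k] += 1

for sweep in range(200):
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            # Remove token i from the counts (the "-i" subscript).
            k_old = z[d][i]
            n_dk[d, k_old] -= 1
            n_kw[k_old, w] -= 1
            n_k[k_old] -= 1

            # Sample k directly from counts; theta and phi never appear.
            p = (n_kw[:, w] + gamma) / (n_k + W * gamma) * (n_dk[d] + beta)
            k_new = rng.choice(K, p=p / p.sum())

            # Put the token back under its new topic.
            z[d][i] = k_new
            n_dk[d, k_new] += 1
            n_kw[k_new, w] += 1
            n_k[k_new] += 1
```

If estimates of θ and φ are wanted afterwards, they can be read off the final counts as smoothed relative frequencies.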
