Online Learning for Latent Dirichlet Allocation

Matthew D. Hoffman, David M. Blei and Francis Bach

NIPS 2010

Presented by Lingbo Li

Latent Dirichlet Allocation (LDA)

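1) Draw each topic $\beta_k \sim \mathrm{Dirichlet}(\eta)$, for $k = 1, \dots, K$
2) For each document $d$:
   1) Draw topic proportions $\theta_d \sim \mathrm{Dirichlet}(\alpha)$
   2) For each word $i$:
      1) Draw a topic assignment $z_{di} \sim \mathrm{Multinomial}(\theta_d)$
      2) Draw the word $w_{di} \sim \mathrm{Multinomial}(\beta_{z_{di}})$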
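
The generative process above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name `sample_lda_corpus`, the fixed document length, and the default hyperparameter values are assumptions made for the example, not part of the slides.

```python
import numpy as np

def sample_lda_corpus(num_docs, doc_len, num_topics, vocab_size,
                      alpha=0.5, eta=0.5, seed=0):
    """Sample a toy corpus from the LDA generative process (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # 1) Draw each topic beta_k ~ Dirichlet(eta) over the vocabulary.
    beta = rng.dirichlet(np.full(vocab_size, eta), size=num_topics)
    thetas, docs = [], []
    for _ in range(num_docs):
        # 2.1) Draw topic proportions theta_d ~ Dirichlet(alpha).
        theta = rng.dirichlet(np.full(num_topics, alpha))
        # 2.2.1) Draw a topic assignment z_di for every word position.
        z = rng.choice(num_topics, size=doc_len, p=theta)
        # 2.2.2) Draw each word w_di from its assigned topic beta_{z_di}.
        words = np.array([rng.choice(vocab_size, p=beta[k]) for k in z])
        thetas.append(theta)
        docs.append(words)
    return beta, np.array(thetas), docs
```

For example, `sample_lda_corpus(5, 20, 3, 30)` returns a 3 x 30 topic matrix, a 5 x 3 matrix of topic proportions, and five documents of 20 word ids each.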

Batch variational Bayes for LDA

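For a collection of documents, infer:
• Per-word topic assignments $z_{di}$
• Per-document topic proportions $\theta_d$
• Topic distributions $\beta_k$

The true posterior is approximated by a fully factorized distribution $q$:
• $q(z_{di} = k) = \phi_{d w_{di} k}$
• $q(\theta_d) = \mathrm{Dirichlet}(\theta_d; \gamma_d)$
• $q(\beta_k) = \mathrm{Dirichlet}(\beta_k; \lambda_k)$

Optimize over the variational parameters $\phi$, $\gamma$ and $\lambda$ to maximize the evidence lower bound (ELBO) on the log marginal likelihood.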
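
As a rough illustration of the per-document part of this optimization, the sketch below runs the standard coordinate-ascent E step for one document with the topics held fixed. The names `e_step` and `digamma`, the convergence threshold, and the bag-of-words input layout are assumptions for the example; the small `digamma` helper is included only to avoid depending on SciPy.

```python
import numpy as np

def digamma(x):
    """Digamma via recurrence plus an asymptotic series (valid for x > 0)."""
    x = np.array(x, dtype=float)
    out = np.zeros_like(x)
    # Shift small arguments upward using psi(x) = psi(x + 1) - 1/x.
    while np.any(x < 6.0):
        small = x < 6.0
        out = np.where(small, out - 1.0 / x, out)
        x = np.where(small, x + 1.0, x)
    inv2 = 1.0 / (x * x)
    return out + np.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))

def e_step(counts, lam, alpha, max_iter=100, tol=1e-3):
    """Fit gamma (K,) and phi (K x W) for one document.

    counts: length-W vector of word counts; lam: K x W topic parameters.
    """
    num_topics = lam.shape[0]
    # E_q[log beta_kw] under q(beta_k) = Dirichlet(lam_k).
    e_log_beta = digamma(lam) - digamma(lam.sum(axis=1, keepdims=True))
    gamma = np.ones(num_topics)
    for _ in range(max_iter):
        e_log_theta = digamma(gamma) - digamma(gamma.sum())
        # phi_wk is proportional to exp(E[log theta_k] + E[log beta_kw]).
        log_phi = e_log_theta[:, None] + e_log_beta
        log_phi -= log_phi.max(axis=0, keepdims=True)  # numerical stability
        phi = np.exp(log_phi)
        phi /= phi.sum(axis=0, keepdims=True)
        # gamma_k = alpha + sum_w n_w * phi_wk.
        new_gamma = alpha + phi @ counts
        converged = np.mean(np.abs(new_gamma - gamma)) < tol
        gamma = new_gamma
        if converged:
            break
    return gamma, phi
```

Batch variational Bayes alternates this step over every document with an update of $\lambda$ from the accumulated statistics, so each pass touches the whole corpus.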

Online variational inference for LDA

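• Mini-batches: at step $t$, run the E step on a mini-batch of $S$ documents (out of $D$ in total), form $\tilde{\lambda}_{kw} = \eta + \frac{D}{S} \sum_{s} n_{tsw} \phi_{tskw}$, where $n_{tsw}$ is the count of word $w$ in document $s$ of the mini-batch, and update the topic parameters $\lambda \leftarrow (1 - \rho_t) \lambda + \rho_t \tilde{\lambda}$ with step size $\rho_t = (\tau_0 + t)^{-\kappa}$, $\kappa \in (0.5, 1]$.

• Hyperparameter estimation: $\alpha \leftarrow \alpha - \rho_t \tilde{\alpha}(\gamma_t)$, where $\tilde{\alpha}(\gamma_t)$ is the inverse Hessian times the gradient of the per-document bound with respect to $\alpha$; $\eta$ is updated analogously.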
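
The mini-batch update of the topic parameters can be sketched as a weighted blend of the old value and a mini-batch estimate. This is a minimal sketch under stated assumptions: `learning_rate` and `online_update` are hypothetical names, `sstats` is assumed to already hold the K x W sufficient statistics (the sum over mini-batch documents of word count times per-word topic responsibility), and the default `tau0` and `kappa` are illustrative values only.

```python
import numpy as np

def learning_rate(t, tau0=1.0, kappa=0.7):
    """Step size rho_t = (tau0 + t) ** (-kappa); kappa in (0.5, 1] is assumed."""
    return (tau0 + t) ** (-kappa)

def online_update(lam, sstats, t, num_docs_total, batch_size, eta,
                  tau0=1.0, kappa=0.7):
    """Blend the current topics with the estimate from one mini-batch.

    lam, sstats: K x W arrays; num_docs_total is the corpus size D,
    batch_size is the mini-batch size S.
    """
    rho = learning_rate(t, tau0, kappa)
    # Estimate of lambda as if the whole corpus looked like this mini-batch.
    lam_tilde = eta + (num_docs_total / batch_size) * sstats
    return (1.0 - rho) * lam + rho * lam_tilde
```

A larger `tau0` damps the first few updates, while `kappa` controls how quickly earlier mini-batches are forgotten.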

Analysis of convergence

• Stochastic gradient algorithms can be sped up by multiplying the gradient by the inverse of an appropriate positive definite matrix H.

• H here: the Fisher information matrix of the variational distribution q, which makes the update a natural gradient step.

Experiments

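Use perplexity on held-out data as a measure of model fit:

$\mathrm{perplexity}(n^{\mathrm{test}}, \lambda, \alpha) = \exp\left\{ -\frac{\sum_i \log p(n_i^{\mathrm{test}} \mid \alpha, \beta)}{\sum_{i,w} n_{iw}^{\mathrm{test}}} \right\}$

• The per-document variational parameters $\gamma_i$ and $\phi_i$ are fit using the E step in algorithm 2.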
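
Given a per-document log-likelihood (or a variational lower bound on it) for each held-out document, perplexity is the exponential of the negative per-token average. The sketch below assumes those per-document values have already been computed; the function name `perplexity` and its argument layout are illustrative.

```python
import math

def perplexity(doc_log_likelihoods, doc_token_counts):
    """exp(-(total log-likelihood) / (total number of tokens)).

    doc_log_likelihoods: one log p(document) value (or lower bound) per document.
    doc_token_counts: the number of tokens in each of those documents.
    """
    total_tokens = sum(doc_token_counts)
    if total_tokens <= 0:
        raise ValueError("held-out set must contain at least one token")
    return math.exp(-sum(doc_log_likelihoods) / total_tokens)
```

Lower is better. When a lower bound on the log-likelihood is plugged in, the result is an upper bound on the true perplexity.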

• Two corpora: 352,549 documents from the journal Nature, and 100,000 documents from the English version of Wikipedia.

• For each corpus, set aside a 1,000-document test set and a separate 1,000-document validation set.

• Run online LDA for five hours on the remaining documents from each corpus, for a range of learning-parameter settings.

Evaluating learning parameters

Compare batch and online on fixed corpora:

True online

