
lda2vec Text by the Bay 2016

Date post: 15-Apr-2017
Upload: christopher-moody
lda2vec (word2vec, and lda) Christopher Moody @ Stitch Fix
Transcript
Page 1: lda2vec Text by the Bay 2016

lda2vec (word2vec, and lda)

Christopher Moody @ Stitch Fix

Page 2: lda2vec Text by the Bay 2016

About

@chrisemoody
Caltech Physics
PhD in astrostats supercomputing
sklearn t-SNE contributor
Data Labs at Stitch Fix
github.com/cemoody

Gaussian Processes t-SNE

chainer deep learning

Tensor Decomposition

Page 3: lda2vec Text by the Bay 2016

1. word2vec
2. lda
3. lda2vec

Page 4: lda2vec Text by the Bay 2016

word2vec

1. king - man + woman = queen
2. Huge splash in NLP world
3. Learns from raw text
4. Pretty simple algorithm
5. Comes pretrained

Page 5: lda2vec Text by the Bay 2016

word2vec

1. Set up an objective function
2. Randomly initialize vectors
3. Do gradient descent

Page 6: lda2vec Text by the Bay 2016

word2vec

word2vec: learn a word vector w from its surrounding context

Page 7: lda2vec Text by the Bay 2016

word2vec

“The fox jumped over the lazy dog”

Maximize the likelihood of seeing the words given the word over.

P(the|over) P(fox|over) P(jumped|over) P(the|over) P(lazy|over) P(dog|over)

…instead of maximizing the likelihood of co-occurrence counts.

Page 8: lda2vec Text by the Bay 2016

word2vec

P(fox|over)

What should this be?

Page 9: lda2vec Text by the Bay 2016

word2vec

P(fox|over)

Should depend on the word vectors.

P(v_fox|v_over)

Page 10: lda2vec Text by the Bay 2016

word2vec

“The fox jumped over the lazy dog”

P(w|c)

Extract pairs from context window around every input word.
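
The pair extraction described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function name `skipgram_pairs` and the `window` parameter are assumptions, and the window is set to 3 so that the pairs for "over" match the P(·|over) terms listed on the earlier slide.

```python
def skipgram_pairs(tokens, window=3):
    """Return (input word, nearby word) pairs for every position in tokens."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not paired with itself
                pairs.append((center, tokens[j]))
    return pairs


sentence = "The fox jumped over the lazy dog".lower().split()
pairs = skipgram_pairs(sentence, window=3)

# All pairs whose input word is "over"
over_pairs = [nearby for center, nearby in pairs if center == "over"]
print(over_pairs)
```

A smaller window produces fewer, more local pairs; the right size depends on the corpus.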

Page 24: lda2vec Text by the Bay 2016

objective

How should we define P(w|c)?

Measure loss between w and c?

Page 25: lda2vec Text by the Bay 2016

objective

How should we define P(w|c)?

Measure loss between w and c?

w · c

Page 26: lda2vec Text by the Bay 2016

word2vec objective

w · c ~ 1

v_canada · v_snow ~ 1

[Diagram: vectors w and c]

Page 27: lda2vec Text by the Bay 2016

word2vec objective

w · c ~ 0

v_canada · v_desert ~ 0

[Diagram: vectors w and c]

Page 28: lda2vec Text by the Bay 2016

word2vec objective

w · c ~ -1

[Diagram: vectors w and c]

Page 29: lda2vec Text by the Bay 2016

word2vec objective

w · c ∈ [-1,1]

Page 30: lda2vec Text by the Bay 2016

word2vec objective

w · c ∈ [-1,1]

But we’d like to measure a probability.

Page 31: lda2vec Text by the Bay 2016

word2vec objective

But we’d like to measure a probability.

σ(c·w) ∈ [0,1]

Page 32: lda2vec Text by the Bay 2016

word2vec objective

But we’d like to measure a probability.

σ(c·w) ∈ [0,1]

[Diagram: two pairs of vectors w and c, one labeled Similar and one labeled Dissimilar]

Page 33: lda2vec Text by the Bay 2016

word2vec objective

Loss function:

L = σ(c·w)

Logistic (binary) choice. Is the (context, word) combination from our dataset?
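
As a small illustration of the step from a dot product to a probability, the sketch below applies the logistic function σ to c·w for the three cases on the preceding slides. The helper names (`dot`, `sigmoid`, `pair_probability`) and the two-dimensional unit vectors are assumptions made for the example.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pair_probability(c, w):
    """sigma(c . w): how plausible is it that (c, w) came from the dataset?"""
    return sigmoid(dot(c, w))


aligned = pair_probability([1.0, 0.0], [1.0, 0.0])     # c.w = 1
orthogonal = pair_probability([1.0, 0.0], [0.0, 1.0])  # c.w = 0
opposed = pair_probability([1.0, 0.0], [-1.0, 0.0])    # c.w = -1
print(aligned, orthogonal, opposed)
```

Note that with unit vectors the dot product stays in [-1, 1], so σ only reaches part of (0, 1); unnormalized vectors can push the probability closer to 0 or 1.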

Page 34: lda2vec Text by the Bay 2016

word2vec objective

The skip-gram negative-sampling model

L = σ(c·w)

Trivial solution is that context = word for all vectors

Page 35: lda2vec Text by the Bay 2016

word2vec objective

The skip-gram negative-sampling model

L = σ(c·w) + σ(-c·w_neg)

Draw random words in vocabulary.

Page 36: lda2vec Text by the Bay 2016

word2vec objective

The skip-gram negative-sampling model

L = σ(c·w) + σ(-c·w_neg) + … + σ(-c·w_neg)

Multiple Negative

Discriminate positive from negative samples
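
The objective with several negative samples can be sketched as follows. Two things here are assumptions rather than content from the slides: the sketch sums log σ terms (the form usually optimized in practice) instead of the bare σ terms written above, and it draws negatives uniformly from a toy vocabulary, whereas word2vec implementations typically sample from a smoothed word-frequency distribution. All names and vectors are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def log_sigmoid(x):
    """Numerically stable log(sigma(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def sgns_objective(c, w, negatives):
    """log sigma(c.w) + sum over negatives of log sigma(-c.w_neg); higher is better."""
    positive = log_sigmoid(dot(c, w))
    negative = sum(log_sigmoid(-dot(c, w_neg)) for w_neg in negatives)
    return positive + negative


def sample_negatives(vectors, k, rng):
    """Draw k random word vectors from the vocabulary (uniform sampling is an assumption)."""
    words = rng.choices(sorted(vectors), k=k)
    return [vectors[word] for word in words]


# Toy vectors, made up for illustration
vectors = {"snow": [0.9, 0.1], "desert": [-0.8, 0.2], "camel": [-0.9, -0.1]}
canada = [1.0, 0.0]

good = sgns_objective(canada, vectors["snow"], [vectors["desert"], vectors["camel"]])
bad = sgns_objective(canada, vectors["desert"], [vectors["snow"], vectors["snow"]])
negatives = sample_negatives(vectors, k=5, rng=random.Random(0))
print(good, bad)
```

The objective is larger when the observed pair is aligned and the sampled negatives are not, which is the discrimination the slide describes.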

Page 37: lda2vec Text by the Bay 2016

word2vec PMI

The SGNS Model

L = σ(c·w) + σ(-c·w_neg)

c_i·w_j = PMI(M_ij) - log k

…is extremely similar to matrix factorization!

Levy & Goldberg 2014

Page 38: lda2vec Text by the Bay 2016

word2vec PMI

The SGNS Model

L = σ(c·w) + σ(-c·w_neg)

c_i·w_j = PMI(M_ij) - log k

…is extremely similar to matrix factorization!

‘traditional’ NLP

Levy & Goldberg 2014

Page 39: lda2vec Text by the Bay 2016

word2vec PMI

The SGNS Model

L = σ(c·w) + Σσ(-c·w)

c_i·w_j = log [ (#(c_i,w_j)/n) / (k · (#(w_j)/n) · (#(c_i)/n)) ]

‘traditional’ NLP

Levy & Goldberg 2014

Page 40: lda2vec Text by the Bay 2016

word2vec PMI

The SGNS Model

L = σ(c·w) + Σσ(-c·w)

c_i·w_j = log [ (popularity of c,w) / (k · (popularity of c) · (popularity of w)) ]

‘traditional’ NLP

Levy & Goldberg 2014

Page 41: lda2vec Text by the Bay 2016

word2vec PMI

99% of word2vec is counting.

And you can count words in SQL

Page 42: lda2vec Text by the Bay 2016

word2vec PMI

Count how many times you saw c·w

Count how many times you saw c

Count how many times you saw w

Page 43: lda2vec Text by the Bay 2016

word2vec PMI

…and this takes ~5 minutes to compute on a single core. Computing the SVD is a routine call in any standard math library.
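
The counting recipe above can be sketched directly. This is an illustrative version only: the function name `shifted_pmi` and the toy pairs are assumptions, and it computes PMI(c, w) - log k for observed pairs, following the formula attributed to Levy & Goldberg 2014 on the earlier slides.

```python
import math
from collections import Counter


def shifted_pmi(pairs, k=1):
    """Return {(c, w): PMI(c, w) - log k} for every observed (c, w) pair."""
    pair_counts = Counter(pairs)              # times you saw c and w together
    c_counts = Counter(c for c, _ in pairs)   # times you saw c
    w_counts = Counter(w for _, w in pairs)   # times you saw w
    n = len(pairs)
    result = {}
    for (c, w), n_cw in pair_counts.items():
        ratio = (n_cw / n) / (k * (c_counts[c] / n) * (w_counts[w] / n))
        result[(c, w)] = math.log(ratio)
    return result


# Toy co-occurrence pairs, made up for illustration
pairs = (
    [("canada", "snow")] * 3
    + [("canada", "desert")]
    + [("sahara", "desert")] * 3
    + [("sahara", "snow")]
)
pmi = shifted_pmi(pairs)
shifted = shifted_pmi(pairs, k=5)
print(pmi[("canada", "snow")], pmi[("canada", "desert")])
```

To get dense word vectors from these values, one would arrange them into a matrix and factorize it, for example with a library SVD routine such as `numpy.linalg.svd`; that step is left out here to keep the sketch dependency-free.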

Page 44: lda2vec Text by the Bay 2016

word2vec

Page 45: lda2vec Text by the Bay 2016
Page 46: lda2vec Text by the Bay 2016
Page 47: lda2vec Text by the Bay 2016
Page 48: lda2vec Text by the Bay 2016
Page 49: lda2vec Text by the Bay 2016
Page 50: lda2vec Text by the Bay 2016
Page 51: lda2vec Text by the Bay 2016
Page 52: lda2vec Text by the Bay 2016
Page 53: lda2vec Text by the Bay 2016
Page 54: lda2vec Text by the Bay 2016
Page 55: lda2vec Text by the Bay 2016
Page 56: lda2vec Text by the Bay 2016
Page 57: lda2vec Text by the Bay 2016
Page 58: lda2vec Text by the Bay 2016
Page 59: lda2vec Text by the Bay 2016
Page 60: lda2vec Text by the Bay 2016
Page 61: lda2vec Text by the Bay 2016

ITEM_3469 + ‘Pregnant’

Page 62: lda2vec Text by the Bay 2016

+ ‘Pregnant’

Page 63: lda2vec Text by the Bay 2016

= ITEM_701333 = ITEM_901004 = ITEM_800456

Page 64: lda2vec Text by the Bay 2016
Page 65: lda2vec Text by the Bay 2016

What about LDA?

Page 66: lda2vec Text by the Bay 2016

LDA on Client Item Descriptions

Page 67: lda2vec Text by the Bay 2016

LDA on Item Descriptions (with Jay)

Page 68: lda2vec Text by the Bay 2016

LDA on Item Descriptions (with Jay)

Page 69: lda2vec Text by the Bay 2016

LDA on Item Descriptions (with Jay)

Page 70: lda2vec Text by the Bay 2016

lda vs word2vec

Page 71: lda2vec Text by the Bay 2016

Bayesian Graphical Model

ML Neural Model

Page 72: lda2vec Text by the Bay 2016

word2vec is local: one word predicts a nearby word

“I love finding new designer brands for jeans”

Page 73: lda2vec Text by the Bay 2016

“I love finding new designer brands for jeans”

But text is usually organized.

Page 74: lda2vec Text by the Bay 2016

“I love finding new designer brands for jeans”

But text is usually organized.

Page 75: lda2vec Text by the Bay 2016

“I love finding new designer brands for jeans”

In LDA, documents globally predict words.

doc 7681

Page 76: lda2vec Text by the Bay 2016

typical word2vec vector

[ -0.75, -1.25, -0.55, -0.12, +2.2]

All real values

typical LDA document vector

[ 0%, 9%, 78%, 11%]

All sum to 100%

Page 77: lda2vec Text by the Bay 2016

5D word2vec vector

[ -0.75, -1.25, -0.55, -0.12, +2.2]

Dense
All real values
Dimensions relative

5D LDA document vector

[ 0%, 9%, 78%, 11%]

Sparse
All sum to 100%
Dimensions are absolute

Page 78: lda2vec Text by the Bay 2016

100D word2vec vector

[ -0.75, -1.25, -0.55, -0.27, -0.94, 0.44, 0.05, 0.31 … -0.12, +2.2]

Dense
All real values
Dimensions relative

100D LDA document vector

[ 0%, 0%, 0%, 0%, 0%, … 0%, 9%, 78%, 11%]

Sparse
All sum to 100%
Dimensions are absolute

Page 79: lda2vec Text by the Bay 2016

100D word2vec vector

[ -0.75, -1.25, -0.55, -0.27, -0.94, 0.44, 0.05, 0.31 … -0.12, +2.2]

Similar in 100D ways (very flexible)

100D LDA document vector

[ 0%, 0%, 0%, 0%, 0%, … 0%, 9%, 78%, 11%]

Similar in fewer ways (more interpretable)

+mixture +sparse

Page 80: lda2vec Text by the Bay 2016

can we do both? lda2vec

Page 81: lda2vec Text by the Bay 2016

word2vec predicts locally: one word predicts a nearby word

[Diagram: skip grams from sentences (“Lufthansa is a German airline and when”, with “German” highlighted); a word vector with #hidden units entries (-1.9 0.85 -0.6 -0.3 -0.5) feeds the negative sampling loss.]

Page 82: lda2vec Text by the Bay 2016

Document vector predicts a word from a global context

[Diagram: lda2vec architecture. The document weight (#topics entries: 0.34 -0.1 0.17) becomes the document proportion (41% 26% 34%), which multiplies the topic matrix (#topics x #hidden units) to give the document vector (-0.7 -0.4 -0.7 -0.3 -0.3). The document vector plus the word vector (-1.9 0.85 -0.6 -0.3 -0.5) gives the context vector (-2.6 0.45 -1.3 -0.6 -0.8), which feeds the negative sampling loss over skip grams from sentences such as “Lufthansa is a German airline and when” (“German” highlighted).]

Page 83: lda2vec Text by the Bay 2016

We’re missing mixtures & sparsity!

[Diagram: lda2vec architecture (document weight, document proportion, topic matrix, document vector, word vector, context vector, negative sampling loss); “German” highlighted in “Lufthansa is a German airline and when”.]

Page 84: lda2vec Text by the Bay 2016

We’re missing mixtures & sparsity!

[Diagram: lda2vec architecture (document weight, document proportion, topic matrix, document vector, word vector, context vector, negative sampling loss).]

Page 85: lda2vec Text by the Bay 2016

Now it’s a mixture.

[Diagram: lda2vec architecture (document weight, document proportion, topic matrix, document vector, word vector, context vector, negative sampling loss).]

Page 86: lda2vec Text by the Bay 2016

[Diagram: lda2vec architecture (document weight, document proportion, topic matrix, document vector, word vector, context vector, negative sampling loss); the document weight (#topics entries: 0.34 -0.1 0.17) is highlighted.]

Trinitarian, baptismal, Pentecostals, Bede, schismatics, excommunication

Page 87: lda2vec Text by the Bay 2016

[Diagram: lda2vec architecture (document weight, document proportion, topic matrix, document vector, word vector, context vector, negative sampling loss); the document weight (#topics entries: 0.34 -0.1 0.17) is highlighted.]

topic 1 = “religion”

Trinitarian, baptismal, Pentecostals, Bede, schismatics, excommunication

Page 88: lda2vec Text by the Bay 2016

[Diagram: lda2vec architecture (document weight, document proportion, topic matrix, document vector, word vector, context vector, negative sampling loss); the document weight (#topics entries: 0.34 -0.1 0.17) is highlighted.]

Milosevic, absentee, Indonesia, Lebanese, Israelis, Karadzic

Page 89: lda2vec Text by the Bay 2016

[Diagram: lda2vec architecture (document weight, document proportion, topic matrix, document vector, word vector, context vector, negative sampling loss); the document weight (#topics entries: 0.34 -0.1 0.17) is highlighted.]

topic 2 = “politics”

Milosevic, absentee, Indonesia, Lebanese, Israelis, Karadzic

Page 90: lda2vec Text by the Bay 2016

[Diagram: lda2vec architecture (document weight, document proportion, topic matrix, document vector, word vector, context vector, negative sampling loss).]

Page 91: lda2vec Text by the Bay 2016

[Diagram: lda2vec architecture (document weight, document proportion, topic matrix, document vector, word vector, context vector, negative sampling loss).]

Page 92: lda2vec Text by the Bay 2016

[Diagram: lda2vec architecture (document weight, document proportion, topic matrix, document vector, word vector, context vector, negative sampling loss).]

Page 93: lda2vec Text by the Bay 2016

Sparsity!

[Diagram: lda2vec architecture (document weight, document proportion, topic matrix, document vector, word vector, context vector, negative sampling loss).]

Document proportion over time:

t=0: 34% 32% 34%
t=10: 41% 26% 34%
t=∞: 99% 1% 0%
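
The arithmetic in the diagram can be sketched as below. This reading is an assumption based on the diagram labels: the document weight is passed through a softmax to give the document proportion, the proportion mixes the rows of the topic matrix into a document vector, and the document vector is added to the word vector to form the context vector. The function names and the topic matrix values are hypothetical; the weights and the word, document and context vectors are the numbers shown on the slide.

```python
import math


def softmax(weights):
    """Turn real-valued weights into proportions that sum to 1."""
    m = max(weights)
    exps = [math.exp(x - m) for x in weights]
    total = sum(exps)
    return [e / total for e in exps]


def document_vector(doc_weights, topic_matrix):
    """Mix topic vectors (rows of topic_matrix) by the document proportions."""
    proportions = softmax(doc_weights)
    n_hidden = len(topic_matrix[0])
    return [
        sum(p * topic[h] for p, topic in zip(proportions, topic_matrix))
        for h in range(n_hidden)
    ]


def context_vector(word_vector, doc_vector):
    """Context vector = word vector + document vector."""
    return [a + b for a, b in zip(word_vector, doc_vector)]


doc_weights = [0.34, -0.1, 0.17]      # document weight from the slide
proportions = softmax(doc_weights)    # roughly 40% / 26% / 34%

# Hypothetical topic matrix (3 topics x 5 hidden units), not the slide's values
topic_matrix = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
]
doc_vec = document_vector(doc_weights, topic_matrix)

word_vec = [-1.9, 0.85, -0.6, -0.3, -0.5]        # word vector from the slide
slide_doc_vec = [-0.7, -0.4, -0.7, -0.3, -0.3]   # document vector from the slide
ctx = context_vector(word_vec, slide_doc_vec)
print(proportions, ctx)
```

The softmax of the slide's weights lands close to the 41% 26% 34% shown, and the sum of the slide's word and document vectors reproduces its context vector, which is consistent with this reading. The sketch does not model how training drives the proportions toward sparsity.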

Page 94: lda2vec Text by the Bay 2016

@chrisemoody

lda2vec.com

Page 95: lda2vec Text by the Bay 2016

+ API docs + Examples + GPU + Tests

@chrisemoody

lda2vec.com

Page 96: lda2vec Text by the Bay 2016

@chrisemoody

Example: Hacker News comments

Topics: http://nbviewer.jupyter.org/github/cemoody/lda2vec/blob/master/examples/hacker_news/lda2vec/lda2vec.ipynb

Word vectors: https://github.com/cemoody/lda2vec/blob/master/examples/hacker_news/lda2vec/word_vectors.ipynb

Page 97: lda2vec Text by the Bay 2016

@chrisemoody

lda2vec.com

If you want…

human-interpretable doc topics, use LDA.

machine-usable word-level features, use word2vec.

if you like to experiment a lot, and have topics over user / doc / region / etc. features, use lda2vec (and you have a GPU).

Page 98: lda2vec Text by the Bay 2016

?

@chrisemoody

Multithreaded Stitch Fix

Page 99: lda2vec Text by the Bay 2016

@chrisemoody

lda2vec.com

Page 101: lda2vec Text by the Bay 2016

Can we model topics to sentences? lda2lstm

“PS! Thank you for such an awesome idea”

doc_id=1846

@chrisemoody

Page 102: lda2vec Text by the Bay 2016

Can we model topics to sentences? lda2lstm

“PS! Thank you for such an awesome idea”

doc_id=1846

Can we model topics to images? lda2ae

TJ Torres

@chrisemoody

Page 103: lda2vec Text by the Bay 2016

and now for something completely crazy

4. Fun Stuff

Page 104: lda2vec Text by the Bay 2016

translation

(using just a rotation matrix)

Mikolov 2013

[Diagram labels: English, Spanish, Matrix Rotation]

Page 105: lda2vec Text by the Bay 2016

deepwalk

Perozzi et al 2014

learn word vectors from sentences

“The fox jumped over the lazy dog”

[Diagram labels: six v_OUT vectors]

‘words’ are graph vertices
‘sentences’ are random walks on the graph

word2vec
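
The idea of treating random walks as sentences can be sketched as follows. This is an illustration only, not the DeepWalk implementation: the adjacency-list graph, the function names and the walk parameters are all assumptions. Each walk is a list of vertex names that could be fed to a word2vec trainer in place of a tokenized sentence.

```python
import random


def random_walk(graph, start, length, rng):
    """Walk from start, moving to a uniformly chosen neighbor at each step."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:  # dead end: stop early
            break
        walk.append(rng.choice(neighbors))
    return walk


def walks_as_sentences(graph, walks_per_node, length, seed=0):
    """Generate walks from every vertex; each walk plays the role of a sentence."""
    rng = random.Random(seed)
    sentences = []
    for _ in range(walks_per_node):
        for node in sorted(graph):
            sentences.append(random_walk(graph, node, length, rng))
    return sentences


# Toy undirected graph as an adjacency list, made up for illustration
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
sentences = walks_as_sentences(graph, walks_per_node=2, length=5)
print(sentences[0])
```

Vertices that are close in the graph tend to appear in the same walks, so they end up with similar vectors, in the same way that co-occurring words do.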

Page 106: lda2vec Text by the Bay 2016

Playlists at Spotify

context

sequence learning

‘words’ are song indices
‘sentences’ are playlists

Page 107: lda2vec Text by the Bay 2016

Playlists at Spotify

context

Erik Bernhardsson

Great performance on ‘related artists’

Page 108: lda2vec Text by the Bay 2016

Fixes at Stitch Fix

sequence learning

Let’s try:
‘words’ are items
‘sentences’ are fixes

Page 109: lda2vec Text by the Bay 2016

Fixes at Stitch Fix

context

sequence learning

Learn similarity between styles because they co-occur

Learn ‘coherent’ styles

Page 110: lda2vec Text by the Bay 2016

Fixes at Stitch Fix?

context

sequence learning

Got lots of structure!

Page 111: lda2vec Text by the Bay 2016

Fixes at Stitch Fix?

context

sequence learning

Page 112: lda2vec Text by the Bay 2016

Fixes at Stitch Fix?

context

sequence learning

Nearby regions are consistent ‘closets’

Page 113: lda2vec Text by the Bay 2016

?

@chrisemoody

Multithreaded Stitch Fix

Page 114: lda2vec Text by the Bay 2016

context dependent

Levy & Goldberg 2014

Australian scientist discovers star with telescope

context +/- 2 words

Page 115: lda2vec Text by the Bay 2016

context dependent

context

Australian scientist discovers star with telescope

Levy & Goldberg 2014

Page 116: lda2vec Text by the Bay 2016

context dependent

context

Australian scientist discovers star with telescope

context

Levy & Goldberg 2014

Page 117: lda2vec Text by the Bay 2016

context dependent

context

BoW DEPS

topically-similar vs ‘functionally’ similar

Levy & Goldberg 2014

Page 118: lda2vec Text by the Bay 2016

?

@chrisemoody

Multithreaded Stitch Fix

Page 119: lda2vec Text by the Bay 2016
Page 120: lda2vec Text by the Bay 2016

Crazy Approaches

Paragraph Vectors (Just extend the context window)

Content dependency (Change the window grammatically)

Social word2vec (deepwalk) (Sentence is a walk on the graph)

Spotify (Sentence is a playlist of song_ids)

Stitch Fix (Sentence is a shipment of five items)

Page 121: lda2vec Text by the Bay 2016
Page 122: lda2vec Text by the Bay 2016

CBOW

“The fox jumped over the lazy dog”

Guess the word given the context

~20x faster. (this is the alternative.)

[Diagram: six v_IN vectors and one v_OUT vector]

SkipGram

“The fox jumped over the lazy dog”

Guess the context given the word

Better at syntax. (this is the one we went over)

[Diagram: one v_IN vector and six v_OUT vectors]
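
The difference between the two setups shows up in how training examples are built from the same sentence. The sketch below is an illustration under assumed names (`window`, `cbow_examples`, `skipgram_examples`) and an assumed window size of 2; it only constructs the examples and does not train anything.

```python
def window(tokens, i, size):
    """Words within `size` positions of index i, excluding the word itself."""
    lo = max(0, i - size)
    return tokens[lo:i] + tokens[i + 1:i + size + 1]


def cbow_examples(tokens, size=2):
    """(context words, target word): guess the word given the context."""
    return [(window(tokens, i, size), tokens[i]) for i in range(len(tokens))]


def skipgram_examples(tokens, size=2):
    """(input word, context word): guess the context given the word."""
    return [
        (tokens[i], ctx)
        for i in range(len(tokens))
        for ctx in window(tokens, i, size)
    ]


tokens = "the fox jumped over the lazy dog".split()
cbow = cbow_examples(tokens)
skip = skipgram_examples(tokens)
print(cbow[3])
print(len(cbow), len(skip))
```

CBOW makes one prediction per position while skip-gram makes one per (word, context) pair, which may be part of why the slide describes CBOW as faster.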

Page 123: lda2vec Text by the Bay 2016

lda2vec

v_DOC = a·v_topic1 + b·v_topic2 + …

Let’s make v_DOC sparse

Page 124: lda2vec Text by the Bay 2016

lda2vec

Let’s say that v_DOC adds:

v_DOC = topic0 + topic1

This works! 😀 But v_DOC isn’t as interpretable as the topic vectors. 😔

Page 125: lda2vec Text by the Bay 2016

lda2vec

softmax(v_OUT * (v_IN + v_DOC))
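
Read literally, the expression on this slide scores every output word against the sum of the input word vector and the document vector, then normalizes with a softmax. The sketch below illustrates that reading with made-up two-dimensional vectors; the function names and the toy vocabulary are assumptions, and the earlier slides train with a negative sampling loss rather than a full softmax.

```python
import math


def softmax(scores):
    """Normalize real-valued scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_word_distribution(v_in, v_doc, output_vectors):
    """P(word) = softmax over words of v_OUT[word] . (v_IN + v_DOC)."""
    context = [a + b for a, b in zip(v_in, v_doc)]
    words = sorted(output_vectors)
    scores = [
        sum(o * c for o, c in zip(output_vectors[word], context))
        for word in words
    ]
    return dict(zip(words, softmax(scores)))


# Toy vectors, made up for illustration
v_in = [1.0, 0.0]    # input word vector
v_doc = [0.0, 1.0]   # document vector
output_vectors = {"airline": [1.0, 1.0], "dog": [0.0, 0.0], "fox": [-1.0, -1.0]}

probs = predict_word_distribution(v_in, v_doc, output_vectors)
print(probs)
```

Because the document vector is added before scoring, two documents with different topic mixtures can assign different probabilities to the same nearby word.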

Page 126: lda2vec Text by the Bay 2016
Page 127: lda2vec Text by the Bay 2016

lda2vec

theory of lda2vec

Page 128: lda2vec Text by the Bay 2016

lda2vec

pyLDAvis of lda2vec

Page 129: lda2vec Text by the Bay 2016

LDA Results

context

History

I loved every choice in this fix!! Great job!

Great Stylist Perfect

Page 130: lda2vec Text by the Bay 2016

LDA Results

context

History

Body Fit

My measurements are 36-28-32. If that helps. I like wearing some clothing that is fitted.

Very hard for me to find pants that fit right.

Page 131: lda2vec Text by the Bay 2016

LDA Results

context

History

Sizing

Really enjoyed the experience and the pieces, sizing for tops was too big.

Looking forward to my next box!

Excited for next

Page 132: lda2vec Text by the Bay 2016

LDA Results

context

History

Almost Bought

It was a great fix. Loved the two items I kept and the three I sent back were close!

Perfect

Page 133: lda2vec Text by the Bay 2016

All of the following ideas will change what ‘words’ and ‘context’ represent.

Page 134: lda2vec Text by the Bay 2016

paragraph vector

What about summarizing documents?

On the day he took office, President Obama reached out to America’s enemies, offering in his first inaugural address to extend a hand if you are willing to unclench your fist. More than six years later, he has arrived at a moment of truth in testing that

Page 135: lda2vec Text by the Bay 2016

On the day he took office, President Obama reached out to America’s enemies, offering in his first inaugural address to extend a hand if you are willing to unclench your fist. More than six years later, he has arrived at a moment of truth in testing that

The framework nuclear agreement he reached with Iran on Thursday did not provide the definitive answer to whether Mr. Obama’s audacious gamble will pay off. The fist Iran has shaken at the so-called Great Satan since 1979 has not completely relaxed.

paragraph vector

Normal skipgram extends C words before, and C words after.

[Diagram labels: IN, OUT, OUT]

Page 136: lda2vec Text by the Bay 2016

On the day he took office, President Obama reached out to America’s enemies, offering in his first inaugural address to extend a hand if you are willing to unclench your fist. More than six years later, he has arrived at a moment of truth in testing that

The framework nuclear agreement he reached with Iran on Thursday did not provide the definitive answer to whether Mr. Obama’s audacious gamble will pay off. The fist Iran has shaken at the so-called Great Satan since 1979 has not completely relaxed.

paragraph vector

A document vector simply extends the context to the whole document.

[Diagram labels: IN, OUT, OUT, OUT, OUT, doc_1347]

Page 137: lda2vec Text by the Bay 2016

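sentence search

from gensim.models import Doc2Vec

fn = "item_document_vectors"
model = Doc2Vec.load(fn)
# Find the entries most similar to 'pregnant'.
# (Newer gensim releases expose this as model.wv.most_similar.)
matches = model.most_similar('pregnant')
# Keep only the entries whose tag contains 'SENT_' (the sentences).
matches = list(filter(lambda x: 'SENT_' in x[0], matches))

# ['...I am currently 23 weeks pregnant...',
#  '...I'm now 10 weeks pregnant...',
#  '...not showing too much yet...',
#  '...15 weeks now. Baby bump...',
#  '...6 weeks post partum!...',
#  '...12 weeks postpartum and am nursing...',
#  '...I have my baby shower that...',
#  '...am still breastfeeding...',
#  '...I would love an outfit for a baby shower...']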

Page 138: lda2vec Text by the Bay 2016
Page 139: lda2vec Text by the Bay 2016
Page 140: lda2vec Text by the Bay 2016
Page 141: lda2vec Text by the Bay 2016
Page 142: lda2vec Text by the Bay 2016
Page 143: lda2vec Text by the Bay 2016
Page 144: lda2vec Text by the Bay 2016
