
Bayesian models of cross-situational word learning Michael C. Frank Noah Goodman Josh Tenenbaum...

Date post: 05-Jan-2016
Bayesian models of cross-situational word learning. Michael C. Frank, Noah Goodman, Josh Tenenbaum (MIT). Thanks to Kathy Hirsh-Pasek and Roberta Golinkoff for valuable discussion. Also thanks to Vikash Mansinghka, Ted Gibson, tedlab, and cocosci for comments and the Jacob Javits Foundation for funding.

Bayesian models of cross-situational word learning

Michael C. Frank, Noah Goodman, Josh Tenenbaum (MIT)

Thanks to Kathy Hirsh-Pasek and Roberta Golinkoff for valuable discussion. Also thanks to Vikash Mansinghka, Ted Gibson, tedlab, and cocosci for comments and the Jacob Javits Foundation for funding.


Word-learning in action


The problem of word learning


words: “blue rings”
objects: rings, big bird

words: “and green rings”
objects: rings, big bird

words: “and yellow rings”
objects: rings, big bird

words: “Bigbird! Do you want to hold the rings?”
objects: big bird

In any one situation, children hear many words and see many objects


One possible solution


Apply a cross-situational strategy to learn mappings (but this is harder than it looks)
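To make "harder than it looks" concrete, here is a minimal sketch of the most naive cross-situational strategy: for each word, intersect the object sets of every situation in which it occurs. This is an illustration only, not the model presented in the talk. The tokenisation of the four example utterances and the name `candidate_referents` are assumptions made for this sketch.

```python
# Toy transcription of the four example situations (tokenisation is an assumption).
SITUATIONS = [
    (["blue", "rings"], {"rings", "big bird"}),
    (["and", "green", "rings"], {"rings", "big bird"}),
    (["and", "yellow", "rings"], {"rings", "big bird"}),
    (["bigbird", "do", "you", "want", "to", "hold", "the", "rings"], {"big bird"}),
]


def candidate_referents(situations):
    """For each word, intersect the object sets of all situations it occurs in."""
    candidates = {}
    for words, objects in situations:
        for word in set(words):
            if word in candidates:
                candidates[word] &= objects
            else:
                candidates[word] = set(objects)
    return candidates


candidates = candidate_referents(SITUATIONS)
print(candidates["rings"])  # the word "rings" ends up paired with big bird only
print(candidates["blue"])   # a single exposure leaves both objects as candidates
```

Because the rings are not among the coded objects in the last situation, strict intersection pairs the word "rings" with big bird. A learner needs some tolerance for noise and for words that do not refer at all.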


The problem of word learning


Techniques for cross-situational word learning
• Deductive inference: Siskind (1996)
• Translation model: Yu, Ballard, & Aslin (2005), Yu & Ballard (in press)


Outline

• Some facts of word learning
  – Mutual exclusivity
  – Fast-mapping
  – Use of social cues

• Our model: Bayesian word-learner

• Extension: Learning social cues

• Experimental coverage


Three facts of word learning

Mutual exclusivity
By 18-24 months, children will map a novel word onto a novel referent (Markman, 1992; Mervis & Bertrand, 1994).
“Give me the dax!”

Fast mapping
Three- and four-year-olds can learn words from one situation (Carey, 1978; Markson & Bloom, 1997).
“This one is a koba!”

Use of social cues
By 18 months, children distinguish referents from one another using social cues (Hollich, Hirsh-Pasek, & Golinkoff, 2001).
“Look at the modi!”


Outline

• The facts of word learning

• Our model: Bayesian word-learner
  – Model
  – Corpus
  – Comparison models
  – Results

• Extension: Learning social cues

• Experimental coverage


Generative model

[Figure: graphical model. Nodes: lexicon (l), objects (O), intention (I: things you intend to refer to), words (W); a plate labeled “situations”. Node annotations: unobserved, observed, observed.]


Generative model: example

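[Figure: the graphical model filled in for one example situation. Labels: words (W): “look pretty”; objects (O); intention (I); lexicon (l); a plate labeled “situations”. Example values shown: ball; ball, bike.]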
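The slides name the variables of the generative model but not its distributions. As a rough sketch only, one way to sample a single situation could look like the following. Everything numeric here is an assumption made for illustration: the intention is a uniformly random subset of the objects present, and each word is either a referring word for an intended object (with hypothetical probability `p_referential`) or a filler word. The function name `sample_situation` and its parameters are likewise hypothetical.

```python
import random


def sample_situation(objects, lexicon, filler_words, n_words,
                     p_referential=0.5, rng=random):
    """Sample (intention, words) for one situation under the assumed process."""
    # Intention: each object present is included with probability 0.5,
    # i.e. a uniform draw over subsets of the objects (assumption).
    intention = [o for o in objects if rng.random() < 0.5]
    # Words that the lexicon maps onto some intended object.
    referring = [w for w, o in lexicon if o in intention]
    words = []
    for _ in range(n_words):
        if referring and rng.random() < p_referential:
            words.append(rng.choice(referring))      # word used referentially
        else:
            words.append(rng.choice(filler_words))   # non-referring word
    return intention, words


rng = random.Random(0)
lexicon = [("ball", "ball"), ("bike", "bike")]
intention, words = sample_situation(
    ["ball", "bike"], lexicon, ["look", "pretty"], n_words=3, rng=rng
)
print(intention, words)
```

Learning runs this process in reverse: given only the words and objects of many situations, infer the lexicon and the intentions that most plausibly produced them.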


Inference

Bayes’ rule

Parsimony prior on lexicons

Inference technique
• Stochastic search with simulated tempering
• Data-driven proposals drawn from the mutual information of word-object pairings
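The formulas on this slide did not survive extraction, so the following is only a sketch of the general recipe: score a lexicon by log prior plus log likelihood (Bayes' rule up to a constant) and search over lexicons stochastically. Several pieces are assumptions and not taken from the talk: the exponential form of the parsimony prior, the toy likelihood, the parameters `alpha` and `gamma`, and all function names. The search is a plain fixed-temperature Metropolis search with uniform toggle proposals, swapped in for the simulated tempering and mutual-information-driven proposals named above.

```python
import math
import random


def log_prior(lexicon, alpha=1.0):
    # Parsimony prior, assumed form: P(L) proportional to exp(-alpha * |L|).
    return -alpha * len(lexicon)


def log_likelihood(lexicon, corpus, vocab_size, gamma=0.8):
    # Toy stand-in likelihood (assumption): if some lexicon pair applies to the
    # objects present, a word is referential with probability gamma.
    total = 0.0
    for words, objects in corpus:
        active = [(w, o) for (w, o) in lexicon if o in objects]
        for word in words:
            p = 1.0 / vocab_size
            if active:
                hits = sum(1 for (w, _) in active if w == word)
                p = (1 - gamma) / vocab_size + gamma * hits / len(active)
            total += math.log(p)
    return total


def log_posterior(lexicon, corpus, vocab_size, alpha=1.0):
    # Bayes' rule up to a constant: log P(L | corpus) = log P(corpus | L) + log P(L).
    return log_prior(lexicon, alpha) + log_likelihood(lexicon, corpus, vocab_size)


def search(corpus, steps=2000, alpha=1.0, seed=0):
    """Metropolis search over lexicons; each proposal toggles one word-object pair."""
    rng = random.Random(seed)
    vocab = sorted({w for words, _ in corpus for w in words})
    objs = sorted({o for _, objects in corpus for o in objects})
    current = frozenset()
    current_score = log_posterior(current, corpus, len(vocab), alpha)
    best, best_score = current, current_score
    for _ in range(steps):
        pair = (rng.choice(vocab), rng.choice(objs))
        proposal = current ^ {pair}  # add the pair if absent, remove it if present
        score = log_posterior(proposal, corpus, len(vocab), alpha)
        if score >= current_score or rng.random() < math.exp(score - current_score):
            current, current_score = proposal, score
            if score > best_score:
                best, best_score = proposal, score
    return best, best_score


corpus = [
    (["look", "ball"], {"ball"}),
    (["nice", "ball"], {"ball", "bike"}),
    (["look", "bike"], {"bike"}),
]
best, best_score = search(corpus, steps=200)
print(sorted(best), best_score)
```

With only three toy situations the parsimony prior can outweigh the evidence for any pairing, so the best lexicon may well be empty. That is the intended effect of such a prior: a pair has to earn its place.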


Corpus

• 2x10 min clips from CHILDES-Rollins

• Interaction between mom and infant (~6mo)

• 2528 word tokens of 420 words in 623 sentences

• 24 objects, all toys


Model comparison

• Co-occurrence frequency

• Point-wise mutual information

$$\mathrm{MI}(W, O) = \frac{p(W, O)}{p(W)\,p(O)}$$

• Translation model, based on IBM model 1 (Yu & Ballard, in press)
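As a sketch of the first two baselines, the snippet below counts word-object co-occurrences and computes the mutual-information ratio in the form written on the slide (no logarithm). How the probabilities are estimated is an assumption: here each is the fraction of situations in which the word, the object, or the pair occurs. The function name and the toy data are hypothetical.

```python
from collections import Counter


def cooccurrence_and_mi(situations):
    """Return (pair_counts, mi) for a list of (words, objects) situations."""
    n = len(situations)
    word_counts, object_counts, pair_counts = Counter(), Counter(), Counter()
    for words, objects in situations:
        ws, objs = set(words), set(objects)
        word_counts.update(ws)
        object_counts.update(objs)
        pair_counts.update((w, o) for w in ws for o in objs)
    # MI(W, O) = p(W, O) / (p(W) p(O)), with probabilities as situation frequencies.
    mi = {
        (w, o): (c / n) / ((word_counts[w] / n) * (object_counts[o] / n))
        for (w, o), c in pair_counts.items()
    }
    return pair_counts, mi


toy = [
    (["look", "ball"], ["ball"]),
    (["ball"], ["ball", "bike"]),
    (["look", "bike"], ["bike"]),
]
pair_counts, mi = cooccurrence_and_mi(toy)
print(pair_counts[("ball", "ball")], mi[("ball", "ball")], mi[("look", "ball")])
```

A ratio above 1 means the word and object co-occur more often than independence would predict. Either score can be thresholded to produce a lexicon for comparison.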


Results: model comparison

precision = (correct pairs in lexicon) / (total pairs in lexicon)

recall = (correct pairs in lexicon) / (total correct pairs)
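The two measures can be computed directly from a learned lexicon and a gold-standard lexicon, each treated as a set of (word, object) pairs. This is a minimal sketch; the function name and the example pairs are hypothetical, and returning 0.0 for an empty lexicon is a convention chosen here.

```python
def precision_recall(learned, gold):
    """Precision and recall of a learned lexicon against a gold-standard lexicon."""
    learned, gold = set(learned), set(gold)
    correct = len(learned & gold)  # correct pairs in lexicon
    precision = correct / len(learned) if learned else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


learned = {("book", "book"), ("baby", "book")}
gold = {("book", "book"), ("hat", "hat")}
print(precision_recall(learned, gold))
```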


Results: intuitive analysis

Best lexicon found by search

Word      Object
baby      book
bigbird   bird
bird      rattle
birdie    duck
book      book
oink      pig
hand      hand
hat       hat
meow      kitty
moocow    cow
oink      pig
on        ring
ring      ring
sheep     sheep

Most likely intentions
[Figures not recoverable.]

Also: unlike baseline models, our model is extremely extensible


Outline

• The facts of word learning

• Our model: Bayesian word-learner

• Extension: Learning social cues
  – Corpus
  – Model
  – Preliminary results

• Experimental coverage


Social corpus coding

Coded social cues for each utterance: infant’s hands, eyes, mouth, and touch; mom’s hands, eyes, and touch


How it works

        I’m looking   Mom looking
Ball    0             1
Bike    1             0
…       …             …
Bag     0             0

Each cue could be caused by base rate or by relevance

Noisy-OR process with two parameters: base rate and relevance
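A standard noisy-OR with these two parameters can be sketched as follows. The parameterisation is an assumption and not taken from the slides: a cue fires for a non-intended object with probability equal to the base rate, and for an intended object unless both the base-rate cause and the relevance cause fail. The numbers 0.1 and 0.8 and the function names are hypothetical.

```python
import math


def cue_probability(intended, base_rate, relevance):
    """P(cue = 1 for an object) under a noisy-OR of base rate and relevance."""
    if intended:
        return 1.0 - (1.0 - base_rate) * (1.0 - relevance)
    return base_rate


def cue_log_likelihood(cues, intention, base_rate, relevance):
    """Log-probability of one cue column (object -> 0/1) given an intention."""
    total = 0.0
    for obj, value in cues.items():
        p_on = cue_probability(obj in intention, base_rate, relevance)
        total += math.log(p_on if value else 1.0 - p_on)
    return total


# The "Mom looking" column from the table above.
mom_looking = {"Ball": 1, "Bike": 0, "Bag": 0}
ll_ball = cue_log_likelihood(mom_looking, {"Ball"}, base_rate=0.1, relevance=0.8)
ll_bike = cue_log_likelihood(mom_looking, {"Bike"}, base_rate=0.1, relevance=0.8)
print(ll_ball, ll_bike)
```

With a relevant cue, the intention {Ball} explains the column far better than {Bike}. With relevance near zero the two intentions become nearly indistinguishable, which is why relevance has to be learned per cue.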


Social model framework

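[Figure: graphical model with social cues. Nodes: social cues (S), relevance and base rate of social cues (r, b), lexicon (l), objects (O), intention (I: things you intend to refer to), words (W); a plate labeled “situations”. Node annotation: unobserved.]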


Preliminary Results


Model finds appropriate features

Social features allow finding intent in situations without referential words


Outline

• The facts of word learning

• Our model: Bayesian word-learner

• Extension: Learning social cues

• Experimental coverage
  – Mutual exclusivity
  – Fast-mapping
  – Use of social cues


Mutual exclusivity

model shows soft mutual exclusivity


Fast-mapping

model can fast-map: learn a word from a single instance

ruled out on account of “light syntax”: penalty for using a referring word in a non-referring way


Use of social cues

model can learn word meanings based on social cues alone


Conclusions

• Bayesian model of cross-situational word-learning
  – Performed best over a corpus
  – Allows parsing of sentences and interpretation of speaker’s intent

• Social model
  – Model can learn which social cues are relevant to reference

• Experimental coverage
  – Mutual exclusivity
  – Fast-mapping
  – Learning words for social cues

