Date post: 05-Jul-2020
CS 6956: Deep Learning for NLP Convolutional Neural Networks for Language

Features from text

Example: Sentiment classification

The goal: Is the sentiment of a sentence positive, negative or neutral?

    The film is fun and is host to some truly excellent sequences

Approach: Train a multiclass classifier. What features?
Some words and n-grams are informative, while some are not.

We need to:
1. Identify informative local information
2. Aggregate it into a fixed-size vector representation

Convolutional Neural Networks

Designed to:
1. Identify local predictors in a larger input
2. Pool them together to create a feature representation
3. And possibly repeat this in a hierarchical fashion

In the NLP context, it helps identify predictive n-grams for a task

Overview

• Convolutional Neural Networks: A brief history
• The two operations in a CNN
  – Convolution
  – Pooling
• Convolution + Pooling as a building block
• CNNs in NLP
• Recurrent networks vs Convolutional networks

Convolutional Neural Networks: Brief history

First arose in the context of vision

• Hubel and Wiesel, 1950s/60s: The mammalian visual cortex contains neurons that respond to small regions and specific patterns in the visual field
  – David H. Hubel and Torsten Wiesel: Nobel Prize in Physiology or Medicine, 1981

• Fukushima 1980, Neocognitron: Directly inspired by Hubel and Wiesel
  – Key idea: locality of features in the visual cortex is important; integrate them locally and propagate them to further layers
  – Two operations:
    1. a convolutional layer that reacts to specific patterns, and
    2. a down-sampling layer that aggregates information

• LeCun 1989-today, Convolutional Neural Network: A supervised version
  – Related to convolution kernels in computer vision
  – Success with handwriting recognition and other computer vision tasks

• Has become better over recent years with more data, computation
  – Krizhevsky et al 2012: Image classification on ImageNet
  – The de facto feature extractor for computer vision

Convolutional Neural Networks: Brief history

• Introduced to NLP by Collobert et al, 2011
  – Used as a feature extraction system for semantic role labeling

• Since then, several other applications such as sentiment analysis, question classification, etc.
  – Kalchbrenner et al 2014, Kim 2014

CNN terminology

Shows its computer vision and signal processing origins

• Filter
  – A function that transforms an input matrix/vector into a scalar feature
  – A filter is a learned feature detector (also called a feature map)

• Channel
  – In computer vision, color images have red, blue and green channels
  – In general, a channel represents a "view of the input" that captures information about an input independent of other channels
    • For example, different kinds of word embeddings could be different channels
    • Channels could themselves be produced by previous convolutional layers

• Receptive field
  – The region of the input that a filter currently focuses on


What is a convolution?

Let's see this using an example for vectors.

We will generalize this to matrices and beyond, but the general idea remains the same.

What is a convolution?

An example using vectors

A vector x:           2 3 1 3 2 1
Filter f of size n:   1 2 1          (here, the filter size is 3)

$\text{output}_i = \sum_{j=1}^{n} f_j \cdot x_{i+j-\lceil n/2 \rceil}$

The filter moves across the vector. At each position, the output is the dot product of the filter with a slice of the vector of that size.

With a padding of 0 at the beginning and at the end, the output is also a vector:

Padded x:   0 2 3 1 3 2 1 0
Output:       7 9 8 9 8 4

For example, the first output is 1·0 + 2·2 + 1·3 = 7 and the last is 1·2 + 2·1 + 1·0 = 4.
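The sliding dot product above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the lecture: the function name `convolve1d` and the choice to split the zero padding evenly between the two ends are assumptions made here so that the output has the same length as the input.

```python
def convolve1d(x, f):
    """Slide filter f over x, zero-padding so the output has len(x) entries."""
    n = len(f)
    left = (n - 1) // 2      # zeros added at the beginning (assumed split)
    right = n - 1 - left     # zeros added at the end
    padded = [0] * left + list(x) + [0] * right
    # At each position, take the dot product of f with a window of size n.
    return [
        sum(f[j] * padded[i + j] for j in range(n))
        for i in range(len(x))
    ]


x = [2, 3, 1, 3, 2, 1]
f = [1, 2, 1]
output = convolve1d(x, f)
print(output)  # [7, 9, 8, 9, 8, 4]
```

The printed values match the output vector built up step by step in the slides.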

What is a convolution?

The same idea applies to matrices as well

[Figure: an input matrix, a filter, and the result of convolution]

The filter moves across the matrix. At each position, the output is the dot product of the filter with a slice of the matrix of that size. And so on…
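As a rough sketch of the matrix case, the same sliding dot product can be written with nested lists. The function name `convolve2d`, the 4x4 input and the 2x2 filter are all hypothetical values chosen for illustration, and the sketch assumes no padding and a stride of 1.

```python
def convolve2d(x, f):
    """Slide filter f over matrix x (assumed: no padding, stride 1)."""
    fh, fw = len(f), len(f[0])
    out_h = len(x) - fh + 1
    out_w = len(x[0]) - fw + 1
    return [
        [
            # Dot product of the filter with the slice of x at (r, c).
            sum(f[a][b] * x[r + a][c + b] for a in range(fh) for b in range(fw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


x = [[4 * r + c for c in range(4)] for r in range(4)]  # hypothetical 4x4 input
f = [[1, 0], [0, 1]]                                   # hypothetical 2x2 filter
result = convolve2d(x, f)
print(len(result), len(result[0]))  # 3 3
print(result[0][0], result[2][2])   # 5 25
```

Without padding, a 2x2 filter fits in 3 positions along each dimension of a 4x4 input, so the result is 3x3.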


Pooling: An aggregation operation

• A convolution produces a vector/matrix that captures properties of each window

• Pooling combines this information to produce a down-sampled version of the vector/matrix
  – Typically using the maximum or the average value within a window

• Intuition
  – A filter is a feature detector that discovers how well each window matches a feature of interest
  – The most important features should be recognized regardless of their location
  – Answer: Pool the information from different windows together

What is pooling?

An example using vectors

A vector x:              2 3 1 3 2 1
Filter f of size n:      1 2 1
Output of convolution:   7 9 8 9 8 4

The pooling operation can be applied using a window as well

Example 1: Max pooling with window size 3
    9 9 9 9

Example 2: Average pooling with window size 3
    8 8.67 8.33 7

Example 3: Max pooling with window size = length of the vector
    9

Important note: There are no learned parameters for the pooling operation. It is a deterministic operation.
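The three pooling examples can be reproduced with a short sketch. The helper names `pool` and `mean` are introduced here for illustration, and a stride of 1 with no padding is assumed. Note that the last window of size 3 is (9, 8, 4), whose maximum is 9.

```python
def pool(values, window, op):
    """Apply op (e.g. max) to every window of the given size (assumed stride 1)."""
    return [op(values[i:i + window]) for i in range(len(values) - window + 1)]


def mean(xs):
    return sum(xs) / len(xs)


conv_out = [7, 9, 8, 9, 8, 4]

max_pooled = pool(conv_out, 3, max)               # Example 1
avg_pooled = pool(conv_out, 3, mean)              # Example 2
global_max = pool(conv_out, len(conv_out), max)   # Example 3

print(max_pooled)                          # [9, 9, 9, 9]
print([round(v, 2) for v in avg_pooled])   # [8.0, 8.67, 8.33, 7.0]
print(global_max)                          # [9]
```

Nothing in `pool` is learned: given the same input and window size it always returns the same output.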

Typical kinds of pooling

• Max pooling
  – Take the maximum value of the results of the convolution

• Average pooling
  – Uses average to pool instead of max

• K-max pooling
  – Take the top K values (for a fixed K)
  – Generalization of max pooling
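A small sketch of k-max pooling follows. The slides only say to take the top K values; keeping those values in their original order (rather than sorted by size) is an assumption made here, and the function name `k_max_pool` is hypothetical.

```python
def k_max_pool(values, k):
    """Keep the k largest values, in their original order (assumed)."""
    # Indices of the k largest entries; ties go to the earlier position.
    top = sorted(range(len(values)), key=lambda i: (-values[i], i))[:k]
    return [values[i] for i in sorted(top)]


conv_out = [7, 9, 8, 9, 8, 4]
print(k_max_pool(conv_out, 1))  # [9]
print(k_max_pool(conv_out, 3))  # [9, 8, 9]
```

With k = 1 this reduces to max pooling over the whole vector, which is the sense in which k-max pooling generalizes it.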


Convolution + Pooling = one layer

• Input: a matrix. Convolution will operate over windows of this matrix.
  – This could be extended to general tensors as well

• The window size defines the receptive field
  – We will refer to the window as $x_i$

• A filter is defined by some parameters (that will be learned)
  – In general, a matrix $u$ of the same shape as the window and a bias $b$

• Convolution: Iterate over all windows and apply the filter
  – Typically has a non-linearity $g$ (e.g. ReLU)

  $p_i = g(u \cdot x_i + b)$

• Pooling: Aggregate the $p_i$'s into a down-sampled version, sometimes a single number

• Typically, there are many filters, each of which is pooled independently
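The layer described above can be sketched as follows. This is an illustrative sketch under stated assumptions: windows are taken over consecutive rows of the input matrix, the non-linearity g is ReLU, and pooling is a max over all windows, giving one number per filter. The function name, the input and the filter values are all hypothetical.

```python
def relu(z):
    return max(0.0, z)


def conv_pool_layer(rows, filters, window):
    """rows: the input matrix as a list of equal-length rows.
    filters: list of (u, b) pairs, where u has the same shape as a window.
    Returns one max-pooled feature per filter."""
    features = []
    for u, b in filters:
        p = []
        for i in range(len(rows) - window + 1):
            x_i = rows[i:i + window]  # the window x_i
            dot = sum(u[a][c] * x_i[a][c]
                      for a in range(window) for c in range(len(x_i[a])))
            p.append(relu(dot + b))   # p_i = g(u . x_i + b)
        features.append(max(p))       # each filter is pooled independently
    return features


rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]  # hypothetical 4x2 input
filters = [
    ([[1.0, 0.0], [0.0, 1.0]], 0.0),      # hypothetical filter 1
    ([[-1.0, -1.0], [-1.0, -1.0]], 0.5),  # hypothetical filter 2
]
features = conv_pool_layer(rows, filters, window=2)
print(features)  # [2.0, 0.0]
```

The output has one entry per filter, regardless of how many rows the input has.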

Hyperparameters

• Filter sizes: How big should the filter be?
  – Typically 3x3, 5x5, etc.

• Stride: How does the filter move along the input?
  – It could skip some steps, or not.

• How many filters should there be?

• Padding: Should there be padding or not? If so, should the padding be zeros or random?

• How big should the pooling window be?

• What kind of pooling: Average, Max, L2 norm?
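To see how filter size, stride and padding interact, the number of filter positions along one dimension can be computed directly. The helper below is a sketch written for this purpose; its name is hypothetical, and `padding` is assumed to mean the number of padded entries added on each side.

```python
def conv_output_length(input_len, filter_size, stride=1, padding=0):
    """Number of positions a filter can take along one dimension.
    padding is the number of entries added on EACH side (assumed)."""
    return (input_len + 2 * padding - filter_size) // stride + 1


print(conv_output_length(6, 3, stride=1, padding=1))  # 6
print(conv_output_length(6, 3))                       # 4
print(conv_output_length(6, 3, stride=2))             # 2
```

The first call corresponds to the vector example earlier in the slides: a length-6 input, a filter of size 3 and one zero on each side give 6 outputs.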

Example: LeNet

An example network that uses these building blocks

[Figure: the LeNet-5 architecture]

LeNet-5 was proposed by LeCun et al. in 1998 for handwriting recognition. It had several levels of convolution-pooling.

Page 68: Convolutional Neural Networks for Language · Convolutional Neural Networks for Language Features from text Example: Sentiment classification The goal: Is the sentiment of a sentence

Overview

• ConvolutionalNeuralNetworks:Abriefhistory

• ThetwooperationsinaCNN– Convolution– Pooling

• Convolution+Poolingasabuildingblock

• CNNsinNLP

• RecurrentnetworksvsConvolutionalnetworks


Convolutional Neural Networks in NLP

• Goal: To represent a sequence of words as a feature vector

• Approach:
  – Represent the sequence of words by sequence(s) of embeddings
  – Convolve with several filters
  – Pool across the sequence to get a feature vector of a fixed dimensionality


Convolutional Neural Networks in NLP

Goal: To represent a sequence of words as a feature vector

Suppose we want to classify this sentence: I ate cake today


Convolutional Neural Networks in NLP

[Figure: each word of "I ate cake today" is mapped to its word embedding]


Convolutional Neural Networks in NLP

[Figure: the word embeddings of "I ate cake today", with padding vectors added to the sequence]


Convolutional Neural Networks in NLP

Apply a filter

[Figure, built up over several slides: a filter is applied to the padded word embeddings of "I ate cake today"]


Convolutional Neural Networks in NLP

Convolution with one filter

[Figure: the outputs of one filter over the padded word embeddings of "I ate cake today"]


Convolutional Neural Networks in NLP

Convolution with one filter

Pooling across the sentence (often max pooling) to get one feature

[Figure: the outputs of one filter over "I ate cake today" are pooled into a single number]


Convolutional Neural Networks in NLP

Convolution with many filters

Pooling across the sentence (often max pooling) gets a feature vector

There can be several filters (sometimes called kernels, or feature maps)

[Figure: the outputs of many filters over "I ate cake today" are pooled into one feature vector]


Convolution + pooling example

1. Each word is embedded into a 2-dimensional vector; the window concatenates them

2. A 6x3 filter with a tanh non-linearity

3. Max pooling over each dimension to produce a 3-dimensional vector
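The three steps can be traced with concrete shapes. The sketch below assumes a window of three words (so that three 2-dimensional embeddings concatenate to the 6 rows of the 6x3 filter), no padding, and random untrained parameters; the sentence is the sentiment example from the start of the lecture, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM, WINDOW, N_FILTERS = 2, 3, 3   # assumed: 3 words x 2 dims = 6

words = "the film is fun and is host to some truly excellent sequences".split()
vocab = {w: i for i, w in enumerate(sorted(set(words)))}

E = rng.normal(size=(len(vocab), EMB_DIM))            # random embeddings (untrained)
U = rng.normal(size=(WINDOW * EMB_DIM, N_FILTERS))    # the 6x3 filter matrix
b = np.zeros(N_FILTERS)

def conv_pool(words):
    # Step 1: embed each word, then concatenate the embeddings in each window.
    X = E[[vocab[w] for w in words]]                  # (n, 2)
    windows = np.stack([X[i:i + WINDOW].reshape(-1)
                        for i in range(len(words) - WINDOW + 1)])  # (n - 2, 6)
    # Step 2: apply the 6x3 filter with a tanh non-linearity.
    P = np.tanh(windows @ U + b)                      # (n - 2, 3)
    # Step 3: max pooling over each of the 3 dimensions.
    return P, P.max(axis=0)                           # pooled: (3,)

P, pooled = conv_pool(words)
print(P.shape, pooled.shape)
```

Each column of `U` acts as one filter, so the 6x3 matrix is a compact way of writing three filters that share the same window.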


Examples of convolution + pooling

Think of convolutions as feature extractors

[Figure from Goldberg 2017, two panels]

• A narrow convolution (i.e. without any padding) in the vector concatenation notation

• A wide convolution (i.e. with padding) in the vector stacking notation
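The difference between the two variants, and between the two notations, can be checked numerically. In this sketch the wide convolution pads with k - 1 zero vectors on each side; that amount and the use of zeros are assumptions, since the slide only says "with padding". The helper name `windows` is illustrative.

```python
import numpy as np

def windows(X, k, wide=False):
    """Stack all windows of k consecutive rows of X, shape (num_windows, k, d).

    narrow: no padding, giving n - k + 1 windows (requires n >= k)
    wide:   k - 1 zero rows added at each end (an assumed padding scheme),
            giving n + k - 1 windows
    """
    if wide:
        pad = np.zeros((k - 1, X.shape[1]))
        X = np.vstack([pad, X, pad])
    return np.stack([X[i:i + k] for i in range(X.shape[0] - k + 1)])

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 2))            # e.g. 4 words with 2-dimensional embeddings
narrow = windows(X, 3)                 # 4 - 3 + 1 = 2 windows
wide = windows(X, 3, wide=True)        # 4 + 3 - 1 = 6 windows
print(narrow.shape, wide.shape)

# Stacking notation: the filter is a k x d matrix multiplied elementwise and summed.
# Concatenation notation: window and filter are both flattened to length k * d.
u = rng.normal(size=(3, 2))
stacked = np.sum(u * narrow[0])
concatenated = u.reshape(-1) @ narrow[0].reshape(-1)
print(np.isclose(stacked, concatenated))
```

The two notations describe the same computation; they differ only in whether a window is drawn as a small matrix or as one long vector.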


Overview

• Convolutional Neural Networks: A brief history

• The two operations in a CNN
  – Convolution
  – Pooling

• Convolution + Pooling as a building block

• CNNs in NLP

• Recurrent networks vs Convolutional networks


Features from text

• If we want to classify text, we need to represent it in some feature space

• We have (at least) two ways to get features from text using a neural network:
  – Recurrent Neural Networks
  – Convolutional Neural Networks


RNNs vs CNNs

• RNNs model non-Markovian dependencies
  – Can look at (effectively) infinite windows around a target word
  – Can capture sequential patterns in such windows

• CNNs capture informative n-grams
  – Also gappy n-grams
  – But also account for local ordering patterns

• How do they compare?
  – Both are trained end-to-end with a task loss
  – RNNs (specifically, BiRNNs) are more popular today…

• … but this can change
  – CNNs allow for more parallelism, and so may be better suited for certain hardware/software improvements
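The parallelism point can be made concrete by comparing the data dependencies of the two computations. The sketch below uses a simple Elman-style recurrence and a single convolution layer with random weights; both are illustrative stand-ins chosen for this example, not the specific models used in the course.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h, k = 6, 4, 5, 3                # sentence length, embedding dim, output dim, window
X = rng.normal(size=(n, d))

W_x = rng.normal(size=(d, h))
W_h = rng.normal(size=(h, h))
U = rng.normal(size=(k * d, h))

def rnn_states(X):
    """Each state depends on the previous one, so the loop is inherently sequential."""
    state = np.zeros(h)
    states = []
    for x_t in X:
        state = np.tanh(x_t @ W_x + state @ W_h)
        states.append(state)
    return np.stack(states)

def conv_outputs(X):
    """Each window is independent of the others, so all of them are
    handled by a single matrix product."""
    windows = np.stack([X[i:i + k].reshape(-1) for i in range(len(X) - k + 1)])
    return np.tanh(windows @ U)

S = rnn_states(X)
C = conv_outputs(X)
print(S.shape, C.shape)
```

Any row of `C` can be computed without the others, whereas row t of `S` cannot be computed before row t - 1; this independence is what makes the convolution easy to spread across parallel hardware.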


RNNs and CNNs as building blocks

Think of them as Lego bricks for constructing larger architectures

Both are computation graphs. Mix and match with other computation graphs to create larger neural networks

General tools that can be used with other ideas that we have seen and will see

E.g.: contextual embeddings, attention, etc.
