Page 1: Tensorflow CNN turorial - 國立臺灣大學speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2017/Lecture/... · 2017. 3. 25. · 2017/03/10. Lenet-5 [LeCunet al., 1998] ... ConvolutionLayer

TensorFlow CNN tutorial

2017/03/10

Page 2:

LeNet-5 [LeCun et al., 1998]

Page 3:

Today’s example

Image source

Page 4:

The slides are from:
1. “Lecture 13: Neural networks for machine vision”, Dr. Richard E. Turner
2. Lectures 7 & 12 in Stanford CS231n
Page 5:

Convolution Layer

[Figure: a 32x32x3 image, with width 32, height 32, and depth 3]

Page 6:

Convolution Layer

[Figure: a 32x32x3 image and a 5x5x3 filter]

Convolve the filter with the image, i.e. “slide over the image spatially, computing dot products”

Page 7:

Convolution Layer

[Figure: a 32x32x3 image and a 5x5x3 filter]

Filters always extend the full depth of the input volume

Page 8:

Convolution Layer

[Figure: a 32x32x3 image and a 5x5x3 filter]

1 number: the result of taking a dot product between the filter and a small 5x5x3 chunk of the image (i.e. a 5*5*3 = 75-dimensional dot product + bias)
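
As a rough sketch of that single number, the snippet below flattens one 5x5x3 chunk and one 5x5x3 filter into 75 values each and takes their dot product plus a bias. The pixel values, weights, and bias are made up for illustration, and `filter_response` is a hypothetical helper name, not something from the slides.

```python
def filter_response(chunk, weights, bias):
    """Dot product between a flattened image chunk and a flattened filter, plus bias."""
    assert len(chunk) == len(weights)
    return sum(p * w for p, w in zip(chunk, weights)) + bias

n = 5 * 5 * 3          # number of values in one 5x5x3 chunk
chunk = [1.0] * n      # made-up chunk: every pixel value is 1.0
weights = [0.5] * n    # made-up filter: every weight is 0.5
bias = 0.25            # made-up bias

response = filter_response(chunk, weights, bias)
print(n, response)
```

Each spatial position of the filter produces one such number; sliding the filter over the image collects them into an activation map.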

Pages 9-13:

[Figure sequence: a 3x3 filter sliding across a 7x7 input, one position at a time]

7x7 input (spatially), assume 3x3 filter

=> 5x5 output
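
The sliding step can be sketched in plain Python. This is a minimal illustration assuming stride 1, no padding, and a single channel; the input values and the all-ones filter are made up, and `conv2d_valid` is a hypothetical helper name.

```python
def conv2d_valid(image, kernel):
    """Slide a square kernel over a square image with stride 1 and no padding."""
    n, f = len(image), len(kernel)
    out_size = n - f + 1
    output = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            # Dot product between the kernel and the f x f window at (i, j).
            acc = 0
            for di in range(f):
                for dj in range(f):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output

image = [[r * 7 + c for c in range(7)] for r in range(7)]  # made-up 7x7 input
kernel = [[1] * 3 for _ in range(3)]                       # made-up 3x3 filter

result = conv2d_valid(image, kernel)
print(len(result), len(result[0]))
```

The filter fits at 7 - 3 + 1 = 5 positions along each axis, which is where the 5x5 output comes from.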

Page 14:

Convolution Layer

[Figure: a 5x5x3 filter convolved over a 32x32x3 image produces one 28x28x1 activation map]

convolve (slide) over all spatial locations

Page 15:

Convolution Layer

[Figure: a second 5x5x3 filter convolved over the 32x32x3 image produces a second 28x28x1 activation map]

convolve (slide) over all spatial locations

consider a second, green filter

Page 16:

Convolution Layer

For example, if we had 6 5x5 filters, we’ll get 6 separate activation maps:

[Figure: a 32x32x3 input and six 28x28 activation maps]

We stack these up to get a “new image” of size 28x28x6!

Pages 17-18:

Preview: ConvNet is a sequence of Convolutional Layers, interspersed with activation functions

[Figure: 32x32x3 input -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6 -> CONV, ReLU (e.g. 10 5x5x6 filters) -> 24x24x10 -> CONV, ReLU -> ….]

Page 19:

A 32x32 input convolved repeatedly with 5x5 filters shrinks the volumes spatially! (32 -> 28 -> 24 ...). Shrinking too fast is not good; it doesn’t work well.

[Figure: 32x32x3 input -> CONV, ReLU (e.g. 6 5x5x3 filters) -> 28x28x6 -> CONV, ReLU (e.g. 10 5x5x6 filters) -> 24x24x10 -> CONV, ReLU -> ….]
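
The shrinking can be traced with a small shape calculation. This is only a sketch, assuming stride 1, no padding, and square filters; `conv_output_shape` is a hypothetical helper, not a TensorFlow function.

```python
def conv_output_shape(input_shape, num_filters, filter_size):
    """Shape after a stride-1, unpadded CONV layer with square filters."""
    height, width, _depth = input_shape
    # Each filter spans the full input depth, so the output depth is the filter count.
    return (height - filter_size + 1, width - filter_size + 1, num_filters)

shape = (32, 32, 3)
shapes = [shape]
for num_filters in (6, 10):  # filter counts taken from the slide
    shape = conv_output_shape(shape, num_filters, filter_size=5)
    shapes.append(shape)

print(shapes)
```

Every 5x5 layer removes 4 pixels from each spatial dimension, so a deep stack of unpadded layers runs out of spatial extent quickly.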

Pages 20-24:

A closer look at spatial dimensions:

[Figure sequence: a 3x3 filter sliding across a 7x7 input, one position at a time]

7x7 input (spatially), assume 3x3 filter

=> 5x5 output

Pages 25-27:

A closer look at spatial dimensions:

[Figure sequence: a 3x3 filter sliding across a 7x7 input, two positions at a time]

7x7 input (spatially), assume 3x3 filter applied with stride 2

=> 3x3 output!

Pages 28-29:

A closer look at spatial dimensions:

7x7 input (spatially), assume 3x3 filter applied with stride 3?

Doesn’t fit! Cannot apply a 3x3 filter on a 7x7 input with stride 3.
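
The three stride cases can be checked with the usual output-size rule, (N - F) / stride + 1, which only gives a valid size when the division is exact. The sketch below assumes square inputs and filters and no padding; `output_size` is a hypothetical helper name.

```python
def output_size(n, f, stride):
    """Output size for an n x n input and f x f filter, or None if the filter doesn't fit."""
    span = n - f
    if span < 0 or span % stride != 0:
        return None  # the filter cannot tile the input evenly at this stride
    return span // stride + 1

sizes = {stride: output_size(7, 3, stride) for stride in (1, 2, 3)}
print(sizes)
```

For stride 3 the span 7 - 3 = 4 is not divisible by 3, which is the "doesn't fit" case above.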

Pages 30-31:

In practice: Common to zero pad the border

[Figure: a 7x7 input surrounded by a border of zeros]

e.g. input 7x7, 3x3 filter applied with stride 1, pad with 1 pixel border => what is the output?

7x7 output!

In general, common to see CONV layers with stride 1, filters of size FxF, and zero-padding with (F-1)/2 (will preserve size spatially), e.g.
F = 3 => zero pad with 1
F = 5 => zero pad with 2
F = 7 => zero pad with 3
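
A small sketch of the padding rule, assuming stride 1 and odd filter sizes. `zero_pad` and `padded_output_size` are hypothetical helpers written for this illustration, and the all-ones input is made up.

```python
def zero_pad(image, pad):
    """Surround a 2-D list with `pad` rows and columns of zeros on every side."""
    width = len(image[0]) + 2 * pad
    top = [[0] * width for _ in range(pad)]
    body = [[0] * pad + list(row) + [0] * pad for row in image]
    bottom = [[0] * width for _ in range(pad)]
    return top + body + bottom

def padded_output_size(n, f, pad):
    """Stride-1 output size after zero-padding an n x n input by `pad` on each side."""
    return n + 2 * pad - f + 1

image = [[1] * 7 for _ in range(7)]  # made-up 7x7 input
padded = zero_pad(image, 1)
sizes = {f: padded_output_size(7, f, (f - 1) // 2) for f in (3, 5, 7)}
print(len(padded), len(padded[0]), sizes)
```

Padding by (F-1)/2 adds F-1 pixels in total per axis, which exactly cancels the F-1 pixels a stride-1 filter would otherwise remove.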

Page 32:

Pooling layer
- makes the representations smaller and more manageable
- operates over each activation map independently

Page 33:

MAX POOLING

Single depth slice (axes x and y):

1 1 2 4
5 6 7 8
3 2 1 0
1 2 3 4

max pool with 2x2 filters and stride 2 =>

6 8
3 4
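
The example on this slide can be reproduced with a few lines of plain Python. This is a minimal sketch for a single depth slice with no padding; `max_pool` here is a hypothetical helper, not `tf.nn.max_pool`.

```python
def max_pool(slice_2d, size=2, stride=2):
    """Max pooling over one depth slice, without padding."""
    rows, cols = len(slice_2d), len(slice_2d[0])
    return [
        [
            # Largest value inside the size x size window whose corner is (r, c).
            max(slice_2d[r + dr][c + dc] for dr in range(size) for dc in range(size))
            for c in range(0, cols - size + 1, stride)
        ]
        for r in range(0, rows - size + 1, stride)
    ]

depth_slice = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
pooled = max_pool(depth_slice)
print(pooled)  # [[6, 8], [3, 4]]
```

In a full layer the same operation is applied to every activation map separately, so the depth is unchanged while width and height are halved.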

Page 34:

TensorFlow implementation

• Weight Initialization
• Convolution and Pooling
• Convolution layer
• Fully connected layer
• Readout Layer

• Reference and image source: https://www.tensorflow.org/get_started/mnist/pros
(See section ‘Build a Multilayer Convolutional Network’)

Page 35:

Input (placeholder)

tf.placeholder

x is a placeholder for the input image. y is the label in one-hot representation, so the second dimension of y is equal to the number of classes.

None indicates that the first dimension, corresponding to the batch size, can be any size.
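
To make the one-hot labels and the role of None concrete, here is a plain-Python sketch that does not use TensorFlow. The class count of 10 is an assumption (the referenced tutorial uses MNIST digits), and `one_hot` and `matches` are hypothetical helpers.

```python
def one_hot(label, num_classes):
    """One-hot vector: 1.0 at the label's index, 0.0 elsewhere."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec

def matches(shape, spec):
    """True if `shape` fits `spec`, where None in `spec` accepts any size."""
    return len(shape) == len(spec) and all(s is None or s == d for d, s in zip(shape, spec))

num_classes = 10              # assumption: 10 classes
y_spec = (None, num_classes)  # mirrors a placeholder shape of [None, 10]

batch = [one_hot(label, num_classes) for label in (3, 0, 7)]  # made-up labels
print(batch[0])
print(matches((len(batch), num_classes), y_spec), matches((64, num_classes), y_spec))
```

Batches of 3 and of 64 labels both fit the same spec because only the second dimension is fixed.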

Page 36:

Weight Initialization

tf.truncated_normal

These variables will be initialized when the user runs ‘tf.global_variables_initializer’. Until then they are just nodes in a graph without any value.
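
For intuition about what a truncated normal initializer produces, the sketch below draws normal samples and re-draws any value that falls more than two standard deviations from the mean. It uses Python's `random` module rather than TensorFlow; the standard deviation of 0.1, the seed, and the weight count are assumptions chosen for illustration.

```python
import random

def truncated_normal(count, mean=0.0, stddev=0.1, seed=None):
    """Draw `count` normal samples, re-drawing any beyond 2 standard deviations."""
    rng = random.Random(seed)
    values = []
    while len(values) < count:
        v = rng.gauss(mean, stddev)
        if abs(v - mean) <= 2 * stddev:
            values.append(v)
    return values

# Enough weights for one 5x5x3 filter (assumed size, matching the earlier slides).
weights = truncated_normal(5 * 5 * 3, stddev=0.1, seed=0)
print(len(weights), max(abs(w) for w in weights) <= 2 * 0.1)
```

Cutting off the tails keeps every initial weight small, so no unit starts with an unusually large weight.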

Page 37:

Convolution and Pooling

tf.nn.conv2d

tf.nn.max_pool

Strides is 4-d, following the NHWC format (Num_samples x Height x Width x Channels).

Recall strides and padding. padding=‘SAME’ means padding is applied so that, with stride 1, the output size is the same as the input size (in general the output size is the input size divided by the stride, rounded up).

Conv2d pads with zeros and max_pool pads with -inf.
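
The effect of padding=‘SAME’ along one spatial dimension can be sketched as follows. The input size of 28 and the kernel and stride pairs are assumptions for illustration (a 5x5 convolution with stride 1 and a 2x2 pooling with stride 2), and `same_padding` is a hypothetical helper, not a TensorFlow call.

```python
import math

def same_padding(in_size, kernel, stride):
    """Return (output size, padding before, padding after) for one dimension."""
    out_size = math.ceil(in_size / stride)
    total = max((out_size - 1) * stride + kernel - in_size, 0)
    before = total // 2
    # Any odd leftover pixel of padding goes on the trailing side.
    return out_size, before, total - before

conv_case = same_padding(28, kernel=5, stride=1)
pool_case = same_padding(28, kernel=2, stride=2)
odd_case = same_padding(7, kernel=2, stride=2)
print(conv_case, pool_case, odd_case)
```

With stride 1 the size is preserved; with stride 2 it is halved, and padding is only added when the window would otherwise run past the edge.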

Pages 38-39:

Convolution layer

tf.reshape

See how the code creates a model by wrapping layers. Be careful with the shape of each layer. -1 means the size of that dimension is computed so that the total size remains constant.

Page 40:

Reshape

tf.reshape

For example:

tensor ‘t’ is [[1,2],[3,4],[5,6],[7,8]], so t has shape [4,2]

(1) reshape(t, [2,4]) => [[1,2,3,4],[5,6,7,8]]

(2) reshape(t, [-1,4]) => [[1,2,3,4],[5,6,7,8]]

-1 would be computed and becomes ‘2’
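
How the -1 gets resolved can be sketched in plain Python. `infer_shape` and the 2-D-only `reshape` below are hypothetical helpers that mimic the rule described on the slide; they are not the TensorFlow implementation.

```python
def infer_shape(total_size, shape):
    """Replace a single -1 in `shape` so the product equals `total_size`."""
    if shape.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    known = 1
    for d in shape:
        if d != -1:
            known *= d
    if -1 in shape:
        if total_size % known != 0:
            raise ValueError("total size is not divisible by the known dimensions")
        return [total_size // known if d == -1 else d for d in shape]
    if known != total_size:
        raise ValueError("shape does not match total size")
    return list(shape)

def reshape(nested, shape):
    """Reshape a 2-D nested list into another 2-D shape (row-major order)."""
    flat = [v for row in nested for v in row]
    rows, cols = infer_shape(len(flat), shape)
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

t = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(infer_shape(8, [-1, 4]))
print(reshape(t, [-1, 4]))
```

The tensor holds 8 values and the known dimension is 4, so the -1 becomes 8 / 4 = 2.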

Page 41:

Fully connected layer

Flatten all the maps and connect them with a fully connected layer. Again, be careful with the shape.
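
The shape bookkeeping for this step can be sketched with a tiny made-up volume. The sizes, values, and the helpers `flatten` and `dense` are assumptions for illustration; the real layer sizes depend on the network on the slide, which is not preserved in this transcript.

```python
def flatten(volume):
    """Flatten a height x width x channels nested list into one vector."""
    return [v for row in volume for pixel in row for v in pixel]

def dense(vector, weights, biases):
    """Fully connected layer: one dot product plus bias per output unit."""
    return [sum(x * w for x, w in zip(vector, unit)) + b for unit, b in zip(weights, biases)]

height, width, channels, units = 2, 2, 3, 4  # made-up small sizes
volume = [[[1.0] * channels for _ in range(width)] for _ in range(height)]

vector = flatten(volume)
# One row of weights per output unit; each row must be as long as the flattened vector.
weights = [[0.5] * len(vector) for _ in range(units)]
outputs = dense(vector, weights, [0.0] * units)
print(len(vector), outputs)
```

The flattened length is height * width * channels, and that number fixes the first dimension of the fully connected weight matrix.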

Page 42:

Readout Layer

Use a layer to match the output size. Done!

Page 43:

Training and Evaluation (optional)

Page 44:

Recommendation

• Search for each function, and you’ll know everything that’s going on.

