Lecture 10 Recap
Transcript
Page 1: Lecture 10 Recap - GitHub Pages

Lecture 10 Recap

1I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 2: Lecture 10 Recap - GitHub Pages

LeNet

• Digit recognition: 10 classes

• Conv -> Pool -> Conv -> Pool -> Conv -> FC

• As we go deeper: Width and Height decrease, Number of Filters increases

2

60k parameters

I2DL: Prof. Niessner, Prof. Leal-Taixé
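
As a reference for the LeNet recap above, here is a minimal PyTorch sketch (ours, not from the lecture) of the Conv -> Pool -> Conv -> Pool -> Conv -> FC pattern; the exact layer sizes follow the classic LeNet-5 and are assumptions, as is the 32x32 input size.

import torch
import torch.nn as nn

# LeNet-style sketch: Conv -> Pool -> Conv -> Pool -> Conv -> FC, roughly 60k parameters.
class LeNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(),     # 32x32 -> 28x28
            nn.AvgPool2d(2),                               # 28x28 -> 14x14
            nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(),    # 14x14 -> 10x10
            nn.AvgPool2d(2),                               # 10x10 -> 5x5
            nn.Conv2d(16, 120, kernel_size=5), nn.Tanh(),  # 5x5 -> 1x1
        )
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(120, 84), nn.Tanh(), nn.Linear(84, num_classes)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

print(sum(p.numel() for p in LeNet().parameters()))  # ~62k, i.e. on the order of 60k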

Page 3: Lecture 10 Recap - GitHub Pages

AlexNet

• Softmax for 1000 classes

3

[Krizhevsky et al., NIPS’12] AlexNet

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 4: Lecture 10 Recap - GitHub Pages

VGGNet

• Striving for simplicity
– Conv -> Pool -> Conv -> Pool -> Conv -> FC
– Conv = 3x3, s = 1, same padding; Maxpool = 2x2, s = 2

• As we go deeper: Width and Height decrease, Number of Filters increases

• Called VGG-16: 16 layers that have weights

• Large, but its simplicity makes it appealing

4

[Simonyan et al., ICLR’15] VGGNet

138M parameters

I2DL: Prof. Niessner, Prof. Leal-Taixé
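
The VGG pattern described above can be written as a small helper; a hedged sketch (the channel numbers follow the published VGG-16 configuration, the helper name is ours):

import torch.nn as nn

def vgg_block(in_ch, out_ch, num_convs):
    # 3x3 convs, stride 1, "same" padding, followed by 2x2 max pooling with stride 2.
    layers = []
    for i in range(num_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, stride=1, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

# Width/height halve after every block while the number of filters grows.
features = nn.Sequential(
    vgg_block(3, 64, 2), vgg_block(64, 128, 2), vgg_block(128, 256, 3),
    vgg_block(256, 512, 3), vgg_block(512, 512, 3),
)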

Page 5: Lecture 10 Recap - GitHub Pages

Residual Block

• Two layers

5

(Figure: residual block. Input → Linear → Linear, with a skip connection from the input $x_{l-1}$ to the output $x_{l+1}$.)

Plain two-layer block: $x_{l+1} = f(W_{l+1} x_l + b_{l+1})$

With skip connection: $x_{l+1} = f(W_{l+1} x_l + b_{l+1} + x_{l-1})$

I2DL: Prof. Niessner, Prof. Leal-Taixé
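
A minimal sketch of the two-layer residual block above with fully connected layers (the choice of ReLU as f and the dimensions are assumptions):

import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    # x_l     = f(W_l x_{l-1} + b_l)
    # x_{l+1} = f(W_{l+1} x_l + b_{l+1} + x_{l-1})
    def __init__(self, dim):
        super().__init__()
        self.layer1 = nn.Linear(dim, dim)
        self.layer2 = nn.Linear(dim, dim)

    def forward(self, x_prev):
        x_l = F.relu(self.layer1(x_prev))
        return F.relu(self.layer2(x_l) + x_prev)  # the skip connection adds the input back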

Page 6: Lecture 10 Recap - GitHub Pages

Inception Layer

6

[Szegedy et al., CVPR’15] GoogLeNet

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 7: Lecture 10 Recap - GitHub Pages

Lecture 11

7I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 8: Lecture 10 Recap - GitHub Pages

Transfer Learning

8I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 9: Lecture 10 Recap - GitHub Pages

Transfer Learning

• Training your own model can be difficult with limited data and other resources

• e.g., it is a laborious task to manually annotate your own training dataset

→ Why not reuse already pre-trained models?

9I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 10: Lecture 10 Recap - GitHub Pages

Transfer Learning

10

(Figure: transfer learning from a large dataset with distribution P1 to a small dataset with distribution P2: use what has been learned for another setting.)

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 11: Lecture 10 Recap - GitHub Pages

[Zeiler et al., ECCV’14] Visualizing and Understanding Convolutional Networks

Transfer Learning for Images

11I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 12: Lecture 10 Recap - GitHub Pages

Transfer Learning

12

Trained on ImageNet

Feature extraction

[Donahue et al., ICML’14] DeCAF, [Razavian et al., CVPRW’14] CNN Features off-the-shelf

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 13: Lecture 10 Recap - GitHub Pages

Transfer Learning

13

Trained on ImageNet

Edges

Simple geometrical shapes (circles, etc)

Parts of an object (wheel, window)

Decision layers

[Donahue et al., ICML’14] DeCAF, [Razavian et al., CVPRW’14] CNN Features off-the-shelf

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 14: Lecture 10 Recap - GitHub Pages

Transfer Learning

14

Trained on ImageNet

New dataset with C classes

TRAIN

FROZEN

[Donahue et al., ICML’14] DeCAF, [Razavian et al., CVPRW’14] CNN Features off-the-shelf

I2DL: Prof. Niessner, Prof. Leal-Taixé
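
A hedged sketch of this feature-extraction setup in PyTorch; the ImageNet-pretrained ResNet-18 backbone and the number of classes C are our own choices for illustration:

import torch.nn as nn
from torchvision import models

C = 5  # number of classes in the new, small dataset (assumed value)

# Backbone trained on ImageNet (ResNet-18 chosen for illustration).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

for p in model.parameters():        # FROZEN: keep the pretrained feature extractor
    p.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, C)  # TRAIN: new classifier for C classes
trainable = [p for p in model.parameters() if p.requires_grad]  # only the new head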

Page 15: Lecture 10 Recap - GitHub Pages

Transfer Learning

15

If the dataset is big enough, train more layers with a low learning rate

TRAIN

FROZEN

I2DL: Prof. Niessner, Prof. Leal-Taixé
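
Continuing the same illustrative ResNet-18 setup: if the dataset is big enough, unfreeze later layers and train them with a low learning rate (the specific layer and the learning rates below are assumptions):

import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 5)      # new head (5 classes assumed)

for p in model.layer4.parameters():                # unfreeze the last residual stage
    p.requires_grad = True

optimizer = torch.optim.SGD([
    {"params": model.layer4.parameters(), "lr": 1e-4},  # low LR for pretrained layers
    {"params": model.fc.parameters(),     "lr": 1e-2},  # higher LR for the new head
], momentum=0.9)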

Page 16: Lecture 10 Recap - GitHub Pages

When Transfer Learning makes Sense

• When task T1 and T2 have the same input (e.g. an RGB image)

• When you have more data for task T1 than for task T2

• When the low-level features for T1 could be useful to learn T2

16I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 17: Lecture 10 Recap - GitHub Pages

Now you are:

• Ready to perform image classification on any dataset

• Ready to design your own architecture

• Ready to deal with other problems such as semantic segmentation (Fully Convolutional Network)

17I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 18: Lecture 10 Recap - GitHub Pages

Recurrent Neural Networks

18I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 19: Lecture 10 Recap - GitHub Pages

Processing Sequences

• Recurrent neural networks process sequence data

• Input/output can be sequences

19I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 20: Lecture 10 Recap - GitHub Pages

RNNs are Flexible

20

Classic Neural Networks for Image Classification

Source: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 21: Lecture 10 Recap - GitHub Pages

RNNs are Flexible

21

Image captioning

Source: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 22: Lecture 10 Recap - GitHub Pages

RNNs are Flexible

22

Language recognition

Source: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 23: Lecture 10 Recap - GitHub Pages

RNNs are Flexible

23

Machine translation

Source: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 24: Lecture 10 Recap - GitHub Pages

RNNs are Flexible

24

Event classification

Source: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 25: Lecture 10 Recap - GitHub Pages

RNNs are Flexible

25

Event classification

Source: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 26: Lecture 10 Recap - GitHub Pages

Basic Structure of an RNN

• Multi-layer RNN

26

Outputs

Inputs

Hidden states

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 27: Lecture 10 Recap - GitHub Pages

Basic Structure of an RNN

• Multi-layer RNN

27

Outputs

Inputs

Hidden states

The hidden state will have its own internal dynamics

More expressive model!

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 28: Lecture 10 Recap - GitHub Pages

Basic Structure of an RNN

• We want to have notion of “time” or “sequence”

28

$A_t = \theta_c A_{t-1} + \theta_x x_t$, where $A_t$ is the hidden state, $A_{t-1}$ the previous hidden state, and $x_t$ the input

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 29: Lecture 10 Recap - GitHub Pages

Basic Structure of an RNN

• We want to have notion of “time” or “sequence”

29

$A_t = \theta_c A_{t-1} + \theta_x x_t$, where $A_t$ is the hidden state and $\theta_c$, $\theta_x$ are the parameters to be learned

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 30: Lecture 10 Recap - GitHub Pages

Basic Structure of an RNN

• We want to have notion of “time” or “sequence”

30

Hidden state: $A_t = \theta_c A_{t-1} + \theta_x x_t$

Output: $h_t = \theta_h A_t$

Note: non-linearities ignored for now

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé
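
A tiny NumPy sketch of this linear recurrence (all dimensions and the random initialization are assumptions):

import numpy as np

# A_t = theta_c A_{t-1} + theta_x x_t,   h_t = theta_h A_t   (non-linearities ignored)
hidden, inp, out = 4, 3, 2
theta_c = 0.1 * np.random.randn(hidden, hidden)
theta_x = 0.1 * np.random.randn(hidden, inp)
theta_h = 0.1 * np.random.randn(out, hidden)

A = np.zeros(hidden)
for x_t in np.random.randn(10, inp):   # a sequence of 10 input vectors
    A = theta_c @ A + theta_x @ x_t    # same parameters at every time step
    h = theta_h @ A                    # output at time t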

Page 31: Lecture 10 Recap - GitHub Pages

Basic Structure of an RNN

• We want to have notion of “time” or “sequence”

31

Hidden state: $A_t = \theta_c A_{t-1} + \theta_x x_t$

Output: $h_t = \theta_h A_t$

Same parameters for each time step = generalization!

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 32: Lecture 10 Recap - GitHub Pages

Basic Structure of an RNN

• Unrolling RNNs

32

Same function for the hidden layers

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 33: Lecture 10 Recap - GitHub Pages

Basic Structure of an RNN

• Unrolling RNNs

33

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 34: Lecture 10 Recap - GitHub Pages

Basic Structure of an RNN

• Unrolling RNNs as feedforward nets

34

(Figure: the RNN unrolled as a feedforward net; the same four weights w1–w4 are reused at every time step for the inputs x_t, x_{t+1}, x_{t+2}.)

Weights are the same!

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 35: Lecture 10 Recap - GitHub Pages

Backprop through an RNN

• Unrolling RNNs as feedforward nets

35

(Figure: the same unrolled network; the weights w1–w4 are shared across all time steps.)

Chain rule

All the way to $t = 0$

Add the derivatives at different times for each weight

I2DL: Prof. Niessner, Prof. Leal-Taixé
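
A small PyTorch sketch (ours) of backprop through the unrolled recurrence: because the same weights are reused at every step, autograd sums their gradient contributions over all time steps, back to t = 0. The toy loss and sizes are assumptions.

import torch

theta_c = (0.1 * torch.randn(4, 4)).requires_grad_()
theta_x = (0.1 * torch.randn(4, 3)).requires_grad_()

x = torch.randn(10, 3)             # a sequence of 10 inputs
A = torch.zeros(4)
loss = 0.0
for t in range(x.shape[0]):
    A = torch.tanh(theta_c @ A + theta_x @ x[t])
    loss = loss + A.pow(2).sum()   # toy loss accumulated at every time step

loss.backward()                    # chain rule, all the way back to t = 0
print(theta_c.grad.shape)          # a single gradient: contributions summed over time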

Page 36: Lecture 10 Recap - GitHub Pages

Long-term Dependencies

36

I moved to Germany … so I speak German fluently.

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 37: Lecture 10 Recap - GitHub Pages

Long-term Dependencies

• Simple recurrence

• Let us forget the input

37

Same weights are multiplied over and over again

$A_t = \theta_c A_{t-1} + \theta_x x_t$

$A_t = \theta_c^t A_0$

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 38: Lecture 10 Recap - GitHub Pages

Long-term Dependencies

• Simple recurrence

38

What happens to small weights? → Vanishing gradient

What happens to large weights? → Exploding gradient

$A_t = \theta_c^t A_0$

I2DL: Prof. Niessner, Prof. Leal-Taixé
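
A scalar illustration (ours) of $A_t = \theta_c^t A_0$ with illustrative weights:

A0 = 1.0
for theta_c in (0.9, 1.1):              # |theta_c| < 1 vs. |theta_c| > 1
    print(theta_c, theta_c ** 50 * A0)  # ~0.005 (vanishes) vs. ~117 (explodes)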

Page 39: Lecture 10 Recap - GitHub Pages

Long-term Dependencies

• Simple recurrence

• If 𝜽 admits eigendecomposition

39

$A_t = \theta_c^t A_0$

$\theta = Q \Lambda Q^T$, where $Q$ is the matrix of eigenvectors and the diagonal of $\Lambda$ contains the eigenvalues

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 40: Lecture 10 Recap - GitHub Pages

Long-term Dependencies

• Simple recurrence

• If 𝜽 admits eigendecomposition

• Orthogonal 𝜽 allows us to simplify the recurrence

40

$A_t = \theta^t A_0$

$\theta = Q \Lambda Q^T$

$A_t = Q \Lambda^t Q^T A_0$

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 41: Lecture 10 Recap - GitHub Pages

Long-term Dependencies

• Simple recurrence

41

What happens to eigenvalues with magnitude less than one? → Vanishing gradient

What happens to eigenvalues with magnitude larger than one? → Exploding gradient (addressed with gradient clipping)

$A_t = Q \Lambda^t Q^T A_0$

I2DL: Prof. Niessner, Prof. Leal-Taixé
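
Exploding gradients are commonly handled with gradient clipping; a hedged PyTorch sketch (the model, the toy loss, and the clipping threshold are illustrative):

import torch
import torch.nn as nn

model = nn.RNN(input_size=3, hidden_size=4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

out, _ = model(torch.randn(10, 1, 3))   # (seq_len, batch, input_size)
loss = out.pow(2).mean()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # rescale large gradients
optimizer.step()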

Page 42: Lecture 10 Recap - GitHub Pages

Long-term Dependencies

• Simple recurrence

42

Let us just make a matrix with eigenvalues = 1

Allow the cell to maintain its “state”

$A_t = \theta_c^t A_0$

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 43: Lecture 10 Recap - GitHub Pages

Vanishing Gradient

• 1. From the weights

• 2. From the activation functions (tanh)

43

$A_t = \theta_c^t A_0$

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 44: Lecture 10 Recap - GitHub Pages

$A_t = \theta^t A_0$

Vanishing Gradient

• 1. From the weights

• 2. From the activation functions (tanh)

44

1?

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 45: Lecture 10 Recap - GitHub Pages

Long Short Term Memory

45

[Hochreiter et al., Neural Computation’97] Long Short-Term Memory

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 46: Lecture 10 Recap - GitHub Pages

Long-Short Term Memory Units

Simple RNN has tanh as non-linearity

46

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 47: Lecture 10 Recap - GitHub Pages

Long-Short Term Memory Units

LSTM

47

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 48: Lecture 10 Recap - GitHub Pages

Long-Short Term Memory Units

• Key ingredients
– Cell = transports the information through the unit

48

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 49: Lecture 10 Recap - GitHub Pages

Long-Short Term Memory Units

• Key ingredients
– Cell = transports the information through the unit
– Gate = remove or add information to the cell state

49

Sigmoid

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 50: Lecture 10 Recap - GitHub Pages

LSTM: Step by Step

• Forget gate: $f_t = \mathrm{sigm}(\theta_{xf} x_t + \theta_{hf} h_{t-1} + b_f)$

50

Decides when to erase the cell state

Sigmoid = output between 0 (forget) and 1 (keep)

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 51: Lecture 10 Recap - GitHub Pages

LSTM: Step by Step

• Input gate: $i_t = \mathrm{sigm}(\theta_{xi} x_t + \theta_{hi} h_{t-1} + b_i)$

51

Decides which values will be updated

New cell state, output from a tanh (−1,1)

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 52: Lecture 10 Recap - GitHub Pages

LSTM: Step by Step

• Element-wise operations

52

Previous states

Current state

$C_t = f_t \odot C_{t-1} + i_t \odot g_t$

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 53: Lecture 10 Recap - GitHub Pages

LSTM: Step by Step

• Output gate: $h_t = o_t \odot \tanh(C_t)$

53

Decides which values will be outputted

Output from a tanh (−1, 1)

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 54: Lecture 10 Recap - GitHub Pages

LSTM: Step by Step

• Forget gate: $f_t = \mathrm{sigm}(\theta_{xf} x_t + \theta_{hf} h_{t-1} + b_f)$

• Input gate: $i_t = \mathrm{sigm}(\theta_{xi} x_t + \theta_{hi} h_{t-1} + b_i)$

• Output gate: $o_t = \mathrm{sigm}(\theta_{xo} x_t + \theta_{ho} h_{t-1} + b_o)$

• Cell update: $g_t = \tanh(\theta_{xg} x_t + \theta_{hg} h_{t-1} + b_g)$

• Cell: $C_t = f_t \odot C_{t-1} + i_t \odot g_t$

• Output: $h_t = o_t \odot \tanh(C_t)$

54I2DL: Prof. Niessner, Prof. Leal-Taixé
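
A direct transcription (ours) of the six equations above into PyTorch; the parameter container, the shapes, and the initialization are assumptions:

import torch

def lstm_step(x_t, h_prev, C_prev, p):
    f_t = torch.sigmoid(p["theta_xf"] @ x_t + p["theta_hf"] @ h_prev + p["b_f"])  # forget gate
    i_t = torch.sigmoid(p["theta_xi"] @ x_t + p["theta_hi"] @ h_prev + p["b_i"])  # input gate
    o_t = torch.sigmoid(p["theta_xo"] @ x_t + p["theta_ho"] @ h_prev + p["b_o"])  # output gate
    g_t = torch.tanh(p["theta_xg"] @ x_t + p["theta_hg"] @ h_prev + p["b_g"])     # cell update
    C_t = f_t * C_prev + i_t * g_t          # element-wise (Hadamard) products
    h_t = o_t * torch.tanh(C_t)
    return h_t, C_t

D, H = 3, 4                                 # toy input and hidden sizes
p = {f"theta_x{g}": 0.1 * torch.randn(H, D) for g in "fiog"}
p.update({f"theta_h{g}": 0.1 * torch.randn(H, H) for g in "fiog"})
p.update({f"b_{g}": torch.zeros(H) for g in "fiog"})
h, C = lstm_step(torch.randn(D), torch.zeros(H), torch.zeros(H), p)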

Page 55: Lecture 10 Recap - GitHub Pages

LSTM: Step by Step

• Forget gate: $f_t = \mathrm{sigm}(\theta_{xf} x_t + \theta_{hf} h_{t-1} + b_f)$

• Input gate: $i_t = \mathrm{sigm}(\theta_{xi} x_t + \theta_{hi} h_{t-1} + b_i)$

• Output gate: $o_t = \mathrm{sigm}(\theta_{xo} x_t + \theta_{ho} h_{t-1} + b_o)$

• Cell update: $g_t = \tanh(\theta_{xg} x_t + \theta_{hg} h_{t-1} + b_g)$

• Cell: $C_t = f_t \odot C_{t-1} + i_t \odot g_t$

• Output: $h_t = o_t \odot \tanh(C_t)$

55

Learned through backpropagation

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 56: Lecture 10 Recap - GitHub Pages

LSTM: Vanishing Gradients?

• Cell

56

• 1. From the weights: the forget gate $f_t$ can be 1 for important information

• 2. From the activation functions: the cell state is passed on through an identity function

$C_t = f_t \odot C_{t-1} + i_t \odot g_t$

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 57: Lecture 10 Recap - GitHub Pages

LSTM

• Highway for the gradient to flow

57

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 58: Lecture 10 Recap - GitHub Pages

LSTM: Dimensions

• Cell update: $g_t = \tanh(\theta_{xg} x_t + \theta_{hg} h_{t-1} + b_g)$

58

(Figure: the cell update with hidden state size 128; the input projection, the hidden projection, and the resulting vectors are all 128-dimensional.)

What operation do I need to do to my input to get a 128-dimensional vector representation?

When coding an LSTM, we have to define the size of the hidden state

Dimensions need to match

[Olah, https://colah.github.io ’15] Understanding LSTMs

I2DL: Prof. Niessner, Prof. Leal-Taixé
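
In a framework, the hidden state size is a single argument; a hedged sketch with hidden size 128 as on the slide (the input dimension, batch size, and sequence length are assumptions):

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=300, hidden_size=128, batch_first=True)

x = torch.randn(8, 20, 300)          # (batch, sequence length, input dim)
out, (h_n, c_n) = lstm(x)
print(out.shape)                     # torch.Size([8, 20, 128]): a 128-vector per step
print(h_n.shape, c_n.shape)          # torch.Size([1, 8, 128]) for both states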

Page 59: Lecture 10 Recap - GitHub Pages

General LSTM Units

59

• Input, states, and gates not limited to 1st-order tensors

• Gate functions can consist of FC and CNN layers

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 60: Lecture 10 Recap - GitHub Pages

ConvLSTM for Video Sequences

60

• Input, hidden, and cell states are higher order tensors (i.e. images)

• Gates have CNN instead of FC layers

(Figure: hidden state, cell state, and gates as feature maps.)

I2DL: Prof. Niessner, Prof. Leal-Taixé
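
A hedged sketch of such a ConvLSTM cell (the kernel size, channel counts, and the single-convolution gate layout are assumptions):

import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    # States are feature maps; the gates are computed by a convolution instead of an FC layer.
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x_t, h_prev, C_prev):
        f, i, o, g = torch.chunk(self.gates(torch.cat([x_t, h_prev], dim=1)), 4, dim=1)
        C_t = torch.sigmoid(f) * C_prev + torch.sigmoid(i) * torch.tanh(g)
        h_t = torch.sigmoid(o) * torch.tanh(C_t)
        return h_t, C_t

cell = ConvLSTMCell(3, 16)                       # RGB frames, 16-channel states
h = C = torch.zeros(1, 16, 64, 64)
for frame in torch.randn(8, 1, 3, 64, 64):       # 8 frames of size 64x64
    h, C = cell(frame, h, C)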

Page 61: Lecture 10 Recap - GitHub Pages

RNNs in Computer Vision

• Caption generation

61

[Xu et al., PMLR’15] Neural Image Caption Generation

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 62: Lecture 10 Recap - GitHub Pages

RNNs in Computer Vision

• Instance segmentation

62

[Romera-Paredes et al., ECCV’16] Recurrent Instance Segmentation

I2DL: Prof. Niessner, Prof. Leal-Taixé

Page 63: Lecture 10 Recap - GitHub Pages

See you next time!

63I2DL: Prof. Niessner, Prof. Leal-Taixé

