Convolutional Neural Networks II CS194: Image Manipulation, Comp. Vision, and Comp. Photo Alexei Efros, UC Berkeley, Spring 2020
Page 1: Convolutional Neural Networks IIcs194-26/sp20/Lectures/...Lecture 7 - 6 27 Jan 2016 Case Study: AlexNet [Krizhevsky et al. 2012] Input: 227x227x3 images First layer (CONV1): 96 11x11


Lecture 7 - 27 Jan 2016Fei-Fei Li & Andrej Karpathy & Justin JohnsonFei-Fei Li & Andrej Karpathy & Justin Johnson Lecture 7 - 27 Jan 20162

Case Study: LeNet-5 [LeCun et al., 1998]

Conv filters were 5x5, applied at stride 1.
Subsampling (pooling) layers were 2x2, applied at stride 2.
I.e., the architecture is [CONV-POOL-CONV-POOL-CONV-FC].


Andrew Ng

ImageNet Challenge (1000 object classes), Fei-Fei et al.


Case Study: AlexNet [Krizhevsky et al. 2012]

Input: 227x227x3 images

First layer (CONV1): 96 11x11 filters applied at stride 4
=> Q: What is the output volume size? Hint: (227-11)/4+1 = 55


First layer (CONV1): 96 11x11 filters applied at stride 4
=> Output volume: [55x55x96]

Q: What is the total number of parameters in this layer?


First layer (CONV1): 96 11x11 filters applied at stride 4
=> Output volume: [55x55x96]
Parameters: (11*11*3)*96 = 34,848 ≈ 35K (weights only; the 96 biases are not counted)
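The two calculations on this slide, the hint (227-11)/4+1 and the count (11*11*3)*96, can be checked with a short sketch. The function names `conv_output_size` and `conv_param_count` are made up for illustration, and the optional bias term is an addition that the slide does not count.

```python
def conv_output_size(input_size, filter_size, stride, pad=0):
    """Spatial output size of a conv (or pool) layer: (W - F + 2P) / S + 1."""
    span = input_size - filter_size + 2 * pad
    if span % stride != 0:
        raise ValueError("filter does not tile the input evenly")
    return span // stride + 1


def conv_param_count(filter_size, in_depth, num_filters, bias=False):
    """Learnable parameters: one F x F x in_depth kernel per filter."""
    weights = filter_size * filter_size * in_depth * num_filters
    return weights + (num_filters if bias else 0)


side = conv_output_size(227, 11, stride=4)
print(f"CONV1 output volume: {side}x{side}x96")
print("CONV1 weights:", conv_param_count(11, 3, 96))
print("CONV1 weights + biases:", conv_param_count(11, 3, 96, bias=True))
```

Counting biases adds only 96 parameters, so the rounded figure of 35K holds either way.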


Input: 227x227x3 images
After CONV1: 55x55x96

Second layer (POOL1): 3x3 filters applied at stride 2

Q: What is the output volume size? Hint: (55-3)/2+1 = 27


Second layer (POOL1): 3x3 filters applied at stride 2
Output volume: 27x27x96

Q: What is the number of parameters in this layer?


Second layer (POOL1): 3x3 filters applied at stride 2
Output volume: 27x27x96
Parameters: 0!
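A pooling layer has no parameters because it only takes a maximum over each window. The sketch below is a plain-Python illustration, not code from the lecture: it max-pools a single 55x55 channel with a 3x3 window at stride 2, and the helper name `max_pool_2d` and the synthetic feature map are hypothetical.

```python
def max_pool_2d(image, size=3, stride=2):
    """Max-pool a 2D list of numbers; note that no weights are involved."""
    out_h = (len(image) - size) // stride + 1
    out_w = (len(image[0]) - size) // stride + 1
    return [
        [
            max(
                image[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# A synthetic 55x55 map standing in for one channel of the CONV1 output.
feature_map = [[r * 55 + c for c in range(55)] for r in range(55)]
pooled = max_pool_2d(feature_map)
print(f"pooled size: {len(pooled)}x{len(pooled[0])}")
```

Pooling is applied to each of the 96 channels independently, which is why the depth stays at 96.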


Input: 227x227x3 images
After CONV1: 55x55x96
After POOL1: 27x27x96
...


Full (simplified) AlexNet architecture:
[227x227x3] INPUT
[55x55x96] CONV1: 96 11x11 filters at stride 4, pad 0
[27x27x96] MAX POOL1: 3x3 filters at stride 2
[27x27x96] NORM1: Normalization layer
[27x27x256] CONV2: 256 5x5 filters at stride 1, pad 2
[13x13x256] MAX POOL2: 3x3 filters at stride 2
[13x13x256] NORM2: Normalization layer
[13x13x384] CONV3: 384 3x3 filters at stride 1, pad 1
[13x13x384] CONV4: 384 3x3 filters at stride 1, pad 1
[13x13x256] CONV5: 256 3x3 filters at stride 1, pad 1
[6x6x256] MAX POOL3: 3x3 filters at stride 2
[4096] FC6: 4096 neurons
[4096] FC7: 4096 neurons
[1000] FC8: 1000 neurons (class scores)
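The volume sizes in the architecture listing follow from applying the output-size formula layer by layer. As a rough check, the sketch below traces the conv and pool layers using the filter sizes, strides and pads from the listing. The `LAYERS` table and `trace_shapes` are illustrative names, and the NORM layers are left out because they do not change the shape.

```python
LAYERS = [
    # (name, kind, num_filters, filter_size, stride, pad)
    ("CONV1", "conv", 96, 11, 4, 0),
    ("POOL1", "pool", None, 3, 2, 0),
    ("CONV2", "conv", 256, 5, 1, 2),
    ("POOL2", "pool", None, 3, 2, 0),
    ("CONV3", "conv", 384, 3, 1, 1),
    ("CONV4", "conv", 384, 3, 1, 1),
    ("CONV5", "conv", 256, 3, 1, 1),
    ("POOL3", "pool", None, 3, 2, 0),
]


def trace_shapes(side=227, depth=3):
    """Return (name, height, width, depth) after every conv/pool layer."""
    shapes = []
    for name, kind, filters, size, stride, pad in LAYERS:
        side = (side - size + 2 * pad) // stride + 1
        if kind == "conv":
            depth = filters  # pooling keeps the depth unchanged
        shapes.append((name, side, side, depth))
    return shapes


for name, h, w, d in trace_shapes():
    print(f"[{h}x{w}x{d}] {name}")

_, h, w, d = trace_shapes()[-1]
print("inputs to FC6:", h * w * d)
```

The last line gives the length of the flattened vector that feeds FC6.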


Details/Retrospectives:
- first use of ReLU
- used Norm layers (not common anymore)
- heavy data augmentation
- dropout 0.5
- batch size 128
- SGD Momentum 0.9
- Learning rate 1e-2, reduced by a factor of 10 manually when val accuracy plateaus
- L2 weight decay 5e-4
- 7 CNN ensemble: 18.2% -> 15.4% (top-5 error)
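The optimizer settings listed above (momentum 0.9, learning rate 1e-2, weight decay 5e-4) can be read as the update rule sketched below. This is a plain-Python illustration on a list of weights, not the original training code. The slide says the learning rate was reduced manually, so `reduce_on_plateau` and its `patience` parameter are hypothetical stand-ins for that manual step.

```python
def sgd_momentum_step(w, grad, velocity, lr=1e-2, momentum=0.9, weight_decay=5e-4):
    """One SGD+momentum update with L2 weight decay on lists of floats."""
    new_w, new_v = [], []
    for wi, gi, vi in zip(w, grad, velocity):
        gi = gi + weight_decay * wi    # gradient of the L2 penalty term
        vi = momentum * vi - lr * gi   # update the running velocity
        new_v.append(vi)
        new_w.append(wi + vi)
    return new_w, new_v


def reduce_on_plateau(lr, val_acc_history, patience=3, factor=0.1):
    """Cut lr by 10x if the last `patience` epochs did not beat the earlier best."""
    if len(val_acc_history) <= patience:
        return lr
    best_before = max(val_acc_history[:-patience])
    recent_best = max(val_acc_history[-patience:])
    return lr * factor if recent_best <= best_before else lr


w, v = sgd_momentum_step([1.0], [0.5], [0.0])
print("updated weight:", w[0])
print("lr after plateau:", reduce_on_plateau(1e-2, [0.5, 0.6, 0.6, 0.6, 0.6]))
```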


(slide from Kaiming He’s recent presentation)


Case Study: ResNet [He et al., 2015]

224x224x3

spatial dimension only 56x56!
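The drop from 224 to 56 can be reproduced with the same output-size formula. The sketch below assumes the stem described in the ResNet paper (a 7x7 convolution at stride 2 followed by a 3x3 max pool at stride 2). The padding values of 3 and 1 are assumptions borrowed from common implementations, not something stated on the slide.

```python
def out_size(side, kernel, stride, pad):
    """Output side length with floor division, as most frameworks compute it."""
    return (side + 2 * pad - kernel) // stride + 1


def resnet_stem_sizes(side=224):
    """Spatial size after the stem conv and after the stem max pool."""
    after_conv = out_size(side, kernel=7, stride=2, pad=3)        # assumed pad 3
    after_pool = out_size(after_conv, kernel=3, stride=2, pad=1)  # assumed pad 1
    return after_conv, after_pool


after_conv, after_pool = resnet_stem_sizes()
print(f"224 -> {after_conv} -> {after_pool}")
```

Two stride-2 layers at the very start quarter the resolution, so the many later layers run on 56x56 maps or smaller.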


“You need a lot of data if you want to train/use CNNs”


Transfer Learning


Deep Features & their Embeddings


The Unreasonable Effectiveness of Deep Features

Classes separate in the deep representations and transfer to many tasks. [DeCAF] [Zeiler-Fergus]


Can be used as a generic feature (“CNN code” = 4096-D vector before classifier)

[Figure: query image → nearest neighbors in the “code” space]
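Retrieval in the “code” space amounts to ranking database images by their distance to the query's feature vector. The sketch below uses made-up 4-D vectors in place of real 4096-D CNN codes and L2 distance as the metric. The names, the values and the choice of metric are all illustrative assumptions.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query, codes, k=3):
    """Return the k database keys whose codes lie closest to `query`."""
    ranked = sorted(codes, key=lambda name: l2_distance(query, codes[name]))
    return ranked[:k]


# Toy 4-D "codes" standing in for 4096-D vectors taken before the classifier.
codes = {
    "beagle_1": [0.9, 0.1, 0.0, 0.2],
    "beagle_2": [0.8, 0.2, 0.1, 0.1],
    "sports_car": [0.0, 0.9, 0.8, 0.1],
    "fire_pit": [0.1, 0.0, 0.2, 0.9],
}
query = [0.9, 0.1, 0.05, 0.2]
print(nearest_neighbors(query, codes, k=2))
```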


ImageNet + Deep Learning

Beagle

- Image Retrieval
- Detection (RCNN)
- Segmentation (FCN)
- Depth Estimation
- …


Pose?

Boundaries?

Geometry?

Parts?

Materials?


Transfer Learning with CNNs

1. Train on ImageNet

2. If small dataset: fix all weights (treat CNN as fixed feature extractor), retrain only the classifier

i.e. swap the Softmax layer at the end

3. If you have a medium-sized dataset, “finetune” instead: use the old weights as initialization, train the full network or only some of the higher layers

i.e. retrain a bigger portion of the network, or even all of it.

Tip: when finetuning, use only ~1/10th of the original learning rate on the top layer, and ~1/100th on intermediate layers
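The recipe above can be summarized as a per-layer learning-rate plan. The sketch below is one possible reading, not a prescribed implementation: a rate of 0.0 means a layer stays frozen, "top layer" is taken to mean the new classifier, and `finetune_plan`, the `"small"`/`"medium"` labels and the `num_finetuned` split are hypothetical choices.

```python
def finetune_plan(layer_names, base_lr, dataset_size, num_finetuned=2):
    """Map each layer name to a learning rate; 0.0 means the layer is frozen.

    layer_names is ordered from input to output; the last entry is the new
    classifier that replaces the original softmax layer.
    """
    *body, classifier = layer_names
    plan = {name: 0.0 for name in body}
    if dataset_size == "small":
        # Fixed feature extractor: only the new classifier is trained.
        plan[classifier] = base_lr
    elif dataset_size == "medium":
        # Finetune: ~1/10 of the original rate on top, ~1/100 just below it.
        plan[classifier] = base_lr / 10
        for name in body[len(body) - num_finetuned:]:
            plan[name] = base_lr / 100
    else:
        raise ValueError("dataset_size must be 'small' or 'medium'")
    return plan


layers = ["conv1", "conv2", "conv3", "conv4", "conv5", "fc6", "fc7", "fc8"]
for size in ("small", "medium"):
    print(size, finetune_plan(layers, base_lr=1e-2, dataset_size=size))
```

In a framework, such a plan would typically become per-parameter-group learning rates in the optimizer, with frozen layers excluded from gradient computation.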


Learning an Embedding

[Figure: Image 1 and Image 2 each pass through a CNN with shared weights W, producing an embedding (shared representation).]


Learning an Embedding

[Figure: Image 1 and Image 2 each pass through a CNN with shared weights W; the shared representation is used for matching.]


Siamese Network w/ Contrastive Loss

Siamese Architecture [Chopra 2005, Hadsell 2006]


Learning Visual Similarity for Product Design with Convolutional Neural Networks
Sean Bell and Kavita Bala
Cornell University


The Problem

(1) “What is this?” (2) “Where is it used?”

Challenge: determine whether these are the same product (different resolution, viewpoint, color, lighting, occlusions)

Name: “Great Bowl O’Fire Sculptural Fire Bowl”
Category: Fire pit
Sold by: John T. Unger, LLC


Two Kinds of Images

Iconic (from a product website)
In context (cropped from a scene photo)


Projecting into a Joint Embedding

[Figure: iconic and in-context images are projected into a single embedding.]


Search Using the Embedding

“What is it?”

[Figure: query looked up in the embedding.]

Name: Hemel Ring
Category: Hanging light
Sold by: Holly Hunt


Search Using the Embedding

“Where is it used?”

[Figure: query looked up in the embedding.]


Contrastive Loss: Positive Example

[Figure: an in-context image and an iconic image of the same product pass through CNNs with shared parameters θ, giving embedding points x_q and x_p; the loss L_p is computed between them.]


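Contrastive Loss: Negative Example

[Figure: an in-context image and an iconic image of a different product pass through CNNs with shared parameters θ, giving embedding points x_q and x_n; the loss L_n is computed between them using a margin m.]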


Contrastive Loss: All Together

[Chopra 2005, Hadsell 2006]

Minimize L(θ) with stochastic gradient descent and momentum

Margin
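The formula from this slide did not survive in the transcript. The sketch below uses one common form of the contrastive loss, in the style of Hadsell et al. 2006 up to constant factors, and may differ in detail from the one shown in the lecture. Matching pairs are penalized by their squared distance, and non-matching pairs are penalized only while they are closer than the margin m. The function names and toy vectors are illustrative.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(x_q, x_other, same, margin=1.0):
    """Loss for one pair of embeddings (assumed form, see the note above)."""
    d = euclidean(x_q, x_other)
    if same:
        return d ** 2                   # L_p: pull matching pairs together
    return max(0.0, margin - d) ** 2    # L_n: push others out to the margin


x_q = [0.0, 0.0]
print("positive pair:", contrastive_loss(x_q, [3.0, 4.0], same=True))
print("negative, far:", contrastive_loss(x_q, [3.0, 4.0], same=False))
print("negative, near:", contrastive_loss(x_q, [0.5, 0.0], same=False))
```

A negative pair that already lies beyond the margin contributes zero loss, so training effort goes to the pairs that are still confusable.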


Training Pipeline

[Figure: image pairs drawn from the image database pass through the CNN embedding; stochastic gradient descent updates the CNN parameters θ.]


Results: “What is it?”

[Figure columns: In context | Iconic | Top 4 results]


Comparison: Trained Only on Categories

[Figure columns: Iconic | In context | Top 4 results]


Comparison: Trained Only on ImageNet

[Figure columns: Iconic | In context | Top 4 results]


Results: Failure Case

[Figure columns: In context | Iconic | Top 4 results]


Results: “Where is it used?”

“Maskros Pendant Lamp”


Results: “Where is it used?”

“LEM Piston Stool | Design Within Reach”


Searching Across Categories


Designing loss functions

[Figure columns: Input | Zhang et al. 2016 | Ground truth]

Color distribution cross-entropy loss with colorfulness enhancing term.

[Zhang, Isola, Efros, ECCV 2016]


Designing loss functions

Image colorization [Zhang et al. 2016]: cross-entropy loss, with colorfulness term

Super-resolution [Johnson et al. 2016]: “semantic feature loss” (VGG feature covariance matching objective)


Universal loss?


Generative Adversarial Networks (GANs)

[Goodfellow, Pouget-Abadie, Mirza, Xu, Warde-Farley, Ozair, Courville, Bengio 2014]

[Figure: real photos and generated images are fed to a “generated vs. real” classifier.]


Generator

[Goodfellow et al., 2014]


G tries to synthesize fake images that fool D

D tries to identify the fakes

Generator Discriminator

real or fake?

[Goodfellow et al., 2014]


fake (0.9)

real (0.1)

[Goodfellow et al., 2014]


G tries to synthesize fake images that fool D:

real or fake?

[Goodfellow et al., 2014]


G tries to synthesize fake images that fool the best D:

real or fake?

[Goodfellow et al., 2014]


Loss Function

G’s perspective: D is a loss function.

Rather than being hand-designed, it is learned.

[Isola et al., 2017] [Goodfellow et al., 2014]

+ L1
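To make “D is a loss function” concrete, the sketch below writes both players' losses as functions of D's output, taken here to be the probability that an image is real. The generator loss uses the non-saturating form. The optional weighted L1 term is one reading of the “+ L1” note, in the spirit of [Isola et al., 2017]. The function names and the `l1_weight` parameter are illustrative assumptions.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: D wants d_real near 1 and d_fake near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake, l1_error=0.0, l1_weight=0.0):
    """G is scored by D's output on its fakes, plus an optional L1 term."""
    return -math.log(d_fake) + l1_weight * l1_error


# D is undecided on both inputs.
print("D loss:", discriminator_loss(d_real=0.5, d_fake=0.5))
# G's loss falls as D is fooled more often.
print("G loss, D not fooled:", generator_loss(d_fake=0.1))
print("G loss, D fooled:", generator_loss(d_fake=0.9))
```

Because `d_fake` comes from a trained network rather than a fixed formula, the generator's loss changes as D improves.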


real! (“Aquarius”)

[Goodfellow et al., 2014]


real or fake pair?

[Goodfellow et al., 2014] [Isola et al., 2017]


fake pair

[Goodfellow et al., 2014] [Isola et al., 2017]


real pair

[Goodfellow et al., 2014] [Isola et al., 2017]


BW → Color

Data from [Russakovsky et al. 2015]


[Figure columns: Input | Output | Ground truth]

Data from [maps.google.com]


Labels → Facades

[Figure columns: Input | Output]

Data from [Tylecek, 2013]


Day → Night

[Figure: three Input | Output pairs]

Data from [Laffont et al., 2014]


Thermal → RGB


Edges → Images

[Figure: three Input | Output pairs]

Edges from [Xie & Tu, 2015]


Sketches → Images

[Figure: three Input | Output pairs]

Trained on Edges → Images

Data from [Eitz, Hays, Alexa, 2012]


#edges2cats [Christopher Hesse]

Ivy Tasi @ivymyt

Vitaly Vidmirov @vvid

@gods_tail

@ka92


Twitter-driven research: #pix2pix

Bertrand Gondouin @bgondouin

Brannon Dorsey @brannondorsey

Mario Klingemann @quasimondo


© Memo Akten, “Learning to See: Gloomy Sunday”


Scott Eaton (http://www.scott-eaton.com/)


“Do as I Do”

OpenPose

pix2pix


Everybody Dance Now
Caroline Chan, Shiry Ginosar, Tinghui Zhou, Alexei A. Efros
UC Berkeley

[Figure columns: Source Subject | Target Subject]


Results

https://www.youtube.com/watch?v=PCBTZh41Ris&feature=youtu.be


CEO: our own Dr. Tinghui Zhou


[Figure columns: Paired training examples | Unpaired training examples]


CycleGAN, or “there and back aGAN”

[Zhu*, Park*, Isola, Efros. ICCV 2017]


Cycle-Consistency Loss

$x \to G(x) \to F(G(x))$

$\lVert F(G(x)) - x \rVert_1$

$x \to G(x) \to F(G(x))$, $y \to F(y) \to G(F(y))$

Cycle-Consistency Loss

$\lVert F(G(x)) - x \rVert_1$, $\lVert G(F(y)) - y \rVert_1$
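
The two cycle terms above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: images are stood in for by flat lists of floats, the two terms are simply summed without any weighting or averaging, and the toy mappings `G`, `F`, and `F_bad` are hypothetical stand-ins for the learned translators.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two equal-length lists."""
    return sum(abs(p - q) for p, q in zip(a, b))


def cycle_consistency_loss(x, y, G, F):
    """||F(G(x)) - x||_1 + ||G(F(y)) - y||_1 for one sample from each domain."""
    forward = l1_distance(F(G(x)), x)   # x -> G(x) -> F(G(x)) should return to x
    backward = l1_distance(G(F(y)), y)  # y -> F(y) -> G(F(y)) should return to y
    return forward + backward


# Hypothetical toy mappings: G doubles every value, F halves it (exact inverses).
def G(v):
    return [2.0 * p for p in v]


def F(v):
    return [0.5 * p for p in v]


# A deliberately imperfect inverse, to show a non-zero loss.
def F_bad(v):
    return [0.5 * p + 1.0 for p in v]


x = [1.0, 2.0, 3.0]
y = [4.0, 6.0]
loss_perfect = cycle_consistency_loss(x, y, G, F)
loss_imperfect = cycle_consistency_loss(x, y, G, F_bad)
```

When `F` exactly inverts `G`, both cycles return to their starting point and the loss is zero; any mismatch between the two mappings shows up as a positive penalty.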

Video

Collection Style Transfer

Van Gogh

Cezanne

Monet

Ukiyo-e

Photograph © Alexei Efros

CG to Real: Grand Theft Auto

Real to CG

Shallower depth of field

Failure case

A Neural Algorithm of Artistic Style

Gatys, Ecker, Bethge (arXiv 2015)

Van Gogh (1889)

Picasso (1910)

Munch (1893)

Turner (1805)

Kandinsky (1913)

Early Vision Texture Models

Linear filter bank

Heeger & Bergen (1995), Portilla & Simoncelli (2000)

Heeger & Bergen, SIGGRAPH ’95

Start with a noise image as output.

Main loop:

• Match pixel histogram of output image to input

• Decompose input and output images using multi-scale filter bank (Steerable Pyramid)

• Match subband histograms of input and output pyramids

• Reconstruct input and output images (collapse the pyramids)

Heeger, Bergen, Pyramid-based texture analysis/synthesis, SIGGRAPH 1995
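
The "match histogram" steps in the loop above can be illustrated with rank-based matching on flat lists of values. This is a simplified sketch, not the authors' code: it assumes both inputs have the same number of samples, it omits the steerable pyramid entirely, and the function name `match_histogram` is made up for illustration.

```python
def match_histogram(source, reference):
    """Give `source` the value distribution of `reference`.

    Keeps the rank order of `source`: the smallest source value is replaced
    by the smallest reference value, the second smallest by the second
    smallest, and so on. Assumes equal-length flat lists (a simplification).
    """
    if len(source) != len(reference):
        raise ValueError("this sketch assumes equal-length inputs")
    # Indices of `source`, ordered from smallest to largest value.
    order = sorted(range(len(source)), key=lambda i: source[i])
    ref_sorted = sorted(reference)
    matched = [0.0] * len(source)
    for rank, idx in enumerate(order):
        matched[idx] = ref_sorted[rank]
    return matched


noise = [0.9, 0.1, 0.5, 0.3]   # stand-in for the noise (output) image
texture = [10, 20, 30, 40]     # stand-in for the input texture
result = match_histogram(noise, texture)
# result == [40, 10, 30, 20]: same values as `texture`, arranged by the ranks of `noise`
```

In the full method the same operation would be applied once to the pixels and then to each subband of the pyramid, repeated over several iterations.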

Multi-scale filter decomposition

Filter bank

Input image

Filter response histograms

Simoncelli & Portilla ’98+

Match joint histograms of pairs of filter responses at adjacent spatial locations, orientations, and scales.

Optimize using repeated projections onto statistical constraint surfaces.

Texture Synthesis

Image Space, Model Space

Images with equal model response

Portilla & Simoncelli (2000)

Convolutional Neural Network Texture Model

Convolutional Neural Network

Gatys et al. (NIPS 2015)

CNN - Multiscale Filter Bank

Layers: conv1_1, pool1, pool2, pool3, pool4

# features: 64, 64, 128, 256, 512

CNN - Texture Features

CNN - Texture Features

# features: 64, 64, 128, 256, 512

Gram Matrices
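
A Gram matrix of a layer's feature maps collects the inner products between every pair of channels, summed over spatial positions, which is the texture statistic used by Gatys et al. A minimal sketch in plain Python, assuming each channel has already been flattened to a list of responses (the name `gram_matrix` and the toy numbers are illustrative only):

```python
def gram_matrix(features):
    """G[i][j] = sum over positions k of features[i][k] * features[j][k].

    `features` is a list of C channels, each a flat list of N responses.
    The result is a C x C matrix (list of lists).
    """
    return [
        [sum(a * b for a, b in zip(fi, fj)) for fj in features]
        for fi in features
    ]


# Two channels, three spatial positions.
features = [[1, 2, 3],
            [0, 1, 0]]
gram = gram_matrix(features)          # [[14, 2], [2, 1]]

# The same responses with spatial positions permuted identically in each channel.
shuffled = [[3, 1, 2],
            [0, 0, 1]]
gram_shuffled = gram_matrix(shuffled)  # also [[14, 2], [2, 1]]
```

Because the sum runs over positions, rearranging where the responses occur leaves the matrix unchanged, so the statistic records which features co-occur but not where.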

Texture Synthesis

Test Julesz’ Conjecture

CNN - Texture Synthesis

Gatys et al. (NIPS 2015)

Artistic Style Transfer

Relative Weighting of Content and Style

1e-4, 1e-3, 1e-2, 1e-1
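
In Gatys et al. the stylized image minimizes a weighted sum of a content term and a style term, so the values on this slide can be read as different trade-offs between the two. A minimal sketch, assuming the values are content-to-style ratios alpha/beta with beta fixed to 1 (both assumptions; the function name and the toy loss values are hypothetical):

```python
def total_loss(content_loss, style_loss, ratio):
    """alpha * content_loss + beta * style_loss, with ratio = alpha / beta.

    beta is fixed to 1.0 here as an assumption, so alpha equals the ratio.
    """
    alpha, beta = ratio, 1.0
    return alpha * content_loss + beta * style_loss


# Toy loss values, chosen only to show how the ratio shifts the balance.
content_loss, style_loss = 100.0, 1.0
ratios = [1e-4, 1e-3, 1e-2, 1e-1]
losses = [total_loss(content_loss, style_loss, r) for r in ratios]
# Fraction of each total that comes from the content term.
content_share = [r * content_loss / t for r, t in zip(ratios, losses)]
```

As the ratio grows, the content term takes up a larger share of the objective, so the result stays closer to the photograph and picks up less of the painting's texture.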

Different Reconstruction Layers

Conv2_2 Conv4_2

Different Reconstruction Layers

Conv2_2 Original Conv4_2

General Style Transfer
