Date post: 22-Jun-2018
Source: Aris Anagnostopoulos, aris.me/contents/teaching/data-mining-2017/slides/Deep_Learning_pt...

Deep Learning with CNNs

University of Rome "La Sapienza", Dep. of Computer, Control and Management Engineering A. Ruberti

Valsamis Ntouskos, ALCOR Lab

Outline

• Introduction - Motivation

• Theoretical aspects

• Brief history of CNNs

• Evolution of CNNs for image classification

• Applications of CNNs in computer vision

Compositional Models

Learned End-to-End

Hierarchy of Representations

- vision: pixel, motif, part, object

- text: character, word, clause, sentence

- speech: audio, band, phone, word

concrete → abstract (via learning)

Back-propagation jointly learns all of the model parameters to optimize the output for the task.

Slides from Caffe framework tutorial @ CVPR2015

Motivation

Up to now we treated inputs as general feature vectors

In some cases inputs have special structure:

• Audio

• Images

• Videos

Signals: Numerical representations of physical quantities

Deep learning can be applied directly to signals by using suitable operators

Audio: 1D data - (variable length) vectors

. . . 0.0468 0.0468 0.0468 0.0390 0.0390 0.0390 0.0546 0.0625 0.0625 0.0390 0.0312 0.0468 0.0625 . . .

Images: 2D data - matrices

Video: a sequence of images sampled through time - 3D data

What is a CNN?

Some theory

Convolution

From Steve Seitz and Richard Szeliski's slides (https://courses.cs.washington.edu/courses/cse576/08sp/)

Interactive examples: http://setosa.io/ev/image-kernels/

• Image filtering is based on convolution with special kernels
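
As a rough illustration of filtering by convolution, the sketch below slides a small kernel over a 2D image stored as nested lists. The function name `convolve2d`, the "valid" (no padding, stride 1) convention, and the two example kernels are assumptions made for this example; they are not taken from the slides.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D convolution of nested lists (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    # True convolution flips the kernel in both directions before sliding it.
    flipped = [row[::-1] for row in kernel[::-1]]
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * flipped[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Two classic "special kernels": a 3x3 box blur and a horizontal-gradient filter.
box_blur = [[1 / 9] * 3 for _ in range(3)]
sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

# A 4x4 image with a vertical edge between the dark and bright halves.
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
edges = convolve2d(image, sobel_x)   # strong response along the edge
smooth = convolve2d(image, box_blur)  # local averages
```

Many deep learning frameworks skip the kernel flip (cross-correlation); since the kernel is learned, the distinction does not matter in a CNN.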

Pooling

• Introduces subsampling
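
A minimal sketch of the subsampling effect, assuming max pooling with a 2x2 window and stride 2 (the slide does not specify the pooling type or sizes; `max_pool2d` is a name chosen for this example):

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Max pooling over a 2D nested list; each output is the max of one window."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(feature_map[i * stride + a][j * stride + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


fmap = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
pooled = max_pool2d(fmap)  # 4x4 input becomes a 2x2 output
```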

Activation

Standard way to model a neuron: $f(x) = \tanh(x)$ or $f(x) = (1 + e^{-x})^{-1}$

Very slow to train (saturation)

Non-saturating nonlinearity (ReLU): $f(x) = \max(0, x)$

Quick to train
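
The saturation argument can be checked numerically. The sketch below (helper names are illustrative) evaluates the derivatives of the three activations: for large inputs the sigmoid and tanh gradients nearly vanish, while the ReLU gradient stays at 1 for any positive input.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1.0 - s)


def d_tanh(x):
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    return max(0.0, x)


def d_relu(x):
    # Derivative of max(0, x); the value at exactly 0 is a convention (0 here).
    return 1.0 if x > 0 else 0.0


for x in (0.0, 2.0, 10.0):
    print(x, d_sigmoid(x), d_tanh(x), d_relu(x))
```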

Regularization

Dropout

• Applied on the fully-connected layers

• During training, drop (prune) each node independently with probability α

• During testing, all nodes are kept and their outputs are weighted by the retention probability 1 − α

Image from Srivastava et al., "Dropout: A Simple Way to Prevent Neural Networks from Overfitting"
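
A minimal sketch of the two dropout phases, assuming α denotes the drop probability, so that test-time scaling uses the retention probability 1 − α (with α = 0.5 the two values coincide). Function names and the seeded generator are choices made for this example.

```python
import random


def dropout_train(activations, alpha, rng):
    """Training: zero each activation independently with probability alpha."""
    return [0.0 if rng.random() < alpha else a for a in activations]


def dropout_test(activations, alpha):
    """Testing: keep every node, scaled by the retention probability 1 - alpha."""
    return [a * (1.0 - alpha) for a in activations]


rng = random.Random(0)  # fixed seed so the example is reproducible
acts = [1.0] * 10000
train_mean = sum(dropout_train(acts, 0.5, rng)) / len(acts)
test_mean = sum(dropout_test(acts, 0.5)) / len(acts)
```

The scaling makes the expected activation at test time match the average seen during training, which is why `train_mean` and `test_mean` end up close to each other.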

Every convolutional layer of a CNN transforms the 3D input volume to a 3D output volume of neuron activations.

A regular 3-layer Neural Network

Material from Fei-Fei’s group

Each neuron is connected to a local region in the input volume spatially, but to all channels

The neurons still compute a dot product of their weights with the input followed by a non-linearity
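
The computation of a single such neuron could be sketched as follows. The `volume[channel][row][col]` layout, the square window, the bias term, and the choice of ReLU as the non-linearity are assumptions for illustration.

```python
def neuron_output(volume, weights, bias, top, left):
    """One neuron: dot product over a k x k spatial window and ALL channels,
    followed by a ReLU non-linearity.

    volume[c][y][x] is the input; weights[c][a][b] has one k x k slice per channel.
    """
    k = len(weights[0])  # spatial extent of the (square) window
    z = bias
    for c in range(len(volume)):          # every input channel
        for a in range(k):                # local rows only
            for b in range(k):            # local columns only
                z += weights[c][a][b] * volume[c][top + a][left + b]
    return max(0.0, z)


# Two-channel 3x3 input of ones, one 2x2 window of weights per channel.
volume = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]
weights = [[[0.5] * 2 for _ in range(2)] for _ in range(2)]
out = neuron_output(volume, weights, bias=-1.0, top=0, left=0)
```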

Material from Fei-Fei’s group

Terminology

• Kernel: matrix corresponding to convolution / filter

• Depth: number of feature maps / filters (d)

• Depth slice: a single feature map

• Padding: additional zero-filled outer rows/columns (p)

• Stride: step of sliding kernel (s)

– e.g. value 1 takes one pixel at a time

• Receptive field: 2D dimensions of kernel ($w_k \times h_k$)

• Weight or parameter sharing: parameters of the filter are shared through a depth slice

– i.e. parameters are the same for all units of the same feature map
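
These quantities determine the size of a convolutional layer's output and, thanks to weight sharing, its parameter count. The sketch below uses the usual relation (w_in − w_k + 2p) / s + 1 for the output width; the input width `w_in`, the input channel count `c_in`, the one-bias-per-filter convention, and the function names are assumptions not stated on the slide.

```python
def conv_output_size(w_in, w_k, p, s):
    """Output width (or height) of a convolution along one spatial dimension."""
    span = w_in - w_k + 2 * p
    if span < 0 or span % s != 0:
        raise ValueError("kernel does not tile the padded input with this stride")
    return span // s + 1


def conv_param_count(w_k, h_k, c_in, d):
    """Parameters with weight sharing: one kernel (plus one bias) per feature map."""
    return (w_k * h_k * c_in + 1) * d


# Illustrative numbers: 227-pixel-wide RGB input, 11x11 kernels, stride 4, 96 filters.
out_w = conv_output_size(227, 11, p=0, s=4)
n_params = conv_param_count(11, 11, c_in=3, d=96)
```

Without sharing, every output unit would need its own weights, so the count would be multiplied by the number of spatial positions in each feature map.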

Algorithms

• Each* neuron/layer is differentiable!

• Just apply backpropagation (chain-rule)

• Use standard gradient-based optimization algorithms (SGD, AdaGrad, …)

• The devil lies in the details though …

▪ Choosing hyperparameters / loss-function

▪ Exploding/Vanishing gradients – batch normalization

▪ Overfitting – Regularization

▪ Cost of performing experiments

▪ Convergence

▪ …

* What about max-pooling?
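
One common answer to the max-pooling question, sketched under the assumption that a subgradient is used: the max is differentiable wherever the largest value is unique, and backpropagation simply routes the upstream gradient to the entry that won. The function name and the flat-window representation are illustrative.

```python
def max_pool_backward(window, upstream_grad):
    """Backward pass of max over a flat window.

    The whole upstream gradient goes to the arg-max entry; all other
    entries receive zero. Ties are broken by taking the first maximum.
    """
    winner = max(range(len(window)), key=lambda i: window[i])
    return [upstream_grad if i == winner else 0.0 for i in range(len(window))]


grads = max_pool_backward([1.0, 5.0, 3.0, 2.0], upstream_grad=2.0)
```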

Kernels and feature maps

Material from Fei-Fei’s group

Brief history of CNNs

Foundational work done in the middle of the 1900s

• 1940s-1960s: Cybernetics [McCulloch and Pitts 1943, Hebb 1949, Rosenblatt 1958]

• 1980s-mid 1990s: Connectionism [Rumelhart 1986, Hinton 1989]

• 1990s: modern convolutional networks [LeCun et al. 1998], LSTM [Hochreiter & Schmidhuber 1997], MNIST and other large datasets

Hubel & Wiesel [60s]: simple & complex cells architecture

Fukushima’s Neocognitron [70s]

Yann LeCun’s early CNNs [80s]

Convolutional Networks: 1989

LeNet: a layered model composed of convolution and subsampling operations followed by a holistic representation and ultimately a classifier for handwritten digits. [LeNet]

Recent success

• Parallel Computation (GPU)

• Larger training sets

• International Competitions

• Theoretical advancements

– Dropout

– ReLUs

– Batch Normalization

Larger training sets: ImageNet

• Over 15M labeled high-resolution images

• Roughly 22K categories

• Collected from the web and labeled via Amazon Mechanical Turk

Competitions: ILSVRC

• Annual competition of image classification at large scale

• 1.2M images in 1K categories

• Classification: make 5 guesses about the image label
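
The "5 guesses" rule is what the top-5 error measures: a prediction counts as correct if the true label is among the five highest-scoring classes. A small sketch, with `top_k_error` and the toy scores as assumptions for illustration:

```python
def top_k_error(score_rows, labels, k=5):
    """Fraction of examples whose true label is NOT among the k best guesses."""
    errors = 0
    for scores, label in zip(score_rows, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if label not in ranked[:k]:
            errors += 1
    return errors / len(labels)


# Toy problem with 6 classes and 2 images.
scores = [
    [0.10, 0.20, 0.30, 0.40, 0.50, 0.05],
    [0.10, 0.20, 0.30, 0.40, 0.50, 0.05],
]
labels = [5, 0]  # class 5 is ranked last, class 0 is ranked fifth
top5 = top_k_error(scores, labels, k=5)
top1 = top_k_error(scores, labels, k=1)
```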

[Figure: example classes EntleBucher and Appenzeller]

Evolution of CNNs for image classification

Convolutional Nets: 2012

AlexNet: a layered model composed of convolution, subsampling, and further operations followed by a holistic representation and all-in-all a landmark classifier on ILSVRC12. [AlexNet]

Convolutional Nets: 2014

ILSVRC14 Winners: ~6.6% Top-5 error

- GoogLeNet: composition of multi-scale dimension-reduced modules

- VGG: 16 layers of 3x3 convolution interleaved with max pooling + 3 fully-connected layers

+ depth

+ data

+ dimensionality reduction

Convolutional Nets: 2015

ResNet

ILSVRC15 Winner: ~3.6% Top-5 error

Intuition: it is easier to learn a zero residual than an identity mapping
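
The intuition can be made concrete with a toy residual block computing y = x + F(x). In this simplified sketch F is two small dense layers with a ReLU in between (real ResNets use convolutions and batch normalization, and apply a further ReLU after the addition); all names are illustrative. Setting the weights of F to zero makes the block an exact identity, so a layer that is not needed only has to drive its weights toward zero.

```python
def matvec(W, v):
    """Matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def relu_vec(v):
    return [max(0.0, x) for x in v]


def residual_block(x, W1, W2):
    """y = x + F(x), with F(x) = W2 * relu(W1 * x)."""
    f = matvec(W2, relu_vec(matvec(W1, x)))
    return [xi + fi for xi, fi in zip(x, f)]


x = [1.0, -2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
y = residual_block(x, zeros, zeros)  # F(x) = 0, so the block passes x through
```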

Reasonable questions

• Is this just for a particular dataset? – No!

Slides from ICCV 2015 Math of Deep Learning tutorial

• Is this just for a particular task? – No!

Object Localization [R-CNN, HyperColumns, OverFeat, etc.]

Pose estimation [Tompson et al., CVPR’15]

Semantic Segmentation [Pinheiro, Collobert, Dollar, ICCV’15]

Fine Tuning

Take a pre-trained model and fine-tune to new tasks [DeCAF] [Zeiler-Fergus] [OverFeat]

Lots of Data (ImageNet) → Your Task (Style Recognition)

Dogs vs. Cats: top 10 in 10 minutes (© kaggle.com)
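
As a toy illustration of the idea, the sketch below keeps a "pre-trained" feature layer frozen and trains only a new classification head with SGD on the logistic loss. This is the simplest variant (in practice the earlier layers are often updated too, with a small learning rate); the tiny hand-written weights, data, and function names are all hypothetical.

```python
import math


def features(x, frozen_W):
    """Frozen 'pre-trained' layer: ReLU(W x). Never updated below."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in frozen_W]


def predict(x, frozen_W, head):
    """Probability of class 1 from the new head on top of the frozen features."""
    z = sum(h * f for h, f in zip(head, features(x, frozen_W)))
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune_head(data, frozen_W, head, lr=0.1, epochs=200):
    """Train only the head; the gradient of the logistic loss w.r.t. z is (p - y)."""
    head = list(head)
    for _ in range(epochs):
        for x, y in data:
            f = features(x, frozen_W)
            err = predict(x, frozen_W, head) - y
            head = [h - lr * err * fi for h, fi in zip(head, f)]
    return head


frozen_W = [[1.0, 0.0], [0.0, 1.0]]  # stands in for weights learned on a large dataset
data = [([2.0, 0.0], 1), ([0.0, 2.0], 0), ([3.0, 0.5], 1), ([0.5, 3.0], 0)]
head = fine_tune_head(data, frozen_W, [0.0, 0.0])
```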

Pixelwise Prediction

Fully convolutional networks for pixel prediction, in particular semantic segmentation

- end-to-end learning

- efficient inference and learning: 100 ms per-image prediction

- multi-modal, multi-task

Applications

- semantic segmentation

- denoising

- depth estimation

- optical flow

Jon Long* & Evan Shelhamer*, Trevor Darrell. CVPR’15

Dealing with sequences

Recurrent Nets and Long Short Term Memories (LSTM) are sequential models

- video

- language

- dynamics

learned by backpropagation through time

Recurrent Networks for Sequences

LRCN: Long-term Recurrent Convolutional Network

- activity recognition (sequence-in)

- image captioning (sequence-out)

- video captioning (sequence-to-sequence)

LRCN: recurrent + convolutional for visual sequences

Visual Sequence Tasks

Based on long short-term memory (LSTM) layers

Jeff Donahue et al. CVPR’15

What’s next?

Various questions/problems are still open:

• Learning with constraints / on manifolds

• Using high-level knowledge/structure

• Exploring the mathematics of the networks

– what types of functions can they represent?

– are these functions useful/interesting?

– convergence/efficiency

• Rotation invariance (group operators)

• CNNs can be easily ‘fooled’

• …

Resources

Frameworks:

• Caffe/Caffe 2 (UC Berkeley) | C/C++, Python, Matlab

• TensorFlow (Google) | C/C++, Python, Java, Go

• Theano (U Montreal) | Python

• CNTK (Microsoft) | Python, C++, C#/.NET, Java

• Torch/PyTorch (Facebook) | Lua/Python

• MXNet (DMLC) | Python, C++, R, Perl, …

• Darknet (Redmon J.) | C

• …

High-level libraries:

• Keras | Backends: TensorFlow (TF), Theano

Models:

• Depends on the framework, e.g.

– https://github.com/BVLC/caffe/wiki/Model-Zoo (Caffe)

– https://github.com/tensorflow/models/tree/master/research (TF)

Interactive Interfaces:

• DIGITS (NVIDIA) | Caffe, TF, Torch

• TensorBoard (TF)

Tools:

• http://ethereon.github.io/netscope (for networks defined in protobuf)
