DEEP LEARNING WITH GPUS Maxim Milakov, Senior HPC DevTech Engineer, NVIDIA
Page 1: DEEP LEARNING WITH GPUS - Meetupfiles.meetup.com/17372912/DeepLearningWithGPUs.pdf · 2014-10-29 · NVIDIA Tesla K40 NVIDIA Jetson TK1 CUDA cores 2880 192 Peak performance, SP 4.29

TOPICS COVERED

Convolutional Networks

Deep Learning

Use Cases

GPUs

cuDNN

MACHINE LEARNING

Training

Train the model from supervised data

Classification (inference)

Run the new sample through the model to predict its class/function value

[Diagram: training uses samples and labels to produce a model; classification uses the model and new samples to produce labels]
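The two phases on this slide can be sketched in a few lines of Python. This is only an illustration: the nearest-centroid model, the function names train and classify, and the toy data are assumptions and do not come from the slides.

```python
# Minimal sketch: "training" builds a model from labeled samples,
# "classification" runs a new sample through that model.
# The nearest-centroid model is an illustrative assumption.

def train(samples, labels):
    """Return a model: one mean vector (centroid) per class label."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def classify(model, sample):
    """Return the label whose centroid is closest to the sample."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(model, key=lambda y: sq_dist(model[y], sample))


# Toy supervised data: two samples per class.
samples = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.0]]
labels = ["a", "a", "b", "b"]

model = train(samples, labels)         # training phase
print(classify(model, [0.1, 0.2]))     # inference on a new sample
print(classify(model, [0.9, 0.8]))
```

Training runs once over the labeled data, while classification is a cheap lookup against the stored model. Deep networks keep the same split but replace the centroids with learned layer weights.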

ARTIFICIAL NEURAL NETWORKS

Deep nets: networks with multiple hidden layers

Usually trained with backpropagation

Deep networks

[Figure: network with inputs X1 to X4, hidden units Z1,1 to Z1,3 and Z2,1 to Z2,3, and outputs Y1 and Y2]
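A forward pass through a network of the shape in the figure (4 inputs, two hidden layers of 3 units, 2 outputs) might look like the sketch below. The sigmoid activation, the constant weights, and the helper names dense, forward, and make_layer are assumptions for illustration; the slides do not specify them, and backpropagation is not shown.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dense(inputs, weights, biases, activation):
    """One fully-connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(x, layers):
    """Feed x through each (weights, biases) pair in turn."""
    for weights, biases in layers:
        x = dense(x, weights, biases, sigmoid)
    return x


def make_layer(n_out, n_in, value):
    """Hypothetical constant weights, used only to show the shapes."""
    return ([[value] * n_in for _ in range(n_out)], [0.0] * n_out)


# X (4 inputs) -> Z1 (3 units) -> Z2 (3 units) -> Y (2 outputs)
layers = [make_layer(3, 4, 0.1), make_layer(3, 3, -0.2), make_layer(2, 3, 0.3)]
y = forward([1.0, 2.0, 3.0, 4.0], layers)
print(len(y), y)
```

Each layer is a matrix-vector product followed by an element-wise function, which is the part that maps naturally onto GPU hardware.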

CONVOLUTIONAL NETWORKS

Yann LeCun et al., 1998

Local receptive field + weight sharing

“Gradient-Based Learning Applied to Document Recognition”, Proceedings of the IEEE 1998, http://yann.lecun.com/exdb/lenet/index.html

MNIST: 0.7% error rate
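Local receptive fields and weight sharing can be illustrated with a small 2D convolution in plain Python. This is a sketch under assumptions: the function name conv2d_valid, the 4x4 image, and the 2x2 edge kernel are made up for the example, and the operation is the unflipped sliding-window product (cross-correlation) commonly used in ConvNets.

```python
def conv2d_valid(image, kernel):
    """Slide one small kernel over the image (no padding, stride 1).

    Local receptive field: each output looks only at a kh x kw patch.
    Weight sharing: the same kernel weights are reused at every position.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
# 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
```

The layer has 4 weights regardless of image size. A fully-connected layer mapping the same 16 inputs to 9 outputs would need 144.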

High need for computational resources

Low ConvNet adoption rate until ~2010

TRAFFIC SIGN RECOGNITION

The German Traffic Sign Recognition Benchmark, 2011

GTSRB

http://benchmark.ini.rub.de/?section=gtsrb

Rank  Team                   Error rate  Model
1     IDSIA, Dan Ciresan     0.56%       CNNs, trained using GPUs
2     Human                  1.16%
3     NYU, Pierre Sermanet   1.69%       CNNs
4     CAOR, Fatin Zaklouta   3.86%       Random Forests

NATURAL IMAGE CLASSIFICATION

Alex Krizhevsky et al., 2012

1.2M training images, 1000 classes

Achieved a 15.3% top-5 error rate on the classification task; the second-best entry scored 26.2%

CNNs trained with GPUs

ImageNet

http://www.image-net.org/challenges/LSVRC/
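The top-5 error rate quoted above counts a prediction as wrong only if the true class is not among the five highest-scoring classes. A minimal sketch of that metric follows; the function name top_k_error and the toy scores are assumptions for illustration, not ImageNet data.

```python
def top_k_error(scores, true_labels, k=5):
    """Fraction of samples whose true class is not among the k best scores."""
    errors = 0
    for row, truth in zip(scores, true_labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if truth not in ranked[:k]:
            errors += 1
    return errors / len(true_labels)


# Toy scores for 3 samples over 7 classes (made-up numbers).
scores = [
    [0.10, 0.50, 0.20, 0.05, 0.05, 0.04, 0.06],  # true class 1 ranks first
    [0.30, 0.25, 0.20, 0.10, 0.08, 0.05, 0.02],  # true class 6 ranks last
    [0.40, 0.30, 0.10, 0.10, 0.05, 0.03, 0.02],  # true class 1 ranks second
]
true_labels = [1, 6, 1]

print(top_k_error(scores, true_labels, k=5))
print(top_k_error(scores, true_labels, k=1))
```

The third sample shows why top-5 error is lower than top-1 error: the true class is not the best guess, but it is among the top five.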

NATURAL IMAGE CLASSIFICATION

ImageNet: results for 2010-2014

[Chart: % teams using GPUs and top-5 error by year, 2010-2014. Data labels as extracted: 15%, 83%, 95%; 28%, 26%, 15%, 11%, 7%]

MODEL VISUALIZATION

Matthew D. Zeiler, Rob Fergus

Visualizing and Understanding Convolutional Networks, http://arxiv.org/abs/1311.2901

Critique by Christian Szegedy et al.: Intriguing properties of neural networks, http://arxiv.org/abs/1312.6199

[Figure panels: Layer 1, Layer 2, Layer 5]

TRANSFER LEARNING

Dogs vs. Cats, 2014

Train model on one dataset – ImageNet

Re-train the last layer only on a new dataset – Dogs and Cats

Dogs vs. Cats

https://www.kaggle.com/c/dogs-vs-cats

Rank  Team             Error rate  Model
1     Pierre Sermanet  1.1%        CNNs, model transferred from ImageNet
5     Maxim Milakov    1.9%        CNN, model trained on Dogs vs. Cats dataset only
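The recipe on this slide (keep the layers trained on the first dataset, re-train only the last layer on the new one) can be sketched as follows. Everything concrete here is an assumption for illustration: FROZEN_WEIGHTS stands in for the layers pretrained on the large dataset, the last layer is a logistic unit trained with plain gradient descent, and the four 2D samples are made up.

```python
import math

# Stand-in for layers pretrained on a large dataset; never updated below.
FROZEN_WEIGHTS = [[1.0, -1.0], [-1.0, 1.0]]


def frozen_features(x):
    """Pretrained layers: fixed linear map followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)))
            for row in FROZEN_WEIGHTS]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def retrain_last_layer(samples, labels, epochs=200, lr=0.5):
    """Fit only the final logistic layer on top of the frozen features."""
    feats = [frozen_features(x) for x in samples]
    w, b = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            grad = p - y  # gradient of the log loss w.r.t. the pre-activation
            w = [wi - lr * grad * fi for wi, fi in zip(w, f)]
            b -= lr * grad
    return w, b


def predict(w, b, x):
    f = frozen_features(x)
    return 1 if sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) >= 0.5 else 0


# Small "new dataset": label 1 when the first coordinate dominates.
samples = [[2.0, 0.5], [1.5, 0.0], [0.5, 2.0], [0.0, 1.0]]
labels = [1, 1, 0, 0]

w, b = retrain_last_layer(samples, labels)
print([predict(w, b, x) for x in samples])
```

Because only the last layer's few weights are trained, a small dataset can be enough, which is one plausible reading of why the transferred model ranks above the model trained on Dogs vs. Cats alone in the table.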

SPEECH RECOGNITION

The acoustic model is a DNN

Usually built from fully-connected layers

Some use convolutional layers with a spectrogram as input

Both are a good fit for GPUs

The language model is a weighted Finite State Transducer (wFST)

Beam search runs fast on GPUs

[Diagram: acoustic signal -> acoustic model -> likelihood of phonetic units -> language model -> most likely word sequence]

http://devblogs.nvidia.com/parallelforall/cuda-spotlight-gpu-accelerated-speech-recognition/

It is all about supercomputing, right?

PEDESTRIAN + GAZE DETECTION

Ikuro Sato, Hideki Niihara, R&D Group, Denso IT Laboratory, Inc.

Real-time pedestrian detection with depth, height, and body orientation estimations

http://www.youtube.com/watch?v=9Y7yzi_w8qo

Jetson TK1

http://on-demand.gputechconf.com/gtc/2014/presentations/S4621-deep-neural-networks-automotive-safety.pdf

How do we run DNNs on GPUs?

CUDNN

Library for DNN toolkit developers and researchers

Contains building blocks for DNN toolkits

Convolutions, pooling, activation functions, etc.

Best performance, easiest to deploy, future-proofing

Jetson TK1 support coming soon!

developer.nvidia.com/cuDNN

cuBLAS (SGEMM for fully-connected layers) is part of the CUDA Toolkit, developer.nvidia.com/cuda-toolkit

cuDNN (and cuBLAS)
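SGEMM is the single-precision general matrix multiply, and a fully-connected layer applied to a batch of samples is exactly one such product. The sketch below shows the arithmetic in plain Python; it does not call cuBLAS or cuDNN, and the function name gemm and the small matrices are assumptions for illustration.

```python
def gemm(a, b):
    """Naive matrix product C = A x B for row-major lists of lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


# Batch of 2 samples with 3 features each (one sample per row).
batch = [[1.0, 2.0, 3.0],
         [0.0, 1.0, 0.0]]
# Layer weights mapping 3 inputs to 2 outputs (one output unit per column).
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [1.0, 1.0]]

# One matrix product evaluates the layer for the whole batch at once.
outputs = gemm(batch, weights)
print(outputs)
```

Batching turns many small matrix-vector products into one large matrix-matrix product, which is the kind of workload a GPU GEMM routine is built for.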

CUDNN

cuDNN is already integrated in major open-source frameworks

Caffe - caffe.berkeleyvision.org

Torch - torch.ch

Theano - deeplearning.net/software/theano/index.html, already has GPU support, cuDNN support coming soon!

Frameworks

