Date post: 18-Apr-2018
Deep Learning: An Overview Bradley J Erickson, MD PhD Mayo Clinic, Rochester Medical Imaging Informatics and Teleradiology Conference 1:30-2:05pm June 17, 2016
Transcript
Page 1: Deep Learning: An Overviewfhs.mcmaster.ca/conted/documents/miit16/6. Deep Learning - Brad... · Deep Learning: An Overview Bradley J Erickson, ... AutoEncoder for Segmentation •Dataset

Deep Learning: An Overview

Bradley J Erickson, MD PhD

Mayo Clinic, Rochester

Medical Imaging Informatics and Teleradiology Conference, 1:30-2:05pm, June 17, 2016

Disclosures

• Relationships with commercial interests:

– Board of OneMedNet

– Board of VoiceIT

What is “Machine Learning”?

• It is a part of Artificial Intelligence

• Finds patterns in data

– Patterns that reflect properties of examples (supervised)

– Patterns that separate examples (unsupervised)

• (Other types of artificial intelligence include rules systems)

Machine Learning Classes

• Supervised

– ANN

– SVM

– Random Forest

– Bayes

– DNN

• Unsupervised

– Clusters

– Adaptive Resonance

• Reinforced

Machine Learning History

• Artificial Neural Networks (ANN)

– Starting point of machine learning

– Early versions didn’t work well

• Other Machine Learning Methods

– Naïve Bayes

– Support Vector Machine (SVM)

– Random Forest Classifier (RFC)

Artificial Neural Network/Perceptron

[Figure: network diagram with an Input Layer (T1 Pre, T1 Post, T2), a Hidden Layer, and an Output Layer (Tumor, Brain); each of the seven hidden and output nodes computes f(Σ)]

Artificial Neural Network/Perceptron

[Figure: the same network with example values 45, 322, 128 on the input nodes (T1 Pre, T1 Post, T2); the hidden and output nodes still show f(Σ)]

Artificial Neural Network/Perceptron

[Figure: the same network with inputs 45, 322, 128; the five hidden nodes now show the computed values 34, 57, 418, -68, 312; the two output nodes still show f(Σ)]

Artificial Neural Network/Perceptron

[Figure: the same network with inputs 45, 322, 128 and hidden values 34, 57, 418, -68, 312; the output nodes (Tumor, Brain) show 1 and 0]

How ANNs Learn

• Propagation

– Multiply each prior-layer node value by its weight, then sum

– Activation function. E.g. threshold the sum

• Weight Update

– Compute error = actual output – expected output

– Weight gradient = error * input value

– New weight = old weight – gradient * learning rate
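The propagation and weight-update steps can be sketched for a single node. This is a minimal illustration, not the speaker's implementation: it uses the standard delta rule, in which the gradient times the learning rate is subtracted from the old weight, and the toy data (logical OR with a constant bias input), function names, and learning rate are all assumptions.

```python
def forward(weights, inputs):
    """Propagation: weighted sum of the inputs, then a threshold activation."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 if total > 0 else 0.0


def update(weights, inputs, expected, learning_rate):
    """Weight update: error = actual - expected, gradient = error * input."""
    error = forward(weights, inputs) - expected
    return [w - learning_rate * error * x for w, x in zip(weights, inputs)]


def train(samples, n_inputs, learning_rate=0.1, epochs=20):
    """Repeatedly apply the update rule over all samples."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for inputs, expected in samples:
            weights = update(weights, inputs, expected, learning_rate)
    return weights


# Hypothetical toy data: logical OR; the first input is a constant bias of 1.0.
OR_SAMPLES = [
    ([1.0, 0.0, 0.0], 0.0),
    ([1.0, 0.0, 1.0], 1.0),
    ([1.0, 1.0, 0.0], 1.0),
    ([1.0, 1.0, 1.0], 1.0),
]

trained = train(OR_SAMPLES, n_inputs=3)
predictions = [forward(trained, inputs) for inputs, _ in OR_SAMPLES]
print(predictions)
```

A multi-layer network applies the same idea layer by layer, passing the error backwards to compute each weight's gradient.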

Learning = Optimization Problem

• Learning depends on:

– ‘Correct’ gradient directions

– ‘Correct’ gradient multiplier (learning rate)

[Figure: error curve with labeled regions: Global Minimum, Local Minimum, Small Gradient]
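A small sketch of why the starting point and learning rate matter. The one-dimensional loss function below is a made-up example (not from the talk) chosen because it has one local and one global minimum; the step count and learning rate are also assumptions.

```python
def loss(x):
    """Hypothetical loss with a local minimum near x = 1.13
    and a global minimum near x = -1.30."""
    return x**4 - 3 * x**2 + x


def gradient(x):
    """Derivative of the loss above."""
    return 4 * x**3 - 6 * x + 1


def gradient_descent(start, learning_rate, steps=2000):
    """Repeatedly step against the gradient, scaled by the learning rate."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


# Same rule, different starting points, different answers.
from_right = gradient_descent(start=2.0, learning_rate=0.01)
from_left = gradient_descent(start=-2.0, learning_rate=0.01)
print(from_right, loss(from_right))  # settles in the local minimum
print(from_left, loss(from_left))    # settles in the global minimum
```

Starting on the right, the search stops in the local minimum even though a lower point exists; a much larger learning rate could jump out of it but may also fail to settle at all.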

Support Vector Machines

• Maps input data to new ‘space’

• Creates hyperplane that separates classes in that space

f(x)
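The idea of mapping data to a new space can be illustrated with a toy example. This sketch is not an SVM trainer: the mapping f(x) = (x, x²) and the separating threshold are picked by hand, and the data points are hypothetical. It only shows that classes which cannot be split by a single cut on the original line become separable by a straight boundary after the mapping.

```python
def feature_map(x):
    """Map a 1-D input to a 2-D space: f(x) = (x, x^2)."""
    return (x, x * x)


def classify(x, threshold=1.0):
    """Separate classes with the hyperplane x^2 = threshold in the new space."""
    _, squared = feature_map(x)
    return 1 if squared > threshold else 0


# Hypothetical data: class 1 sits on both sides of class 0 on the original axis.
samples = [(-2.0, 1), (-0.5, 0), (0.0, 0), (0.5, 0), (2.0, 1)]
predictions = [classify(x) for x, _ in samples]
print(predictions)
```

A real SVM would choose the boundary automatically, placing it to maximize the margin between the classes.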

Deep Learning: Why the Hype?

Performance in ImageNet Challenge

Team / Software                      Year   Error Rate
XRCE (not Deep Learning)             2011   25.8%
SuperVision (AlexNet)                2012   16.4%
Clarifai                             2013   11.7%
GoogLeNet (Inception)                2014   6.66%
Andrej Karpathy (human comparison)   2014   5.1%
BN-Inception (Arxiv)                 2015   4.9%
Inception-v3 (Arxiv)                 2015   3.46%

What is “Deep Learning”?

• “Deep” because it uses many layers

– ANN typically had 3 or fewer layers

DNNs have 15+ layers

Types of DNNs

• Convolutional Neural Network (CNN)

– Early layers have ‘windows’ of image as input

– Multiplied by a ‘kernel’ to get output

– Known as a convolution

[Figure: 5×5 image and 3×3 kernel, values in slide order]

Image:
22 14 18 44 21
13 27 64 55 32
 0 28 89 32 15
31 43 65 41 33
71 21 32  4  7

Kernel:
1 2 1
2 4 2
1 2 1

Types of DNNs

• Convolutional Neural Network (CNN)

[Figure: the first 3×3 window of the image multiplied element by element by the kernel]

Window:       Kernel:     Products:
22 14 18      1 2 1       22  28  18
13 27 64   *  2 4 2   =   26 108 128
 0 28 89      1 2 1        0  56  89

Sum of products = 475; 475 / 9 ≈ 53

Types of DNNs

• Convolutional Neural Network (CNN)

[Figure: the next 3×3 window of the image multiplied element by element by the kernel]

Window:       Kernel:     Products:
13 27 64      1 2 1       13  54  64
 0 28 89   *  2 4 2   =    0 112 178
31 43 65      1 2 1       31  86  65

Sum of products = 603; 603 / 9 = 67

Outputs so far: 53, 67
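The sliding-window arithmetic on these slides can be reproduced in a few lines. This is a sketch under stated assumptions: the 25 image values are laid out row by row in slide order, the kernel is the symmetric 1-2-1 / 2-4-2 / 1-2-1 pattern implied by the products shown, and the division by 9 simply follows the slide. Because the kernel is symmetric, a transposed image layout would give the transposed output.

```python
# Image and kernel values taken from the slide; the grid orientation is assumed.
IMAGE = [
    [22, 14, 18, 44, 21],
    [13, 27, 64, 55, 32],
    [0, 28, 89, 32, 15],
    [31, 43, 65, 41, 33],
    [71, 21, 32, 4, 7],
]
KERNEL = [
    [1, 2, 1],
    [2, 4, 2],
    [1, 2, 1],
]


def convolve(image, kernel, divisor):
    """Slide the kernel over every window, multiply element-wise, sum, scale.

    No kernel flip is applied; for this symmetric kernel the result is the
    same either way.
    """
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    output = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            out_row.append(round(total / divisor))
        output.append(out_row)
    return output


feature_map = convolve(IMAGE, KERNEL, divisor=9)
print(feature_map[0][0])  # first window: 475 / 9, rounds to 53
print(feature_map[1][0])  # window shifted by one: 603 / 9 = 67
```

A 5×5 image and a 3×3 kernel give a 3×3 output, since the window fits in three positions along each axis.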

Why the Excitement Now? Advances That Addressed Problems

• Many layers -> Overfitting

– Implement sparsity in weights: Dropout
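A minimal sketch of dropout, assuming the common "inverted dropout" variant in which node outputs (rather than the weights themselves) are randomly zeroed during training and the survivors are rescaled. The activation values, drop probability, and seed are illustrative only.

```python
import random


def dropout(activations, drop_probability, rng):
    """Zero each activation with the given probability (must be < 1);
    scale the kept ones so the expected sum is unchanged."""
    keep = 1.0 - drop_probability
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
hidden = [34.0, 57.0, 418.0, -68.0, 312.0]
dropped = dropout(hidden, 0.5, rng)
print(dropped)
```

At test time dropout is switched off, so every node contributes; the rescaling during training keeps the two regimes on the same scale.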

Why the Excitement Now? Advances That Addressed Problems

• Many layers -> Vanishing Gradients

– Dropout partially addresses this

– Can use ‘pre-trained’ weights for the early layers and freeze them, training only the later layers to learn higher-level features

Typical CNNs

Convolution Pooling Pooling Convolution Pooling Fully Connected

Typical CNNs

Andrej Karpathy: http://karpathy.github.io/2015/10/25/selfie/

Why the Excitement Now? Batch Normalization

• What should be the initial set of weights connecting nodes?

– All the same = no gradients

– Random. But what range of values?

• BatchNorm:

– After each Convolutional layer

– Subtract mean / divide by standard deviation

• Simple but effective
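The "subtract mean / divide by standard deviation" step can be sketched as follows. This is a simplified illustration: the small epsilon guarding against division by zero is an assumption, the learnable scale and shift that batch normalization layers normally add afterwards are omitted, and the input values are hypothetical.

```python
import math


def batch_norm(values, eps=1e-5):
    """Normalize a batch of values to zero mean and (nearly) unit variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance + eps)
    return [(v - mean) / std for v in values]


# Hypothetical outputs of one node across a batch of five examples.
batch = [34.0, 57.0, 418.0, -68.0, 312.0]
normalized = batch_norm(batch)
print(normalized)
```

After normalization the next layer sees inputs on a consistent scale regardless of how the earlier weights were initialized.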

Why the Excitement Now? Residual Networks

*Targ, ICLR 2016

• Residual connections define if and how data is passed through from layer to layer.

• Makes deep network construction reliable
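A residual connection in its simplest form adds a layer's input back to its output, so the layer only has to learn a correction. The sketch below assumes this basic "output = input + layer(input)" form; the two stand-in layers are hypothetical placeholders for real trained layers.

```python
def residual_block(x, layer):
    """Pass the input straight through and add the layer's output to it."""
    return [xi + fi for xi, fi in zip(x, layer(x))]


def zero_layer(x):
    """Hypothetical layer that has learned nothing yet."""
    return [0.0 for _ in x]


def halve(x):
    """Hypothetical layer that outputs half of its input."""
    return [0.5 * v for v in x]


print(residual_block([1.0, 2.0], zero_layer))  # input passes through unchanged
print(residual_block([2.0, 4.0], halve))
```

Because an untrained block behaves like the identity, stacking many of them does not block the signal, which is one reason very deep networks become trainable.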

Why the Excitement Now?

• Deep Neural Network Theory

• Exponential Compute Power Growth

Moore’s Law

Computing performance doubles approximately every 18 months

Exponentials In Real Life

• If you put 1 drop of water into a football stadium, and then double the number of drops each minute:

– At 5 minutes, you will have 32 drops

– At 45 minutes, the water will cover the field to a depth of 1 inch

– At 55 minutes, the stadium will be full

• It is not natural for humans to grasp exponential growth
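The doubling arithmetic is easy to check. This sketch only computes growth factors; it does not model drop or stadium volumes, and the helper name is made up for illustration.

```python
def growth_factor(elapsed, doubling_time):
    """How many times larger a quantity is after `elapsed` time units
    if it doubles once every `doubling_time` units."""
    return 2 ** (elapsed / doubling_time)


drops_at_5_minutes = growth_factor(5, 1)                            # 2**5
growth_45_to_55 = growth_factor(55, 1) / growth_factor(45, 1)       # 2**10
moore_15_years = growth_factor(15 * 12, 18)  # doubling every 18 months

print(drops_at_5_minutes)  # 32.0
print(growth_45_to_55)     # 1024.0
print(moore_15_years)      # 1024.0
```

The last ten minutes of the stadium example multiply the water by about a thousand, the same factor that 15 years of 18-month doublings gives for computing performance.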

Deep Learning Works Well on GPUs

• Naturally parallel

• Lower precision (single-precision floating point) can actually be an advantage

• Vendors are now building cards with no video output, optimized for Deep Learning (e.g. NVIDIA Tesla P100)

GPUs are Beating Moore’s Law

[Figure: chart with a logarithmic vertical axis (10 to 1,000,000) and a horizontal axis running from ‘Ice Age’ through 2000, 2005, 2010, 2015, 2020; curves labeled CPU, GPU, FPGA, TPU]

Deep Learning Myths

• “You Need Millions of Exams to Train and Use Deep Learning Methods”

Ways To Avoid Need For Large Data Sets

• Data Augmentation

– Essentially, creating variants of the data that are different enough to be worth learning from

– Yet similar enough that the teaching point is kept

– Mirror/Flip/Rotate/Contrast/Crop
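A few of the listed augmentations can be sketched on an image stored as a list of rows. This is a plain-Python illustration with hypothetical helper names; real pipelines typically use an imaging library, and contrast changes are omitted here.

```python
def mirror(image):
    """Left-right reflection."""
    return [row[::-1] for row in image]


def flip(image):
    """Top-bottom reflection."""
    return [row[:] for row in image[::-1]]


def rotate90(image):
    """Rotate 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def crop(image, top, left, height, width):
    """Cut out a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def augment(image):
    """Return the original plus three variants carrying the same label."""
    return [image, mirror(image), flip(image), rotate90(image)]


tiny = [[1, 2],
        [3, 4]]
for variant in augment(tiny):
    print(variant)
```

Each variant keeps the original label, so one annotated exam yields several training examples.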

Ways To Avoid Need For Large Data Sets

• Data Augmentation

• Transfer Learning

[Figure: CNN block diagram with blocks labeled Image, Conv (6 blocks), Max Pool (3 blocks), Fully Connected (3 blocks), and Soft Max]

Train on Large Corpus like ImageNet

Ways To Avoid Need For Large Data Sets

• Data Augmentation

• Transfer Learning

[Figure: the same CNN block diagram (Image, Conv, Max Pool, Fully Connected, Soft Max blocks)]

Freeze These Layers

Ways To Avoid Need For Large Data Sets

• Data Augmentation

• Transfer Learning

[Figure: the same CNN block diagram (Image, Conv, Max Pool, Fully Connected, Soft Max blocks), with one group of layers marked for each label below]

Freeze These Layers

Train this
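The freeze-then-train idea can be sketched with a toy model in which each layer holds a single weight. Everything here is hypothetical: the layer names, weights, and gradients are made up, and the choice to freeze the convolutional layers while training the final layer is an assumption about which groups the figure marks.

```python
def freeze(layers, frozen_names):
    """Mark the named layers as fixed; all others remain trainable."""
    for layer in layers:
        layer["trainable"] = layer["name"] not in frozen_names


def sgd_step(layers, gradients, learning_rate):
    """Apply one gradient step, skipping frozen layers."""
    for layer in layers:
        if layer["trainable"]:
            layer["weight"] -= learning_rate * gradients[layer["name"]]


# Toy stand-in for a network pre-trained on a large corpus.
model = [
    {"name": "conv1", "weight": 0.8, "trainable": True},
    {"name": "conv2", "weight": -0.3, "trainable": True},
    {"name": "fc", "weight": 0.1, "trainable": True},
]

freeze(model, {"conv1", "conv2"})
sgd_step(model, {"conv1": 1.0, "conv2": 1.0, "fc": 1.0}, learning_rate=0.1)
print([(layer["name"], layer["weight"]) for layer in model])
```

Only the unfrozen layer moves, so far fewer weights need to be learned from the small medical data set.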

Take Home Point

• Deep Learning Learns Features and Connections vs Just Connections

[Figure: two pipelines]

Hand-Crafted Feature Extraction -> Classifier

Learning Feature Extractor -> Classifier

Examples of CNN in Medical Imaging: Body Part

*Roth, Arxiv 2016

Examples of CNN in Medical Imaging: Segmentation

Moeskops, IEEE-TMI, 2016

Mayo: AutoEncoder for Segmentation

• Dataset

– Trained on BRATS 2015

– FLAIR enhancing signal

• Preprocessing

– N4 bias correction

– Nyúl intensity normalization

• Autoencoders trained on 110,000 ROIs (size=12)

• Time: 1 hour for 155 slices (DNN would be days or weeks)

Korfiatis, Submitted

What is an AutoEncoder?

Korfiatis, Submitted

Dice = 0.92 over BRATS dataset

Korfiatis, Submitted
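For reference, the Dice score measures the overlap between a predicted segmentation and the reference segmentation: twice the shared area divided by the sum of the two areas. The sketch below uses flat binary masks with made-up values; it is not the evaluation code behind the reported 0.92.

```python
def dice(mask_a, mask_b):
    """Dice coefficient of two equal-length binary masks (1 = inside region)."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Two empty masks agree perfectly by convention.
    return 2.0 * intersection / total if total else 1.0


predicted = [1, 1, 0, 0]
reference = [1, 0, 0, 0]
print(dice(predicted, reference))  # 2 * 1 / (2 + 1)
```

A score of 1.0 means the two masks are identical and 0.0 means they do not overlap at all.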

Machine Learning & Radiomics

• Computers find textures reflecting genomics: 1p19q

• 85 Subjects with FISH results, computed multiple textures, SVM

Method        # Features   Sens   Spec   F-score   Accuracy
SVM           10           0.91   0.87   0.93      0.91
Abstract      10           0.95   0.93   0.96      0.95
Naïve Bayes   12           0.95   0.77   0.92      0.89

Erickson, Proc ASNR, 2016
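The column headings in the table follow the usual definitions, sketched below from the four cells of a confusion matrix. The counts are hypothetical and are not the study's data; the F-score is assumed to be the standard F1 (harmonic mean of precision and sensitivity).

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, F-score and accuracy from raw counts."""
    sensitivity = tp / (tp + fn)          # fraction of true positives found
    specificity = tn / (tn + fp)          # fraction of true negatives found
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sens": sensitivity,
        "spec": specificity,
        "f_score": f_score,
        "accuracy": accuracy,
    }


# Hypothetical counts for 100 subjects.
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
print(metrics)
```

Reporting all four together matters because a classifier can score well on one (for example sensitivity) while doing poorly on another (specificity), as the Naïve Bayes row shows.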

Machine Learning & Radiomics

• 155 Subjects, GBM, MGMT Methylation

• Compute textures (T2 was best) -> SVM

Korfiatis, Med Phys, 2016

Deep Learning: MGMT Methylation

• Same set of patients, using VGGNet with transfer learning (Xfer): Az (area under the ROC curve) = 0.86

• Autoencoder is giving nearly as good performance and trains about 10x faster

• Now testing DeepMedic and RNN

Korfiatis, unpublished

The Pace of Change

Will Computers Replace Radiologists?

• Deep Learning will likely be able to create reports for diagnostic images in the future.

– 5 years: Mammo & CXR

– 10 years: CT Head, Chest, Abd, Pelvis, MR head, knee, shoulder, US: liver, thyroid, carotids

– 15-20 years: most diagnostic imaging

• Will likely ‘see’ more than we do today

• Will allow radiologists to focus on patient interaction and invasive procedures

How Might Medicine Best Embrace Deep Learning

• Algorithms for Machine Learning are rapidly improving. CNNs are not the only game in town

• Hardware for Machine Learning is REALLY rapidly improving

• The amount of change in 20 years will be unbelievable

How Might Medicine Best Embrace Deep Learning

• Medicine needs to remain flexible about hardware and software

• The VALUE is in the data and metadata

• Physicians are OBLIGATED to make sure the data are properly handled.

– Improper interpretation of data will lead to bad implementations and poor patient care

– Non-cooperation is also counter-productive
