Fast Computation of Uncertainty in Deep Learning

Date post: 29-Sep-2020
Fast Computation of Uncertainty in Deep Learning. Mohammad Emtiyaz Khan, RIKEN Center for AI Project, Tokyo, Japan. Joint work with Wu Lin (UBC), Didrik Nielsen (RIKEN), Voot Tangkaratt (RIKEN), Yarin Gal (University of Oxford), Akash Srivastava (University of Edinburgh), and Zuozhu Liu (SUTD, Singapore).
Transcript
Page 1

Fast Computation of Uncertainty in Deep Learning

Mohammad Emtiyaz Khan, RIKEN Center for AI Project, Tokyo, Japan

Joint work with Wu Lin (UBC), Didrik Nielsen (RIKEN), Voot Tangkaratt (RIKEN), Yarin Gal (University of Oxford), Akash Srivastava (University of Edinburgh), Zuozhu Liu (SUTD, Singapore)

Page 2

Uncertainty

Quantifies the confidence in the prediction of a model, i.e., how much it does not know.

Page 3

Example: Which is a Better Fit?

[Figure: two candidate fits, Blue (57%) and Red (43%), to earthquake data. x-axis: Magnitude of Earthquake; y-axis: Frequency.]

Real data from Tohoku (Japan). Example taken from Nate Silver's book "The Signal and the Noise".

Page 4

Example: Which is a Better Fit?

[Figure: x-axis: Magnitude of Earthquake; y-axis: Frequency.]

Uncertainty matters when the data are scarce and noisy, e.g., in medicine and robotics.

Page 5

Outline of the Talk

• Uncertainty is important
  – E.g., when data are scarce, missing, unreliable, etc.

• Uncertainty computation is difficult
  – Due to the large models and datasets used in deep learning

• This talk: fast computation of uncertainty
  – Bayesian deep learning
  – Methods that are extremely easy to implement

Page 6

Uncertainty in Deep Learning

Why is it difficult to estimate?

Page 7

A Naïve Method

Generate the parameters $\theta$ from the prior distribution:

$$\theta \sim p(\theta)$$

Likelihood of the data $D$ under a neural network $f_\theta$ with input $x_i$ and output $y_i$:

$$p(D \mid \theta) = \prod_{i=1}^{N} p(y_i \mid f_\theta(x_i))$$
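
As a minimal sketch of the two ingredients on this slide, the code below draws parameters from a standard normal prior and evaluates log p(D|θ) as a sum over data points. The Bernoulli likelihood, the linear "network" f_θ(x) = θᵀx, and the synthetic data are assumptions made only for illustration; none of them come from the talk.

```python
import numpy as np

def log_likelihood(theta, X, y):
    """log p(D|theta) = sum_i log p(y_i | f_theta(x_i)) for a Bernoulli likelihood.

    Assumption for illustration: f_theta(x) = theta @ x (a linear model).
    """
    logits = X @ theta
    # log sigmoid(z) = -log(1 + exp(-z)), computed stably with logaddexp.
    log_p1 = -np.logaddexp(0.0, -logits)   # log p(y = 1 | x)
    log_p0 = -np.logaddexp(0.0, logits)    # log p(y = 0 | x)
    return np.sum(y * log_p1 + (1 - y) * log_p0)

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 2))           # hypothetical inputs
y = (X[:, 0] > 0).astype(float)            # hypothetical labels

# Draw a few parameter vectors from the prior p(theta) = N(0, I) and score them.
for _ in range(3):
    theta = rng.standard_normal(2)
    print(theta, log_likelihood(theta, X, y))
```

Most prior draws explain the data poorly, which is one way to see why sampling from the prior alone is a weak strategy for large models.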

Page 8

Bayesian Inference

Bayes' rule gives the posterior distribution; the integral in the denominator is intractable:

$$p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{\int p(D \mid \theta)\, p(\theta)\, d\theta}$$

[Figure: panels labeled "Narrow" and "Wide".]
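
To make the normalizing integral concrete, here is a small sketch that applies Bayes' rule on a grid for a model with a single parameter. The one-parameter logistic model, the N(0, 1) prior, the synthetic data, and all variable names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(15)                        # hypothetical 1-D inputs
p_true = 1.0 / (1.0 + np.exp(-2.0 * x))            # labels generated with theta = 2
y = (rng.random(15) < p_true).astype(float)

grid = np.linspace(-10.0, 10.0, 2001)              # candidate values of theta
step = grid[1] - grid[0]

# log p(D|theta) at every grid value (Bernoulli likelihood, f_theta(x) = theta * x).
logits = grid[:, None] * x[None, :]
log_lik = -np.sum(y * np.logaddexp(0.0, -logits)
                  + (1 - y) * np.logaddexp(0.0, logits), axis=1)
log_prior = -0.5 * grid**2                         # N(0, 1) prior, up to a constant

# Bayes' rule: normalise p(D|theta) p(theta) with a Riemann sum over the grid.
log_joint = log_lik + log_prior
unnorm = np.exp(log_joint - log_joint.max())       # subtract the max for stability
posterior = unnorm / (unnorm.sum() * step)

post_mean = np.sum(grid * posterior) * step
post_std = np.sqrt(np.sum((grid - post_mean) ** 2 * posterior) * step)
print(post_mean, post_std)
```

The same grid resolution over d parameters would need 2001^d evaluations, which is why this direct approach does not scale to neural networks.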

Page 9

Approximate Bayesian Inference

Variational inference: approximate the posterior by a Gaussian distribution with mean $\mu$ and variance $\sigma^2$:

$$\min_{\mu, \sigma^2} \; D\left[\, \mathcal{N}(\theta \mid \mu, \sigma^2) \,\|\, p(\theta \mid D) \,\right]$$

Optimize using gradient methods (SGD/Adam)
  – Bayes by Backprop (Blundell et al. 2015), Practical VI (Graves 2011), Black-box VI (Ranganath et al. 2014), and many more.

These methods are computation- and memory-intensive, and require substantial implementation effort.
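
The objective above involves the intractable posterior, so it helps to see why it can still be optimized. Assuming D denotes the KL divergence (the slide leaves this implicit) and writing q(θ) for the Gaussian N(θ|μ, σ²), expanding the posterior with Bayes' rule gives the standard identity:

```latex
\begin{aligned}
\mathrm{KL}\big[q(\theta)\,\|\,p(\theta \mid D)\big]
  &= \mathbb{E}_{q}[\log q(\theta)] - \mathbb{E}_{q}[\log p(D \mid \theta)]
     - \mathbb{E}_{q}[\log p(\theta)] + \log p(D) \\
  &= -\underbrace{\Big(\mathbb{E}_{q}[\log p(D \mid \theta)]
     - \mathrm{KL}\big[q(\theta)\,\|\,p(\theta)\big]\Big)}_{\text{evidence lower bound}}
     + \log p(D)
\end{aligned}
```

Since log p(D) does not depend on μ or σ², minimizing the divergence is equivalent to maximizing the bracketed term, which needs only the likelihood and the prior.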

Page 10

Fast Computation of (Approximate) Uncertainty

Approximate by a Gaussian distribution, and find it by "perturbing" the parameters during backpropagation.

Page 11

Fast Computation of Uncertainty

Model:

$$\prod_{i=1}^{N} p(y_i \mid f_\theta(x_i)), \qquad \theta \sim \mathcal{N}(\theta \mid 0, I)$$

Adaptive learning-rate method (e.g., Adam):

1. Select a minibatch
2. Compute gradient using backpropagation
3. Compute a scale vector to adapt the learning rate
4. Take a gradient step

$$\theta \leftarrow \theta + \text{learning rate} \times \frac{\text{gradient}}{\sqrt{\text{scale}} + 10^{-8}}$$
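
Steps 3 and 4 can be sketched in a few lines. The slide leaves the sign convention of the gradient implicit; this sketch assumes `grad` is the gradient of a loss (the negative log-likelihood) and therefore subtracts the step. The moving-average constant `beta`, the omission of momentum and bias correction, and the toy quadratic loss are further assumptions for illustration.

```python
import numpy as np

def adaptive_step(theta, scale, grad, lr=0.01, beta=0.999, delta=1e-8):
    """One adaptive learning-rate step (steps 3 and 4 above).

    Assumptions: `grad` is the minibatch gradient of the loss, so the step is
    subtracted; `scale` is a moving average of squared gradients; momentum and
    bias correction, which Adam also uses, are omitted for brevity.
    """
    scale = beta * scale + (1 - beta) * grad**2              # step 3
    theta = theta - lr * grad / (np.sqrt(scale) + delta)     # step 4
    return theta, scale

# Toy usage: minimise the hypothetical loss 0.5 * ||theta - target||^2.
target = np.array([1.0, -2.0])
theta, scale = np.zeros(2), np.zeros(2)
for _ in range(2000):
    grad = theta - target        # steps 1 and 2 would compute this from a minibatch
    theta, scale = adaptive_step(theta, scale, grad)
print(theta)
```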

Page 12

Fast Computation of Uncertainty

Model:

$$\prod_{i=1}^{N} p(y_i \mid f_\theta(x_i)), \qquad \theta \sim \mathcal{N}(\theta \mid 0, I)$$

Variational Adam (Vadam):

0. Sample $\epsilon$ from a standard normal distribution and perturb the parameters

$$\theta_{\text{temp}} \leftarrow \theta + \frac{\epsilon}{\sqrt{N \times \text{scale} + 1}}$$

1. Select a minibatch
2. Compute gradient using backpropagation (at $\theta_{\text{temp}}$)
3. Compute a scale vector to adapt the learning rate
4. Take a gradient step

$$\theta \leftarrow \theta + \text{learning rate} \times \frac{\text{gradient} + \theta / N}{\sqrt{\text{scale}} + 1/N}$$
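
The following is a simplified sketch of the Vadam steps, not the authors' implementation. It assumes that `grad_fn` returns a minibatch estimate of the gradient of the average negative log-likelihood (so the step is subtracted), that the scale vector is a moving average of squared gradients with constant `beta`, and that momentum and bias correction are omitted. The toy Gaussian model, in which every observation equals 3, is hypothetical.

```python
import numpy as np

def vadam_step(mu, scale, grad_fn, N, rng, lr=0.01, prior_prec=1.0, beta=0.999):
    """One simplified Vadam-style update of the Gaussian mean `mu`."""
    # Step 0: sample epsilon and perturb the mean with the current std.
    eps = rng.standard_normal(mu.shape)
    sigma = 1.0 / np.sqrt(N * scale + prior_prec)
    theta_temp = mu + sigma * eps
    # Steps 1 and 2: minibatch gradient evaluated at the perturbed parameters.
    grad = grad_fn(theta_temp)
    # Step 3: scale vector (moving average of squared gradients; an assumption).
    scale = beta * scale + (1 - beta) * grad**2
    # Step 4: gradient step that includes the contribution of the prior.
    mu = mu - lr * (grad + prior_prec * mu / N) / (np.sqrt(scale) + prior_prec / N)
    return mu, scale

def run_demo(steps=3000, N=100, seed=0):
    """Fit a 1-D Gaussian mean; the data set (all y_i = 3) is hypothetical."""
    rng = np.random.default_rng(seed)
    data = np.full(N, 3.0)

    def grad_fn(theta):
        batch = rng.choice(data, size=5)          # minibatch of 5 points
        return theta - batch.mean()               # gradient of 0.5 * (y - theta)^2

    mu, scale = np.zeros(1), np.zeros(1)
    for _ in range(steps):
        mu, scale = vadam_step(mu, scale, grad_fn, N, rng)
    sigma = 1.0 / np.sqrt(N * scale + 1.0)        # std of the Gaussian approximation
    return mu, sigma

mu, sigma = run_demo()
print("mean:", mu, "std:", sigma)
```

Compared with a plain adaptive step, the only additions are the noise injection in step 0 and the prior terms in step 4; the standard deviation is read off from the scale vector at no extra cost. Because the scale vector averages squared gradients, the resulting standard deviation is only a rough approximation.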

Page 13

Illustration: Classification

Logistic regression (30 data points, 2-dimensional input).

Data sampled from a Gaussian mixture with 2 components.
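
A data set of this shape can be generated as follows. The component means, the unit covariance, the labeling rule (label equals mixture component), and the function names are hypothetical; the talk does not specify them. The second function gives the logistic-regression loss and gradient that an optimizer would consume.

```python
import numpy as np

def make_toy_data(n=30, seed=0):
    """Draw n two-dimensional inputs from a 2-component Gaussian mixture.

    The component means and unit covariance are hypothetical; the class label
    is the index of the component each point was drawn from.
    """
    rng = np.random.default_rng(seed)
    means = np.array([[-1.5, -1.0], [1.5, 1.0]])
    y = rng.integers(0, 2, size=n)
    X = means[y] + rng.standard_normal((n, 2))
    return X, y.astype(float)

def nll_and_grad(theta, X, y):
    """Average negative log-likelihood of logistic regression and its gradient."""
    logits = X @ theta
    # -[y log s(z) + (1 - y) log(1 - s(z))] = log(1 + exp(z)) - y z
    nll = np.mean(np.logaddexp(0.0, logits) - y * logits)
    p = 1.0 / (1.0 + np.exp(-logits))
    grad = X.T @ (p - y) / len(y)
    return nll, grad

X, y = make_toy_data()
print(nll_and_grad(np.zeros(2), X, y))
```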

Page 14

Adam vs Vadam

For both algorithms: minibatch of 5, learning rate = 0.01, prior precision = 0.01.

[Figure: three panels labeled "Adam", "Vadam (mean)", and "Vadam (samples)".]
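
The difference between a prediction from the mean and a prediction from samples can be sketched as below, assuming a logistic-regression model and a diagonal Gaussian over the parameters. The numerical values of `mu`, `sigma`, and `x` are hypothetical and chosen only to make the effect visible.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_mean(x, mu):
    """Plug-in prediction that uses only the mean of the Gaussian."""
    return sigmoid(x @ mu)

def predict_samples(x, mu, sigma, n_samples=5000, seed=0):
    """Monte Carlo prediction averaged over theta ~ N(mu, diag(sigma^2))."""
    rng = np.random.default_rng(seed)
    thetas = mu + sigma * rng.standard_normal((n_samples, mu.size))
    return sigmoid(thetas @ x).mean()

mu = np.array([2.0, 0.0])        # hypothetical mean
sigma = np.array([2.0, 2.0])     # hypothetical standard deviations
x = np.array([1.0, 1.0])         # a test input

print(predict_mean(x, mu), predict_samples(x, mu, sigma))
```

Averaging over samples pulls the predicted probability toward 0.5 relative to the plug-in prediction, which is how parameter uncertainty shows up as less confident predictions.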

Page 15

Why does this work?

• This algorithm is obtained by replacing "gradients" by "natural gradients".
  – See our ICML 2018 paper.

• The scaling in natural gradient is related to the scaling in Newton's method.

• An approximation to the Hessian results in Adam.

• Some caveats: choose small minibatches; better results are obtained with VOGN.
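
For orientation, a natural-gradient update of the kind these bullets refer to can be written roughly as follows for a Gaussian prior with precision λ, where f is the average negative log-likelihood, ĝ is a minibatch gradient, and α, β are step sizes. This is a paraphrase in the notation of these slides, not a quotation, so the ICML 2018 paper should be treated as the authoritative statement.

```latex
\begin{aligned}
\theta_t &\sim \mathcal{N}(\mu_t, \sigma_t^2), \qquad \sigma_t^2 = \frac{1}{N s_t + \lambda} \\
s_{t+1} &= (1 - \beta)\, s_t + \beta\, \mathrm{diag}\big[\hat{\nabla}^2 f(\theta_t)\big] \\
\mu_{t+1} &= \mu_t - \alpha\, \frac{\hat{g}(\theta_t) + \lambda \mu_t / N}{s_{t+1} + \lambda / N}
\end{aligned}
```

The scale vector s plays the role of a diagonal Hessian, as in Newton's method. Replacing the Hessian diagonal by squared gradients and the denominator by its square root gives an Adam-like update.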

Page 16

Faster, Simpler, and More Robust

Regression on the Australian-Scale dataset using deep neural nets, for various minibatch sizes.

[Figure legend: Existing method (BBVI); Our method (Vadam); Our method (VOGN).]

Page 17

Faster, Simpler, and More Robust

Results on MNIST digit classification (for various values of the Gaussian prior precision parameter λ).

[Figure legend: Existing method (BBVI); Our method (Vadam).]

Page 18

Deep Reinforcement Learning

No exploration (SGD): reward = 2860

Exploration using Vadam: reward = 5264

Page 19

Reduce Overfitting with Vadam

Vadam shows consistent train-test performance, while Adam overfits when N is small.

BNN classification on the a1a–a9a datasets.

[Figure: panels with curves labeled "Adam Test", "Adam Train", and "Vadam Test and Train".]

Page 20

Avoiding Local Minima

An example taken from Casella and Robert's book.

Vadam reaches the flat minimum, but GD gets stuck at a local minimum.

Related ideas: optimization by smoothing, Gaussian homotopy/blurring, Entropy-SGLD, etc.
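
The link to optimization by smoothing can be illustrated with a short sketch: averaging an objective over Gaussian perturbations of its argument, which is what evaluating gradients at perturbed parameters does in expectation, can remove shallow local minima. The objective `f` below is a hypothetical example, not the one from Casella and Robert's book, and Gauss-Hermite quadrature stands in for the sampling used in practice.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def f(x):
    """Hypothetical non-convex objective with several local minima."""
    return 0.5 * x**2 + 2.0 * np.cos(3.0 * x)

def smoothed(x, sigma, degree=50):
    """E[f(x + sigma * eps)] for eps ~ N(0, 1), via Gauss-Hermite quadrature."""
    nodes, weights = hermegauss(degree)           # weight function exp(-z^2 / 2)
    values = f(x[:, None] + sigma * nodes[None, :])
    return values @ weights / np.sqrt(2.0 * np.pi)

def count_local_minima(values):
    """Count strict interior local minima of a sampled curve."""
    inner = values[1:-1]
    return int(np.sum((inner < values[:-2]) & (inner < values[2:])))

grid = np.linspace(-6.0, 6.0, 1201)
print(count_local_minima(f(grid)), count_local_minima(smoothed(grid, sigma=1.0)))
```

With a perturbation of standard deviation 1, the oscillating term is damped by a factor exp(-4.5), so the smoothed curve is convex and keeps a single minimum, while the original curve has several.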

Page 21

Summary

• Uncertainty is important, especially when the data is scarce, missing, unreliable, etc.

• We can obtain uncertainty cheaply with very little effort
  – Bayesian deep learning

• It works reasonably well on our benchmarks.

Page 22

Open Questions

• Quality of uncertainty estimates
  – Application to life science?
  – Check out the "Bayesian deep learning" workshop at NIPS 2018.

• Estimating various types of uncertainty
  – Model uncertainty vs data uncertainty
  – Applications play a big role here

• Is uncertainty in deep learning useful?
  – Multiple local minima make it difficult to establish
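
One common way to make the distinction between model uncertainty and data uncertainty concrete, offered here as standard background rather than as something stated in the talk, is the law of total variance applied to the predictive distribution for an input x:

```latex
\mathrm{Var}[y \mid x, D]
  = \underbrace{\mathbb{E}_{p(\theta \mid D)}\big[\mathrm{Var}[y \mid x, \theta]\big]}_{\text{data uncertainty}}
  + \underbrace{\mathrm{Var}_{p(\theta \mid D)}\big[\mathbb{E}[y \mid x, \theta]\big]}_{\text{model uncertainty}}
```

The first term remains even when the parameters are known exactly; the second shrinks as the posterior over θ concentrates.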

Page 23

References

https://emtiyaz.github.io

Page 24

Thanks!

https://emtiyaz.github.io
