Transcript
Page 1: Fast Computation of Uncertainty in Deep Learning · Yarin Gal (University of Oxford), Akash Srivastava (University of Edinburgh), Zuozhu Liu (SUTD, Singapore)

Fast Computation of Uncertainty in Deep Learning

Mohammad Emtiyaz Khan, RIKEN Center for AI Project, Tokyo, Japan

Joint work with Wu Lin (UBC), Didrik Nielsen (RIKEN), Voot Tangkaratt (RIKEN), Yarin Gal (University of Oxford), Akash Srivastava (University of Edinburgh), Zuozhu Liu (SUTD, Singapore)

Page 2

Uncertainty

Quantifies the confidence in the prediction of a model, i.e., how much it does not know.

Page 3

Example: Which is a Better Fit?

[Figure: frequency vs. magnitude of earthquake, with two candidate fits. Labels: Blue 57%, Red 43%.]

Real data from Tohoku (Japan). Example taken from Nate Silver's book "The Signal and the Noise".

Page 4

Example: Which is a Better Fit?

[Figure: frequency vs. magnitude of earthquake.]

When the data is scarce and noisy, e.g., in medicine and robotics.

Page 5

Outline of the Talk

• Uncertainty is important
  – E.g., when data are scarce, missing, unreliable, etc.

• Uncertainty computation is difficult
  – Due to the large models and datasets used in deep learning

• This talk: fast computation of uncertainty
  – Bayesian deep learning
  – Methods that are extremely easy to implement

Page 6

Uncertainty in Deep Learning

Why is it difficult to estimate it?

Page 7

A Naïve Method

$$p(\mathcal{D} \mid \theta) = \prod_{i=1}^{N} p(y_i \mid f_\theta(x_i))$$

Here $\mathcal{D}$ is the data, $\theta$ are the parameters, $f_\theta$ is the neural network, $x_i$ is the input and $y_i$ is the output.

Generate: $\theta \sim p(\theta)$ (prior distribution)
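A minimal Python sketch of the two formulas on this slide: draw parameters from the prior, then score the data with the product likelihood (computed as a sum of logs). Everything beyond the two formulas is an assumption made for illustration: the tiny network `f_theta`, the Bernoulli (sigmoid) output model, the standard normal prior, and the toy data are hypothetical and not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def f_theta(theta, x, hidden=3):
    """Hypothetical tiny network: 2 inputs -> `hidden` tanh units -> 1 logit."""
    w1 = theta[: 2 * hidden].reshape(2, hidden)
    w2 = theta[2 * hidden:]
    return np.tanh(x @ w1) @ w2

def log_likelihood(theta, x, y):
    """log p(D | theta) = sum_i log p(y_i | f_theta(x_i)) with Bernoulli outputs."""
    logits = f_theta(theta, x)
    # log sigmoid(s * z) = -log(1 + exp(-s * z)), with s = +1 for y=1 and -1 for y=0
    return np.sum(-np.logaddexp(0.0, -(2 * y - 1) * logits))

# Toy data (assumed): 30 points with 2-dimensional inputs and binary labels.
x = rng.normal(size=(30, 2))
y = (x[:, 0] + x[:, 1] > 0).astype(float)

# Draw parameters from the prior p(theta) = N(0, I) and score the data under each draw.
thetas = rng.normal(size=(5, 9))
scores = np.array([log_likelihood(t, x, y) for t in thetas])
print(scores)
```

Most prior draws explain the data poorly, which hints at why sampling from the prior alone is a weak way to find plausible parameters once the network is large.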

Page 8

Bayesian Inference

Bayes' rule:

$$p(\theta \mid \mathcal{D}) = \frac{p(\mathcal{D} \mid \theta)\, p(\theta)}{\int p(\mathcal{D} \mid \theta)\, p(\theta)\, d\theta}$$

The left-hand side is the posterior distribution; the denominator is an intractable integral.

[Figure: posterior distributions labelled "Narrow" and "Wide".]
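To illustrate Bayes' rule and the normalising integral, here is a small sketch for a model with a single parameter, where the integral can still be approximated on a grid. The one-weight logistic model, the N(0, 1) prior, the data generator `make_data`, and the grid range are all assumptions chosen for illustration. With d parameters the same grid would need 2001^d points, which is one way to see why the integral is out of reach for deep networks.

```python
import numpy as np

def grid_posterior(x, y, grid):
    """Posterior density of one logistic-regression weight, normalised on a grid."""
    logits = np.outer(grid, x)                                  # shape (len(grid), N)
    log_lik = -np.logaddexp(0.0, -(2 * y - 1) * logits).sum(axis=1)
    log_prior = -0.5 * grid**2                                  # N(0, 1) prior, up to a constant
    log_joint = log_lik + log_prior
    unnorm = np.exp(log_joint - log_joint.max())                # subtract the max for stability
    dtheta = grid[1] - grid[0]
    return unnorm / (unnorm.sum() * dtheta)                     # denominator approximates the integral

def posterior_std(post, grid):
    """Standard deviation of a density tabulated on a uniform grid."""
    dtheta = grid[1] - grid[0]
    mean = np.sum(grid * post) * dtheta
    return np.sqrt(np.sum((grid - mean) ** 2 * post) * dtheta)

rng = np.random.default_rng(0)
grid = np.linspace(-6.0, 6.0, 2001)

def make_data(n, true_theta=1.0):
    """Hypothetical data: labels drawn from a logistic model with weight `true_theta`."""
    x = rng.normal(size=n)
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_theta * x))).astype(float)
    return x, y

post_small = grid_posterior(*make_data(5), grid)     # few data points
post_large = grid_posterior(*make_data(200), grid)   # many data points
std_small = posterior_std(post_small, grid)
std_large = posterior_std(post_large, grid)
print(std_small, std_large)
```

In this toy setting the posterior from 200 points is much narrower than the one from 5 points, matching the "Narrow" and "Wide" labels in the figure.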

Page 9

Approximate Bayesian Inference

Variational inference: approximate the posterior by a Gaussian distribution

$$\min_{\mu, \sigma^2} \; \mathbb{D}\left[ \mathcal{N}(\theta \mid \mu, \sigma^2) \,\|\, p(\theta \mid \mathcal{D}) \right]$$

where $\mu$ is the mean and $\sigma^2$ is the variance.

Optimize using gradient methods (SGD/Adam)
  – Bayes by Backprop (Blundell et al. 2015), Practical VI (Graves et al. 2011), Black-box VI (Ranganath et al. 2014) and many more.

These methods are computation and memory intensive, and require substantial implementation effort.
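The objective above still contains the intractable posterior. A short derivation shows why it can be optimized anyway. It assumes that the divergence $\mathbb{D}$ on the slide is the Kullback-Leibler divergence (the usual choice in variational inference, though the slide does not say so), and it writes $q(\theta) = \mathcal{N}(\theta \mid \mu, \sigma^2)$:

```latex
\begin{aligned}
\mathrm{KL}\left[ q(\theta) \,\|\, p(\theta \mid \mathcal{D}) \right]
  &= \mathbb{E}_{q}\left[ \log q(\theta) - \log p(\theta \mid \mathcal{D}) \right] \\
  &= \mathbb{E}_{q}\left[ \log q(\theta) - \log p(\mathcal{D} \mid \theta) - \log p(\theta) \right]
     + \log p(\mathcal{D}) \\
  &= -\underbrace{\left( \mathbb{E}_{q}\left[ \log p(\mathcal{D} \mid \theta) \right]
     - \mathrm{KL}\left[ q(\theta) \,\|\, p(\theta) \right] \right)}_{\text{ELBO}(\mu, \sigma^2)}
     + \log p(\mathcal{D})
\end{aligned}
```

Since $\log p(\mathcal{D})$ does not depend on $\mu$ or $\sigma^2$, minimizing the divergence is the same as maximizing the bracketed term, which needs only the likelihood and the prior.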

Page 10

Fast Computation of (Approximate) Uncertainty

Approximate by a Gaussian distribution, and find it by "perturbing" the parameters during backpropagation.

Page 11

Fast Computation of Uncertainty

Model: likelihood $\prod_{i=1}^{N} p(y_i \mid f_\theta(x_i))$, prior $\theta \sim \mathcal{N}(\theta \mid 0, I)$

Adaptive learning-rate method (e.g., Adam)

1. Select a minibatch
2. Compute gradient using backpropagation
3. Compute a scale vector to adapt the learning rate
4. Take a gradient step

$$\theta \leftarrow \theta + \text{learning rate} \ast \frac{\text{gradient}}{\sqrt{\text{scale}} + 10^{-8}}$$
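Steps 3 and 4 can be sketched in a few lines of Python. This is a simplified, RMSprop-style version written for illustration, not the full Adam algorithm: momentum and bias correction are left out, and the function name `adaptive_step`, the moving-average constant `beta`, and the toy quadratic loss are assumptions. The sketch treats `grad` as the gradient of a loss to be minimized, so it subtracts the step; that sign convention is also an assumption.

```python
import numpy as np

def adaptive_step(theta, grad, scale, learning_rate=0.01, beta=0.999, delta=1e-8):
    """One adaptive step; `grad` is the gradient of the loss being minimized."""
    # Step 3: running average of squared gradients (the scale vector).
    scale = beta * scale + (1.0 - beta) * grad**2
    # Step 4: gradient step with a per-coordinate learning rate.
    theta = theta - learning_rate * grad / (np.sqrt(scale) + delta)
    return theta, scale

# Toy loss (assumed): 0.5 * sum(a * theta**2) with very different curvatures.
a = np.array([100.0, 0.01])
theta = np.array([1.0, 1.0])
scale = np.zeros(2)
for _ in range(200):
    grad = a * theta                      # exact gradient of the toy loss
    theta, scale = adaptive_step(theta, grad, scale)
print(theta)
```

Dividing by the square root of the scale makes the step nearly independent of the gradient's magnitude, so both coordinates shrink at a similar rate even though their curvatures differ by a factor of 10,000.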

Page 12

Fast Computation of Uncertainty

Variational Adam (Vadam)

Model: likelihood $\prod_{i=1}^{N} p(y_i \mid f_\theta(x_i))$, prior $\theta \sim \mathcal{N}(\theta \mid 0, I)$

0. Sample $\epsilon$ from a standard normal distribution

$$\theta_{\text{temp}} \leftarrow \theta + \frac{\epsilon}{\sqrt{N \ast \text{scale} + 1}}$$

1. Select a minibatch
2. Compute gradient using backpropagation
3. Compute a scale vector to adapt the learning rate
4. Take a gradient step

$$\theta \leftarrow \theta + \text{learning rate} \ast \frac{\text{gradient} + \theta / N}{\sqrt{\text{scale}} + 1/N}$$
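Steps 0 to 4 can be sketched for logistic regression as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it omits the momentum and bias-correction terms of Adam, evaluates the gradient at the perturbed parameters, and generalizes the "+1" and "1/N" terms (which come from the N(0, I) prior) to a `prior_precision` argument. The function name `vadam_step`, the moving-average constant `beta`, and the toy data are hypothetical. The sketch uses the gradient of the average negative log-likelihood, so it subtracts the step; that sign convention is an assumption too.

```python
import numpy as np

def vadam_step(mu, scale, x, y, n_total, rng,
               learning_rate=0.01, beta=0.999, prior_precision=1.0):
    """One simplified Vadam-style step for logistic regression (no momentum)."""
    lam = prior_precision
    # Step 0: perturb the mean with Gaussian noise of std 1 / sqrt(N * scale + lam).
    sigma = 1.0 / np.sqrt(n_total * scale + lam)
    theta_temp = mu + sigma * rng.standard_normal(mu.shape)
    # Steps 1-2: minibatch gradient of the average negative log-likelihood,
    # evaluated at the perturbed parameters.
    probs = 1.0 / (1.0 + np.exp(-(x @ theta_temp)))
    grad = x.T @ (probs - y) / len(y)
    # Step 3: running average of squared gradients (the scale vector).
    scale = beta * scale + (1.0 - beta) * grad**2
    # Step 4: step on the mean, with the prior contributing lam * mu / N.
    mu = mu - learning_rate * (grad + lam * mu / n_total) / (np.sqrt(scale) + lam / n_total)
    return mu, scale

# Toy data (assumed): two Gaussian clusters, 30 points, 2-dimensional inputs.
rng = np.random.default_rng(0)
n = 30
labels = (np.arange(n) < n // 2).astype(float)
centers = np.where(labels[:, None] == 1.0, 2.0, -2.0)
inputs = centers + rng.standard_normal((n, 2))

mu, scale = np.zeros(2), np.zeros(2)
for _ in range(2000):
    batch = rng.choice(n, size=5, replace=False)
    mu, scale = vadam_step(mu, scale, inputs[batch], labels[batch], n, rng)

sigma = 1.0 / np.sqrt(n * scale + 1.0)   # approximate posterior standard deviation
accuracy = np.mean((inputs @ mu > 0) == (labels == 1.0))
print(mu, sigma, accuracy)
```

The only changes relative to a plain adaptive step are the perturbation before the gradient and the extra prior terms in the update. The mean `mu` and the standard deviation `sigma` together describe the Gaussian approximation.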

Page 13

Illustration: Classification

Logistic regression (30 data points, 2-dimensional input). Sampled from a Gaussian mixture with 2 components.

Page 14

Adam vs Vadam

For both algorithms: minibatch of 5, learning_rate = 0.01, prior precision = 0.01.

[Figure panels: Adam; Vadam (mean); Vadam (samples).]
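One plausible reading of the "Vadam (mean)" and "Vadam (samples)" panels is that the first predicts with the posterior mean only, while the second uses weights sampled from the Gaussian approximation. The sketch below compares a plug-in prediction with a Monte Carlo average over samples under that reading. The values of `mu` and `sigma`, the test points, and the function names are hypothetical and chosen only to make the effect visible.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_mean(x, mu):
    """Plug-in prediction using only the posterior mean."""
    return sigmoid(x @ mu)

def predict_samples(x, mu, sigma, rng, n_samples=2000):
    """Monte Carlo average of predictions over weights drawn from N(mu, sigma^2)."""
    thetas = mu + sigma * rng.standard_normal((n_samples, mu.size))
    return sigmoid(x @ thetas.T).mean(axis=1)

rng = np.random.default_rng(0)
mu = np.array([1.5, 1.5])                  # hypothetical posterior mean
sigma = np.array([1.0, 1.0])               # hypothetical posterior standard deviations
x = np.array([[0.5, 0.5], [5.0, 5.0]])     # one point near the origin, one far away

p_mean = predict_mean(x, mu)
p_avg = predict_samples(x, mu, sigma, rng)
print(p_mean, p_avg)
```

Averaging over samples pulls the predicted probabilities toward 0.5, and the effect is strongest for the point far from the origin, where the plug-in prediction is almost certain.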

Page 15

Why does this work?

• This algorithm is obtained by replacing "gradients" by "natural gradients".
  – See our ICML 2018 paper.

• The scaling in natural gradient is related to the scaling in Newton's method.

• An approximation to the Hessian results in Adam.

• Some caveats: choose small minibatches; better results are obtained with VOGN.
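One way to read the caveat about small minibatches is to compare two candidate scale terms. The sketch assumes, as the method is usually described, that Adam-style updates square the minibatch-averaged gradient while VOGN averages the squared per-example gradients. The function names and the random "gradients" are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def vadam_scale_term(per_example_grads):
    """Square of the minibatch-averaged gradient (Adam-style)."""
    return per_example_grads.mean(axis=0) ** 2

def vogn_scale_term(per_example_grads):
    """Average of squared per-example gradients (Gauss-Newton style)."""
    return (per_example_grads ** 2).mean(axis=0)

results = {}
for m in (1, 5, 100):
    # Stand-in per-example gradients: m examples, 3 parameters.
    grads = rng.normal(loc=0.1, scale=1.0, size=(m, 3))
    results[m] = (vadam_scale_term(grads), vogn_scale_term(grads))
    print(m, results[m][0], results[m][1])
```

The two terms coincide for a minibatch of one. For larger minibatches the averaged-then-squared term can only be smaller (Jensen's inequality), so the Adam-style scale drifts away from the Gauss-Newton one as the minibatch grows.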

Page 16

Faster, Simpler, and More Robust

Regression on the Australian-Scale dataset using deep neural nets, for various minibatch sizes.

[Figure legend: Existing method (BBVI); Our method (Vadam); Our method (VOGN).]

Page 17

Faster, Simpler, and More Robust

Results on MNIST digit classification (for various values of the Gaussian prior precision parameter λ).

[Figure legend: Existing method (BBVI); Our method (Vadam).]

Page 18

Deep Reinforcement Learning

No exploration (SGD): reward = 2860

Exploration using Vadam: reward = 5264

Page 19

Reduce Overfitting with Vadam

Vadam shows consistent train-test performance, while Adam overfits when N is small.

BNN classification on the a1a-a9a datasets.

[Figure: curves labelled "Adam Test", "Adam Train", and "Vadam Test and train", repeated across panels.]

Page 20

Avoiding Local Minima

An example taken from Casella and Robert's book.

Vadam reaches the flat minimum, but GD gets stuck at a local minimum.

Related ideas: optimization by smoothing, Gaussian homotopy/blurring, Entropy SGLD, etc.
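The link to optimization by smoothing can be illustrated in one dimension: averaging a loss over Gaussian perturbations of the parameter blurs away sharp minima and favours flat ones. The loss `f`, its two minima, and the noise level below are hypothetical and are not the example from the book. The sketch only shows the smoothing effect, not Vadam itself.

```python
import numpy as np

def f(x):
    """Hypothetical 1-D loss: a sharp, deep minimum at -2 and a flat, shallower one at +2."""
    sharp = -1.2 * np.exp(-((x + 2.0) ** 2) / (2 * 0.05**2))
    flat = -1.0 * np.exp(-((x - 2.0) ** 2) / (2 * 1.0**2))
    return sharp + flat

def smoothed(func, mu, sigma, n_nodes=2001):
    """E[func(mu + sigma * eps)] for eps ~ N(0, 1), by quadrature on a fine grid."""
    eps = np.linspace(-5.0, 5.0, n_nodes)
    weights = np.exp(-0.5 * eps**2)
    weights /= weights.sum()
    return np.sum(weights * func(mu[:, None] + sigma * eps[None, :]), axis=1)

grid = np.linspace(-4.0, 4.0, 161)
raw_argmin = grid[np.argmin(f(grid))]
smooth_argmin = grid[np.argmin(smoothed(f, grid, sigma=0.5))]
print(raw_argmin, smooth_argmin)
```

The raw loss is lowest in the sharp well, while the smoothed loss is lowest in the flat one. A method that evaluates gradients at perturbed parameters is, in effect, following the smoothed loss.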

Page 21

Summary

• Uncertainty is important, especially when the data is scarce, missing, unreliable, etc.

• We can obtain uncertainty cheaply with very little effort
  – Bayesian deep learning

• It works reasonably well on our benchmarks.

Page 22

Open Questions

• Quality of uncertainty estimates
  – Application to life science?
  – Check out the "Bayesian deep learning" workshop at NIPS 2018.

• Estimating various types of uncertainty
  – Model uncertainty vs data uncertainty
  – Applications play a big role here

• Is uncertainty in deep learning useful?
  – Multiple local minima make it difficult to establish

Page 23

References

https://emtiyaz.github.io

Page 24

Thanks!

https://emtiyaz.github.io

