Deluceva: Delta-Based Neural Network Inference for Fast Video Analytics
Jingjing Wang and Magdalena Balazinska
University of Washington
• Large volume of images/videos with valuable information
• A large set of neural network models for images
  – Object classification, detection, …
• Next step: video analytics
  – Larger volume
  – Efficiency critical, live output
Video Analytics Using Neural Networks
[Figure: Deep Neural Network. Rich set of established image models.]
Key Observation: Temporal Redundancy
[Figure: frame - frame = delta image]
Process Deltas Instead of Full Frames
[Figure: Data Stream, Query, State; "Incremental Query Evaluation"]
• Delta in the data stream: "… pixel [x, y, z] has changed by (0, 0, 1) …"
• Delta in the query result: "… new boat discovered with bounding box B …"
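
To make the pixel-level delta concrete, here is a minimal sketch in plain Python. The frame layout (rows of RGB tuples), the `(row, col)` keys, and the function name `frame_delta` are assumptions made for illustration; they are not taken from the slides.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, ...]
Frame = List[List[Pixel]]                 # rows of (r, g, b) pixels
Delta = Dict[Tuple[int, int], Pixel]      # (row, col) -> per-channel change


def frame_delta(prev: Frame, curr: Frame) -> Delta:
    """Return only the pixels that changed between two frames."""
    delta: Delta = {}
    for y, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            change = tuple(ci - pi for pi, ci in zip(p, c))
            if any(change):               # skip pixels with no change
                delta[(y, x)] = change
    return delta


frame1 = [[(10, 10, 10), (20, 20, 20)],
          [(30, 30, 30), (40, 40, 40)]]
frame2 = [[(10, 10, 10), (20, 20, 21)],
          [(30, 30, 30), (40, 40, 40)]]

print(frame_delta(frame1, frame2))        # {(0, 1): (0, 0, 1)}
```

Only one of the four pixels appears in the result, which is the temporal redundancy that the rest of the talk exploits.
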
Delta-Based Inference for Videos
• Problem:
  – Input: a video stream, a reference model
  – Output: similar to the reference model's result
• Approach:
  – Accelerate model inference by performing less computation
Delta-Based Inference for Videos: Overview
• Modify neural network to take deltas as inputs
• Decide which deltas are significant enough to process
• Generate a network of mixed-type (dense or delta-based) operators
Neural Network with Sparse Deltas
[Figure: Model(original) applied to Frame 2 = Model(delta) applied to Frame 1 + Model(delta) applied to (Frame 2 - Frame 1)]
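
For a linear operator such as convolution, the identity in the figure holds exactly. The sketch below checks it on a 1-D example in plain Python. The helper names `conv1d` and `sparse_conv1d` and the toy data are assumptions made for illustration; this is not the Deluceva implementation.

```python
def conv1d(signal, kernel):
    """'Valid' 1-D cross-correlation, standing in for a convolution layer."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def sparse_conv1d(delta, length, kernel):
    """Same operator, but visiting only the nonzero entries of the delta.

    delta maps an input index to the amount that input changed.
    """
    k = len(kernel)
    out = [0] * (length - k + 1)
    for idx, value in delta.items():
        for j in range(k):
            i = idx - j                   # output position touched by this input
            if 0 <= i < len(out):
                out[i] += value * kernel[j]
    return out


frame1 = [1, 2, 3, 4, 5, 6]
frame2 = [1, 2, 3, 9, 5, 6]               # one value changed
kernel = [1, 0, -1]
delta = {i: b - a for i, (a, b) in enumerate(zip(frame1, frame2)) if b != a}

dense = conv1d(frame2, kernel)
incremental = [a + b for a, b in zip(conv1d(frame1, kernel),
                                     sparse_conv1d(delta, len(frame1), kernel))]
assert dense == incremental
```

Nonlinear operators such as ReLU or max pooling do not satisfy this identity directly, so a delta version of them would need to keep additional state. That part is not shown here.
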
Example Neural Network Model
[Figure: Model(original): … Op, Op, …]
• Example operators: convolution, max pooling, ReLU, …
[Figure: Model(delta): … Op, Op, …]
• Modify operators to operate on deltas
Delta Operator Unit
[Figure: Delta Operator Unit = Filter + Operator. Labels: indices, values; abs; accumulated deltas; kernel; sparse. Callout: how much to filter for the current frame?]
• Sparse operator: takes sparse deltas & outputs delta
  – Saves # of FLOPs by processing delta scalars only
• Filter: send only significant deltas to the operator
  – Builds histogram, keeps small deltas & outputs large deltas
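
The filter half of the unit can be sketched as follows. This is a hypothetical reading of the slide: the class name `DeltaFilter`, the dictionary representation, and the use of sorting in place of a histogram are all assumptions. Small deltas are held back and accumulated, so a slow drift is eventually forwarded once it grows large enough.

```python
class DeltaFilter:
    """Hypothetical filter: forwards large accumulated deltas, holds back small ones."""

    def __init__(self):
        self.accum = {}                   # index -> delta held back so far

    def step(self, delta, filter_pct):
        """Merge new deltas, then emit the largest (100 - filter_pct)% of them."""
        for idx, value in delta.items():
            self.accum[idx] = self.accum.get(idx, 0.0) + value
        # Rank by magnitude; sorting stands in for the histogram on the slide.
        ranked = sorted(self.accum, key=lambda i: abs(self.accum[i]))
        n_keep = int(len(ranked) * filter_pct / 100)
        # Entries past the cutoff are emitted and removed from the held-back state.
        return {i: self.accum.pop(i) for i in ranked[n_keep:]}


f = DeltaFilter()
out1 = f.step({0: 0.25, 1: 5.0, 2: -0.5, 3: -4.0}, filter_pct=50)
out2 = f.step({0: 0.5, 2: -0.125}, filter_pct=50)
print(out1)   # {3: -4.0, 1: 5.0}
print(out2)   # {0: 0.75}
```

In the second step, index 0 is emitted with its accumulated value 0.75 rather than the latest change 0.5, so the held-back 0.25 from the first frame is not lost. The emitted dictionary is what a sparse operator would consume.
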
Delta Impacts Output Quality
[Figure: Frame 1, Frame 2, and Frame 2 - Frame 1 passed through Conv layers and compared with ground truths. Labels: 1s, 5s; 90%, 20%.]
Dynamic Filtering Percentage
• Filtering percentage
  – Higher is faster but risky
  – Lower is safer but slower
• Target filtering percentage: largest percentage that generates good result
  – Applies to all operators
  – Two approaches: PI controller / Machine learning
[Figure: abs(delta) axis running from "Safer, slower" to "Risky, faster", with a marker at 90%]
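
The slide names a PI controller but gives no details, so the following is only a textbook proportional-integral loop under stated assumptions: the error signal is some measured disagreement with the reference model, and the gains `kp` and `ki`, the starting percentage, and the clamp to [0, 99] are made up for illustration.

```python
class PIController:
    """Textbook PI loop that nudges the filtering percentage.

    All parameters are illustrative assumptions, not values from the talk.
    """

    def __init__(self, kp, ki, target_error, initial_pct=50.0):
        self.kp = kp
        self.ki = ki
        self.target_error = target_error
        self.base_pct = initial_pct
        self.integral = 0.0
        self.pct = initial_pct

    def update(self, observed_error):
        # Positive slack: output is better than required, so filter more.
        slack = self.target_error - observed_error
        self.integral += slack
        raw = self.base_pct + self.kp * slack + self.ki * self.integral
        self.pct = min(99.0, max(0.0, raw))   # keep within a valid percentage
        return self.pct


ctrl = PIController(kp=64, ki=16, target_error=0.125)
pct_after_good_frame = ctrl.update(0.0625)    # error below target: filter more
pct_after_bad_frame = ctrl.update(0.5)        # error above target: filter less
print(pct_after_good_frame, pct_after_bad_frame)   # 55.0 21.0
```

A production controller would likely add anti-windup for the integral term. It is left out to keep the sketch short.
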
Mixed Network
[Figure: three operator chains (… Op, Op, …) for a filtering percentage that is low (e.g. 0), medium (e.g. 50), and high (e.g. 99); legend: Delta Op Unit]
• Logical plan: a DAG of operators
• Physical plan: choose between delta op unit / original dense implementation
  – Profile each operator with different filtering percentages
  – Pick the faster variant
[Figure: … Op1, Op2, Op3 …, where Op2 is either Filter + Op2 (sparse) or Op2 (dense); filtering percentage = 99]
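
The physical-plan choice can be sketched as a lookup over profiled timings. The function name `choose_variants`, the operator names, and all timing numbers below are hypothetical. The rule of falling back to the nearest profiled percentage at or below the requested one is also an assumption, chosen to be conservative.

```python
def choose_variants(dense_time, sparse_profile, filter_pct):
    """Pick 'sparse' (delta op unit) or 'dense' for each operator.

    dense_time:     {op: seconds for the original dense implementation}
    sparse_profile: {op: {profiled filtering pct: seconds for the delta op unit}}
    """
    plan = {}
    for op, t_dense in dense_time.items():
        profile = sparse_profile[op]
        # Nearest profiled percentage that does not exceed the requested one.
        usable = [p for p in profile if p <= filter_pct]
        if not usable:
            plan[op] = "dense"
            continue
        t_sparse = profile[max(usable)]
        plan[op] = "sparse" if t_sparse < t_dense else "dense"
    return plan


# Made-up profile: the sparse variant only wins once enough deltas are filtered.
dense_time = {"op1": 1.0, "op2": 2.0}
sparse_profile = {
    "op1": {0: 3.0, 50: 1.5, 99: 0.2},
    "op2": {0: 5.0, 50: 1.0, 99: 0.1},
}

print(choose_variants(dense_time, sparse_profile, 0))    # both dense
print(choose_variants(dense_time, sparse_profile, 50))   # op1 dense, op2 sparse
print(choose_variants(dense_time, sparse_profile, 99))   # both sparse
```

With these numbers the plan for the same logical DAG changes with the filtering percentage, which matches the three chains in the figure.
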
Evaluation: Setup
• Three object detection models

  Model                      Abbrv.       # of FLOPs  Time
  SSD-VGG16                  ssd-vgg      123B        3s
  FRCNN-RESNET101            frcnn-res    550B        16s
  FRCNN-INCEPTION-RESNET-V2  frcnn-incep  1395B       41s

• Six 10-minute videos from three YouTube live streams
  – Taken at different times (e.g. day/night) for each stream
  – Typical objects: people, cars, buses, boats, …
  – One frame per second
• TensorFlow, one CPU thread, Amazon EC2 r3.2xlarge
Evaluation: End-to-End Comparison
• Highest runtime savings by PI controller
  – When error less than a threshold
[Figure: average percentage of saving (y-axis, 0 to 60) by model (ssd-vgg, frcnn-res, frcnn-incep) and dataset (videos je, jn, kd, kn, vd, vn)]
Deluceva: Conclusion
• Observe rich temporal redundancy in videos
• Accelerate model inference by processing significant deltas only
  – Modify NN models to consist of sparse & dense ops
  – Adjust the filtering granularity adaptively
  – Generate a network of mixed-type operators based on cost models
• Improve runtime up to 67% with low error
• Applies to convolutional neural network models
• Ongoing: GPU implementation, compare to other work