NEUZZ: Efficient Fuzzing with Neural Program Smoothing
Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, and Suman Jana
Columbia University
Fuzzing: a popular way to uncover bugs
[Liang et al. 2019]
Evolutionary Fuzzing
Advantage: easy to implement
Disadvantage: inefficient
• Random mutations are not effective
• Often gets stuck in a long sequence of wasteful mutations
Hard to find scalable and adaptive heuristics for guided mutation
[Figure: mutation tree from a seed to its children and grandchildren]
A new approach to fuzzing
Fuzzing: An Optimization Problem
$x \in X$: a program input
$F(x)$: # of bugs found by input $x$
$C(X)$: generate K inputs from input space $X$
Maximize $\sum_{x \in C(X)} F(x)$
$F(x)$ is discrete and hard to optimize
Find $C(X)$ that can maximize total no. of bugs
Fuzzing: An Optimization Problem
[Figure: $F(x)$, the # of bugs, plotted against input $x$]
Hard to find inputs like $x_1$ and $x_2$ among flat plateaus
Fuzzing: An Optimization Problem
$x \in X$: a program input
$G(x)$: edge coverage of input $x$
$C(X)$: generate K inputs from input space $X$
Maximize $\sum_{x \in C(X)} G(x)$
Find $C(X)$ that can maximize total number of edges
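The edge-coverage objective can be illustrated with a minimal sketch. It assumes a hypothetical `toy_program` that reports the control-flow edges it executed; the program, its edge names, and the helper functions are illustrative only and not part of NEUZZ. `objective` evaluates the sum of G(x) over a candidate set C(X), and `distinct_edges` shows the related count of unique edges, which is an added variant rather than something stated on the slide.

```python
def toy_program(data):
    """Hypothetical target: returns the set of control-flow edges one run executed."""
    edges = {("entry", "check_magic")}
    if len(data) >= 2 and data[0] == 0x7F:
        edges.add(("check_magic", "check_version"))
        if data[1] > 0x10:
            edges.add(("check_version", "parse_body"))
        else:
            edges.add(("check_version", "reject"))
    else:
        edges.add(("check_magic", "reject"))
    return edges


def G(x, program=toy_program):
    """Edge coverage of a single input: number of edges it executes."""
    return len(program(x))


def objective(candidates, program=toy_program):
    """Sum of G(x) over the chosen inputs C(X), as in the formulation above."""
    return sum(G(x, program) for x in candidates)


def distinct_edges(candidates, program=toy_program):
    """Assumed variant: number of unique edges covered by the whole set."""
    covered = set()
    for x in candidates:
        covered |= program(x)
    return len(covered)
```

For example, `objective([b"\x00\x00", b"\x7f\x20"])` adds the two edges of the rejected input to the three edges of the accepted one. Because G(x) only changes when a branch flips, small byte changes usually leave it constant, which is what makes it hard to optimize directly.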
Fuzzing: An Optimization Problem
[Figure: $G(x)$, the # of edges, plotted against input $x$]
Evolutionary optimization
[Figure: $G(x)$, the # of edges, plotted against input $x$, with mutation steps 1 to 5 marked]
Random mutation is not efficient
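A coverage-guided evolutionary loop with purely random mutation can be sketched as follows. This is a simplified baseline under stated assumptions, not the algorithm of any particular fuzzer: `toy_edges` is a hypothetical target, and the single-byte `random_mutation` stands in for a real mutation engine.

```python
import random


def toy_edges(data):
    """Hypothetical target: returns the set of edges one run executed."""
    edges = {("entry", "check_magic")}
    if data[0] == 0x7F:
        edges.add(("check_magic", "check_version"))
        if data[1] > 0x10:
            edges.add(("check_version", "parse_body"))
        else:
            edges.add(("check_version", "reject"))
    else:
        edges.add(("check_magic", "reject"))
    return edges


def random_mutation(data, rng):
    """Overwrite one randomly chosen byte with a random value."""
    buf = bytearray(data)
    pos = rng.randrange(len(buf))
    buf[pos] = rng.randrange(256)
    return bytes(buf)


def evolutionary_fuzz(seed, program, iterations, rng):
    """Keep a child in the corpus only if it reaches an edge not seen before."""
    corpus = [seed]
    covered = set(program(seed))
    for _ in range(iterations):
        parent = rng.choice(corpus)
        child = random_mutation(parent, rng)
        new_edges = program(child) - covered
        if new_edges:
            corpus.append(child)
            covered |= new_edges
    return corpus, covered
```

In this toy target, a mutation reaches new code from the all-zero seed only when it hits byte 0 and sets it to exactly `0x7F`, so most iterations are wasted. That is the inefficiency the slide refers to.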
Gradient-guided Optimization
Smooth Approximation + Gradient-guided Mutation
$H(x)$: smooth approximation of $G(x)$
[Figure: $G(x)$, the # of edges, and its smooth approximation $H(x)$, plotted against input $x$]
Gradient-guided Optimization
Smooth Approximation + Gradient-guided Mutation
[Figure: $H(x)$ and $G(x)$ plotted against input $x$, with mutation steps 1 to 5 marked]
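Smooth Approximation
Problem: How to smoothly approximate $G(x)$?
NEUZZ Solution: Use a NN to learn a smooth $H(x)$
Universal Approximation Theorem: a feed-forward NN with enough hidden units can approximate any continuous function on a compact domain arbitrarily well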
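The idea of learning a smooth H(x) can be sketched with a very small network in pure Python. Everything here is an assumption made for illustration: the `toy_bitmap` target, the layer sizes, the tanh hidden layer, and the plain SGD training are not taken from the NEUZZ implementation. The sketch only shows that a network trained on (input bytes, edge bitmap) pairs yields a differentiable function whose gradient with respect to the input bytes is well defined.

```python
import math
import random


def toy_bitmap(data):
    """Hypothetical target: one 0/1 flag per branch edge taken."""
    magic_ok = data[0] > 127
    deep = magic_ok and data[1] > 127
    return [1.0 if magic_ok else 0.0, 1.0 if deep else 0.0]


def normalize(data):
    """Scale raw bytes to [0, 1] before feeding them to the network."""
    return [b / 255.0 for b in data]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


class TinySurrogate:
    """H(x): one tanh hidden layer, one sigmoid output per edge."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
        self.w1, self.b1 = init(n_hidden, n_in), [0.0] * n_hidden
        self.w2, self.b2 = init(n_out, n_hidden), [0.0] * n_out

    def forward(self, x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = [sigmoid(sum(w * v for w, v in zip(row, h)) + b)
             for row, b in zip(self.w2, self.b2)]
        return h, y

    def train_step(self, x, target, lr):
        """One SGD step on binary cross-entropy."""
        h, y = self.forward(x)
        dz2 = [yi - ti for yi, ti in zip(y, target)]  # sigmoid + cross-entropy
        dh = [sum(dz2[k] * self.w2[k][j] for k in range(len(y))) for j in range(len(h))]
        dz1 = [dh[j] * (1.0 - h[j] ** 2) for j in range(len(h))]
        for k, row in enumerate(self.w2):
            for j in range(len(h)):
                row[j] -= lr * dz2[k] * h[j]
            self.b2[k] -= lr * dz2[k]
        for j, row in enumerate(self.w1):
            for i in range(len(x)):
                row[i] -= lr * dz1[j] * x[i]
            self.b1[j] -= lr * dz1[j]

    def input_gradient(self, x, edge):
        """Gradient of output `edge` with respect to each input byte."""
        h, y = self.forward(x)
        scale = y[edge] * (1.0 - y[edge])
        return [scale * sum(self.w2[edge][j] * (1.0 - h[j] ** 2) * self.w1[j][i]
                            for j in range(len(h)))
                for i in range(len(x))]


def mean_loss(model, samples):
    """Average binary cross-entropy over (input, bitmap) pairs."""
    eps = 1e-9
    total = 0.0
    for x, t in samples:
        _, y = model.forward(x)
        total -= sum(ti * math.log(yi + eps) + (1 - ti) * math.log(1 - yi + eps)
                     for yi, ti in zip(y, t))
    return total / len(samples)


def train_demo(seed=0, n_samples=200, epochs=20):
    """Train on random 4-byte inputs; return the model and the loss before/after."""
    rng = random.Random(seed)
    raw = [bytes(rng.randrange(256) for _ in range(4)) for _ in range(n_samples)]
    samples = [(normalize(d), toy_bitmap(d)) for d in raw]
    model = TinySurrogate(n_in=4, n_hidden=8, n_out=2, rng=rng)
    before = mean_loss(model, samples)
    for _ in range(epochs):
        for x, t in samples:
            model.train_step(x, t, lr=0.5)
    return model, before, mean_loss(model, samples)
```

Calling `model, before, after = train_demo()` and then `model.input_gradient(normalize(b"\x10\x20\x30\x40"), 0)` returns one gradient value per input byte. In this toy setup only the first two bytes influence any branch, so a well-trained surrogate would be expected to assign them the largest gradient magnitudes.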
Gradient-guided Mutation
Why gradient guidance? The gradient indicates the critical parts of the input
What are the critical parts of the input? The parts of the input that affect program branches
How does gradient-guided mutation work? It focuses mutations on the critical parts of the input
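The three points above can be sketched as a small routine that takes an input and a gradient vector (one value per byte, obtained from some smooth surrogate), selects the bytes with the largest absolute gradient as the critical parts, and moves only those bytes in the direction of the gradient sign. The function names, the `top_k` and `steps` parameters, and the step sizes are hypothetical; this is not the authors' exact mutation schedule.

```python
def critical_positions(gradient, top_k):
    """Indices of the top_k bytes with the largest absolute gradient."""
    order = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True)
    return order[:top_k]


def gradient_guided_mutations(data, gradient, top_k=2, steps=(1, 8, 32)):
    """Return one mutant per step size, changing only the critical bytes."""
    positions = critical_positions(gradient, top_k)
    mutants = []
    for step in steps:
        buf = bytearray(data)
        for i in positions:
            direction = 1 if gradient[i] > 0 else -1
            # Clamp so the mutated value stays a valid byte.
            buf[i] = min(255, max(0, buf[i] + direction * step))
        mutants.append(bytes(buf))
    return mutants
```

With `data = bytes([10, 200, 50, 7])` and `gradient = [0.01, -0.9, 0.5, 0.0]`, bytes 1 and 2 are selected: byte 1 is decreased and byte 2 is increased, while bytes 0 and 3 are left untouched.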
Main Idea behind NEUZZ
[Figure: the program maps an input to branching behaviors; the NN acts as a smooth surrogate that maps the same input to branching behaviors and drives gradient-guided mutation]
A Peek Into NN Model
Generalization to Unseen Branches
Observations:
- Real-world program inputs have critical parts
- Most branches are affected by the critical parts
NEUZZ Solution:
- Identify critical parts based on observed branches
- Perform more mutations on the critical parts of inputs to explore unseen branches
Design of NEUZZ
Evaluation
• 10 real-world programs
• LAVA-M and DARPA CGC datasets
• Comparison with RNN-based fuzzers
• Performance of different model choices
Evaluations: Edge Coverage
NEUZZ vs. state-of-the-art fuzzers on 10 real-world applications for 24 hours
NEUZZ achieves on average 3x more edge coverage than other fuzzers
Evaluations: Bug Finding
NEUZZ vs. state-of-the-art fuzzers
NEUZZ finds the largest number of bugs and all 5 bug types, including two new CVEs
Evaluations: LAVA-M and CGC
NEUZZ outperforms state-of-the-art fuzzers on LAVA-M and CGC
[Figures: results on the LAVA-M dataset and the DARPA CGC dataset]
Evaluations: NEUZZ vs. RNN-based Fuzzer
NEUZZ achieves 6x more edge coverage and 20x less training time
Evaluations: Effect of Different NNs
NEUZZ achieves the best performance with NN + incremental learning
[Figure: edge coverage for 1M mutations]
Key Takeaways of NEUZZ
● Use NN gradients to identify the critical locations of program inputs
● Focus mutations on the critical locations
● Minimize runtime overhead by using simple feed-forward neural networks
● Retrain the network incrementally to find new critical locations
GitHub Repo
NEUZZ is available at https://github.com/Dongdongshe/neuzz