My Collaborators
• CMU
  – Peter Chapman; now at Duolingo
  – David Brumley
• University of Iowa
  – Andrew Reynolds
  – Tianyi Liang; now at Two Sigma
  – Cesare Tinelli
• Stanford University
  – Clark Barrett
We also acknowledge the rest of the CVC4 developer team and Thomas Ball (MSR) for their advice!
Python is Important
• Python is everywhere
  – Data analysis: NumPy, Pandas, Matplotlib, …
  – Machine learning: Scikit-Learn, TensorFlow, …
  – Web applications: Django, Tornado, …
• Python development environment is mature
  – PyCharm
  – Emacs, Vim etc. with suitable plugins
“IntelliSense” in MS lingo
IntelliTest for C# (VS 2015)
https://msdn.microsoft.com/en-us/library/dn823749.aspx
Driven by Dynamic Symbolic Execution
Dynamic Symbolic Execution
“The instrumented executions of a program where
• each concrete load/store of a variable is accompanied with
• a corresponding symbolic dereference/assignment of that variable’s expression”
Consider this Python program:
1. x = int(input("Num?"))
2. y = x + 42
Example
Concrete Execution
1. x ← 10 /let’s say/
2. y ← 52
Symbolic Execution
1. x_1 == sym_int()
2. y_2 == plus(x_1, 42)
1. x = int(input("Num?"))
2. y = x + 42
Logic constraints: Equality for assignment; Inequality for if-then-else
Satisfiability Modulo Theories (SMT) Solver
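As a rough illustration of the example above, the sketch below runs the two-line program concretely while recording a symbolic expression for each assignment. The `SymInt` class and the `run` function are hypothetical names made up for this sketch, not part of any tool mentioned in this talk, and the user input is passed as an argument instead of being read with `input()`.

```python
class SymInt:
    """A concrete int paired with the symbolic expression that produced it (hypothetical helper)."""

    def __init__(self, concrete, expr):
        self.concrete = concrete  # value seen in the concrete execution
        self.expr = expr          # expression recorded by the symbolic execution

    def __add__(self, other):
        # Only int constants on the right-hand side are handled in this sketch.
        return SymInt(self.concrete + other, f"plus({self.expr}, {other})")


def run(concrete_input):
    """Execute the example program and collect one equality per assignment."""
    constraints = []

    # 1. x = int(input("Num?"))
    x = SymInt(concrete_input, "x_1")
    constraints.append("x_1 == sym_int()")

    # 2. y = x + 42
    y = x + 42
    constraints.append(f"y_2 == {y.expr}")

    return y.concrete, constraints


y_value, constraints = run(10)
print(y_value)
for c in constraints:
    print(c)
```

Branches are left out here; under the scheme on the slide, each if-then-else would contribute a condition to the constraint set in the same way.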
Dynamic Symbolic Executor
DSE comprises:
1. x = int(input("Num?"))
2. y = x + 42
Automatic generation of these from program
More sophisticated control logic to explore program
IntelliTest Facilitates Maintenance
Concrete Execution
1. x ← 10 /let’s just say/
2. y ← 437
Symbolic Execution
1. x_1 == sym_int()
2. y_2 == plus(x_1, 427)
1. x = int(input("Num?"))
2. y = x + 427
Previous model {x_1 ← -42, y_2 ← 0} no longer satisfies the new set of generated constraints: we have made a regression or a functional change!
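The check described above can be sketched in a few lines, assuming the constraints of each program version are written as plain Python predicates over a model. The function names are illustrative only; a real DSE would hand the constraints to the SMT solver instead of evaluating them directly.

```python
def old_constraints(model):
    """Constraint generated from the original program: y = x + 42."""
    return model["y_2"] == model["x_1"] + 42


def new_constraints(model):
    """Constraint generated from the edited program: y = x + 427."""
    return model["y_2"] == model["x_1"] + 427


# Model saved from a previous run of the solver on the original program.
previous_model = {"x_1": -42, "y_2": 0}

still_valid_before = old_constraints(previous_model)
still_valid_after = new_constraints(previous_model)

if still_valid_before and not still_valid_after:
    print("Saved model no longer satisfies the constraints: regression or functional change")
```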
Let’s Build “IntelliTest for Python”
Dynamic Symbolic Execution for Python
• Started in Fall 2014, led by Peter Chapman
• Uses the new string reasoning in the CVC4 SMT solver
• Initially sponsored by NSF Secure and Trustworthy Cyberspace (SaTC)
“This being CyLab, where does security come in?”
• DSE affords test case generation
• DSE affords exploit generation :)
  – “Generate an input that (i) drives the execution to a vulnerable part of the program and (ii) triggers that bug”
Test Harness for str.substring

from symbolic.args import symbolic

@symbolic(s="foo")
def strsubstring(s):
    """Test case for Python slicing, negative indices and steps are not currently tested."""
    if s[2:] == "obar":
        return 0
    elif s[:2] == "bb":
        return 1
    elif s[1:3] == "bb":
        return 2
    else:
        return 3

def expected_result():
    return [0, 1, 2, 3]
$ pyex --cvc strsubstring.py
Exploring ./test/cvc/strsubstring.py.strsubstring
[('s', 'foo')]
3
[('s', 'Abb')]
2
[('s', 'bb')]
1
[('s', 'AAobar')]
0
strsubstring test passed <---
Execution time: 0.43 seconds
Solver CPU: 0.06 seconds
Instrumentation CPU: 0.08 seconds
Path coverage: 4 paths
Line coverage: 10/11 lines (90.91%)
Branch coverage: 15 branches
Exceptions: 0 exceptions raised
Triaged exceptions: 0 triaged exceptions raised
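The four inputs reported in the output above can be replayed by hand to confirm that each one drives a different path. The sketch below copies the function under test without the `@symbolic` decorator, since `symbolic.args` belongs to the tool and is assumed not to be installed; `generated_inputs` is simply the list of inputs transcribed from the output.

```python
def strsubstring(s):
    """Function under test, copied from the harness without the @symbolic decorator."""
    if s[2:] == "obar":
        return 0
    elif s[:2] == "bb":
        return 1
    elif s[1:3] == "bb":
        return 2
    else:
        return 3


def expected_result():
    return [0, 1, 2, 3]


# Inputs transcribed from the tool output above.
generated_inputs = ["foo", "Abb", "bb", "AAobar"]

observed = {s: strsubstring(s) for s in generated_inputs}
for s, result in observed.items():
    print(repr(s), "->", result)

# Every expected return value is reached by exactly one generated input.
all_paths_covered = sorted(observed.values()) == expected_result()
print("all paths covered:", all_paths_covered)
```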
The Rest of This Talk
• What has been achieved so far?
• What are the current obstacles?
• What are our ideas to overcome them?
• What can you do to speed up DSE research?
To Build a DSE for Python
1. We need the semantics of Python
[1] G. J. Smeding, “An Executable Operational Semantics for Python,” Universiteit Utrecht, 2009.
[2] J. G. Politz, A. Martinez, M. Milano, S. Warren, D. Patterson, J. Li, A. Chitipothu, and S. Krishnamurthi, “Python: The Full Monty - A Tested Semantics for the Python Programming Language,” in Proceedings of the 2013 ACM International Conference on Object Oriented Programming Systems Languages & Applications, 2013, pp. 217–232.
[3] S. Sapra, M. Minea, S. Chaki, A. Gurfinkel, and E. M. Clarke, “Finding Errors in Python Programs Using Dynamic Symbolic Execution,” in Proceedings of the International Conference on Testing Software and Systems, 2013, pp. 283–289.
[4] T. Ball and J. Daniel, “Deconstructing Dynamic Symbolic Execution,” in Proceedings of the 2014 Marktoberdorf Summer School on Dependable Software Systems Engineering, 2014.
Built on top of Ball-Daniel
To Build a DSE for Python
1. Implement the semantics of Python
Python has many versions!
• Implementation strategy taken by Ball-Daniel 2014 can cope with language changes
  – Divergence detection is a must for any DSE anyway
• But NOT with library changes, e.g.,
  – What’s New in Python 3.7: “bytes.fromhex() and bytearray.fromhex() now ignore all ASCII whitespace, not only spaces. (Contributed by Robert Xiao in bpo-28927.)”
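The library change quoted above is easy to observe directly. In the sketch below, `parse_hex` is a hypothetical wrapper written for this illustration; it returns `None` where `bytes.fromhex()` rejects its input, so the same script runs on interpreters on either side of the 3.7 change.

```python
import sys


def parse_hex(text):
    """Return the decoded bytes, or None if bytes.fromhex() rejects the input."""
    try:
        return bytes.fromhex(text)
    except ValueError:
        return None


with_space = parse_hex("de ad")   # spaces are skipped both before and after the 3.7 change
with_tab = parse_hex("de\tad")    # tabs are skipped only from Python 3.7 onwards

print(sys.version_info[:2], with_space, with_tab)
```

A symbolic model of `bytes.fromhex()` that hard-codes one of the two behaviours would diverge from the concrete execution on the other interpreter version, which is the concern raised on the slide.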
To Build a DSE for Python
1. Implement the semantics of Python
For each string function, we need to manually code its semantics to low-level string operations supported by CVC4:
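The encoding shown on the original slide is not reproduced in this text. As a rough illustration of what such a lowering looks like, the sketch below expresses two Python string operations using only a length primitive and a substring primitive that follows the SMT-LIB `str.substr` convention (empty result for an out-of-range start or a non-positive length). The names `smt_len`, `smt_substr`, `model_slice` and `model_startswith` are hypothetical Python stand-ins, not the project's actual code.

```python
def smt_len(s):
    """Stand-in for a solver-level string length primitive."""
    return len(s)


def smt_substr(s, start, length):
    """Stand-in for a solver-level substring primitive (SMT-LIB str.substr convention)."""
    if start < 0 or start >= smt_len(s) or length <= 0:
        return ""
    return s[start:start + length]


def model_slice(s, lo, hi):
    """Python s[lo:hi] for non-negative lo and hi, expressed with smt_substr only."""
    return smt_substr(s, lo, hi - lo)


def model_startswith(s, prefix):
    """Python s.startswith(prefix), expressed with smt_len, smt_substr and equality."""
    return smt_substr(s, 0, smt_len(prefix)) == prefix


# Compare the hand-written models against the built-in behaviour on a few inputs.
print(model_slice("AAobar", 2, 6), "AAobar"[2:6])
print(model_startswith("bbq", "bb"), "bbq".startswith("bb"))
```

Negative indices and slice steps are left out, matching the limitation stated in the test harness docstring earlier in the talk.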
Dispatch by version
To Build a DSE for Python
2. Reason with Python data types
Python has many built-in types:
• Numeric: int, float, complex
• Sequence: list, tuple, range
• Text sequence: str
• Set types: set, frozenset
• Map types: dict
• …
Reasoning w/ strings is a significant 4-year NSF project
To Build a DSE for Python
2. Reason with Python data types
String reasoning is much harder than numerical reasoning:
• Give me a string that (i) does not start with “e”, (ii) has length 8, (iii) contains “eric” and “mark” as subsequences, and (iv) does not contain more than one “r”
  – There are lots of strings with length 8: 2^(8*8)
To Build a DSE for Python
2. Reason with Python data types
This is a fragment of the CVC4 logic for string reasoning:
This is why we have world-class logicians in our team!
To Build a DSE for Python
2. Reason with Python data types
This is how we simplify logic statements in this fragment:
This is why we have world-class logicians in our team!
Advances in CVC4: 2017 Edition
New simplifier in ~2000 lines of C++, plus many more improvements in the rest of the solver:
A. Reynolds, M. Woo, C. Barrett, D. Brumley, T. Liang, and C. Tinelli, “Scaling Up DPLL(T) String Solvers Using Context-Dependent Simplification,” in Proceedings of the 29th International Conference on Computer Aided Verification, Springer, 2017, pp. 453–474.
Previous state-of-the-art in literature
New CVC4 is way better
Can still take hours
High Priority TODO: Logic
• “High-Fidelity Python” regex
  – CVC4 has “textbook regex”: +, |, *
  – Python regex supports backreferences + named groups (easy) and non-greedy captures (hard, or is it?)
• Symbolic dictionary
  – A dictionary is a function mapping keys to values
  – Efficient model-finding when keys and values are concrete: see papers on function synthesis
  – When the keys and values can be symbolically specified, it stresses solvers “to go where no one has gone before”
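For readers less familiar with the Python regex features named in the first bullet above, the sketch below shows each of them with the standard `re` module. The patterns and inputs are made up for illustration; none of this is solver code.

```python
import re

# Backreference: \1 must repeat exactly what group 1 captured.
backref_ok = re.fullmatch(r"(a+)b\1", "aabaa")
backref_fail = re.fullmatch(r"(a+)b\1", "aaba")

# Named group with a named backreference.
named = re.fullmatch(r"(?P<word>\w+) (?P=word)", "bye bye")

# Greedy versus non-greedy capture on the same input.
greedy = re.match(r"<(.+)>", "<a><b>").group(1)
lazy = re.match(r"<(.+?)>", "<a><b>").group(1)

print(bool(backref_ok), bool(backref_fail), named.group("word"), greedy, lazy)
```

Greedy and non-greedy patterns can accept the same strings yet capture different substrings, so a faithful model has to track what each group captured and not only whether the whole string matches.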
High Priority TODO: Systems
• Effective horizontal scaling
  – Ball-Daniel 2014 is a prototype designed for teaching at a summer school
  – Real challenge with DSE is scaling across multiple machines in a cluster/data center (many PhD theses have yet to be written)
• Integration with PyCharm/Emacs/Vim
  – Match the real IntelliTest experience
  – Persist generated test-cases for the developer
Call to Action
• If you believe Dynamic Symbolic Execution for Python is worth pursuing, please talk with Michael Lisanti or me
  – Partners at 100K or above get to pick their favorite projects to support
  – We acknowledge your organization in the paper and in the conference presentation: exposure ==> advantages in hiring!
  – We can integrate your engineers into our tool study