Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Date post: 25-Feb-2016
Transcript
Page 1: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Cristian Cadar, Patrice Godefroid, Sarfraz Khurshid, Corina Pasareanu, Koushik Sen, Nikolai Tillmann, Willem Visser

Page 2: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Overview

• Symbolic execution and its variants
  – Generalized symbolic execution
  – Dynamic test generation
• Tools and impact
  – Symbolic PathFinder
  – DART, CUTE, jCUTE, CREST, SAGE
  – EXE and KLEE
• Challenges

Page 3: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Symbolic Execution

• King [Comm. ACM 1976], Clarke [IEEE TSE 1976]
• Received renewed interest in recent years
  – Algorithmic advances
  – Increased availability of computational power and decision procedures
• Tools, many open-source
  – NASA’s Symbolic (Java) PathFinder: http://babelfish.arc.nasa.gov/trac/jpf/wiki/projects/jpf-symbc
  – UIUC’s CUTE and jCUTE: http://osl.cs.uiuc.edu/~ksen/cute
  – Stanford’s KLEE: http://klee.llvm.org/
  – UC Berkeley’s CREST and BitBlaze: http://code.google.com/p/crest
  – Microsoft’s Pex, SAGE, YOGI, PREfix: http://research.microsoft.com/en-us/projects/pex/ and http://research.microsoft.com/en-us/projects/yogi
  – IBM’s Apollo, Parasoft’s testing tools, etc.

Page 4: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Symbolic Execution

• Analysis of programs with unspecified inputs
  – Execute a program on symbolic inputs
• Symbolic states represent sets of concrete states
• For each path, build a path condition
  – Condition on inputs for the execution to follow that path
  – Check path condition satisfiability → explore only feasible paths
• Symbolic state
  – Symbolic values/expressions for variables
  – Path condition
  – Program counter

Page 5: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Example: Standard Execution

Code that swaps 2 integers:

  int x, y;
  if (x > y) {
    x = x + y;
    y = x - y;
    x = x - y;
    if (x > y)
      assert false;
  }

Concrete execution path:

  x = 1, y = 0
  1 > 0 ? true
  x = 1 + 0 = 1
  y = 1 - 0 = 1
  x = 1 - 1 = 0
  0 > 1 ? false

Page 6: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Example: Symbolic Execution

Code that swaps 2 integers:

  int x, y;
  if (x > y) {
    x = x + y;
    y = x - y;
    x = x - y;
    if (x > y)
      assert false;
  }

Symbolic execution tree (each state is annotated with its path condition, PC):

  [PC: true] x = X, y = Y
  [PC: true] X > Y ?
    false: [PC: X≤Y] END
    true:  [PC: X>Y] x = X+Y
           [PC: X>Y] y = X+Y-Y = X
           [PC: X>Y] x = X+Y-X = Y
           [PC: X>Y] Y > X ?
             false: [PC: X>Y ∧ Y≤X] END
             true:  [PC: X>Y ∧ Y>X] END (path condition is False!)

Solve path conditions → test inputs
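
The symbolic execution tree for the swap example can be reproduced with a short Python sketch. It is an illustration only, not how any tool in this talk is implemented. Symbolic values are stored as linear expressions over the inputs X and Y, integers are assumed unbounded (no overflow), and `solve` is a hypothetical stand-in for a real decision procedure: it only searches a small input range, so `None` means "no solution in that range" rather than a proof of unsatisfiability.

```python
from itertools import product

# A symbolic value is the linear expression a*X + b*Y, stored as the pair (a, b).
X, Y = (1, 0), (0, 1)

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])

def evaluate(expr, x, y):
    return expr[0] * x + expr[1] * y

def solve(path_condition, domain=range(-3, 4)):
    """Toy solver (assumption): brute-force search over a small input range.

    Each constraint is (lhs, rhs, expected) and means (lhs > rhs) == expected.
    """
    for x, y in product(domain, repeat=2):
        if all((evaluate(l, x, y) > evaluate(r, x, y)) == want
               for l, r, want in path_condition):
            return (x, y)
    return None

def explore():
    """Walk the three paths of the swap program; return (label, test input)."""
    results = []
    x, y = X, Y
    # Outer branch: if (x > y)
    results.append(("X<=Y: END", solve([(x, y, False)])))
    pc = [(x, y, True)]
    x = add(x, y)   # x = X + Y
    y = sub(x, y)   # y = X
    x = sub(x, y)   # x = Y
    # Inner branch: if (x > y), which is Y > X in terms of the inputs
    results.append(("X>Y and Y<=X: END", solve(pc + [(x, y, False)])))
    results.append(("X>Y and Y>X: assert false", solve(pc + [(x, y, True)])))
    return results

for label, test_input in explore():
    print(label, "->", test_input)
```

The two feasible paths each get a concrete test input, while the path leading to `assert false` gets `None`, matching the "False!" annotation in the tree.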

Page 7: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Generalized Symbolic Execution

• “Classical” symbolic execution
  – Handles sequential programs with primitive typed inputs
• Generalized symbolic execution [TACAS’03]
  – Handles dynamically allocated data structures and multi-threading
• Key elements:
  – Lazy initialization for input data structures
  – Standard model checker (Java PathFinder) for multi-threading
• Model checker:
  – Analyzes thread interleavings
  – Optimizations (symmetry and partial order reductions, abstraction, heuristic search, etc.)
  – Generates and explores the symbolic execution tree
  – Explores different heap configurations explicitly; non-determinism handles aliasing
• Loops and recursion:
  – Put bound on search depth
  – Stop search when desired coverage achieved
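
Lazy initialization can be pictured with a small Python sketch. It assumes the usual formulation: the first time an uninitialized reference field is read, the search branches non-deterministically over null, a fresh object, and an alias to any object already created. The linked-list setting, the function name `enumerate_heaps` and the dictionary encoding of the heap are all hypothetical choices made for this illustration, and the dereference limit plays the role of the search-depth bound.

```python
def enumerate_heaps(max_derefs):
    """Enumerate the heap shapes seen when following `next` from a symbolic
    input node (node 0) at most `max_derefs` times.

    A heap maps a node index to the target of its `next` field:
    None for null, otherwise the index of another node.
    """
    heaps = []

    def walk(next_of, node_count, current, remaining):
        if current is None or remaining == 0:
            heaps.append(dict(next_of))
            return
        if current in next_of:
            # Field already initialized on this path: just follow it.
            walk(next_of, node_count, next_of[current], remaining - 1)
            return
        # First access: branch over null, a fresh node, or an existing node.
        options = [None, node_count] + list(range(node_count))
        for target in options:
            grown = dict(next_of)
            grown[current] = target
            count = node_count + 1 if target == node_count else node_count
            walk(grown, count, target, remaining - 1)

    walk({}, 1, 0, max_derefs)
    return heaps

for heap in enumerate_heaps(2):
    print(heap)
```

With one dereference the sketch produces three shapes (null, a new node, a self-loop); with two it produces six, including the aliased configurations that a purely tree-shaped input generator would miss.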

Page 8: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Symbolic PathFinder (SPF)

• Implements a non-standard interpreter of byte-codes
  – Enables JPF to perform symbolic analysis
  – Replaces standard byte-code execution with non-standard symbolic execution
• Symbolic information:
  – Stored in attributes associated with the program data
  – Propagated dynamically during symbolic execution
• Choice generators and listeners:
  – Non-deterministic choices handle branching conditions
  – Listeners print results (path conditions, test vectors/sequences)
• Native peers model native libraries:
  – Capture Math calls and send them to the constraint solver
• Generic interface for multiple decision procedures
  – Choco, IASolver, CVC3, Yices, HAMPI, CORAL [NFM11], etc.

Page 9: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

SPF’s Impact

• NASA
  – Test case generation for Orion control software
  – Fault tolerant protocols
  – NextGen (TSAFE) aviation software
  – Plexil robot executive
  – …
• Fujitsu
  – Extended with String analysis
  – Parallel version put on the cloud
• Open sourced: http://babelfish.arc.nasa.gov/trac/jpf/wiki/projects/jpf-symbc
• Used at many universities, industry (IBM [ICSE11]), etc.
• Largest application: 60 KLOC

Page 10: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Testing the Onboard Abort Executive (OAE) [ISSTA’08]

Prototype for CEV ascent abort handling being developed by JSC

OAE Structure (diagram labels): Inputs; GN&C; Checks Flight Rules to see if an abort must occur; Select Feasible Aborts; Pick Highest Ranked Abort

Results
• Baseline
  – Manual testing: time consuming (~1 week)
  – Guided random testing could not cover all aborts
• Symbolic PathFinder
  – Generates tests to cover all aborts and flight rules
  – Total execution time is < 1 min
  – Test cases: 151 (some combinations infeasible)
  – Errors: 1 (flight rules broken but no abort picked)
  – Found major bug in new version of OAE
  – Flight Rules: 27 / 27 covered
  – Aborts: 7 / 7 covered
  – Size of input data: 27 values per test case
• Integration with End-to-end Simulation
  – Input data constrained by physical laws
    Example: inertial velocity cannot be 24000 ft/s when the geodetic altitude is 0 ft
  – Need to encode these constraints explicitly

Page 11: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Generated Test Cases and Constraints

Test cases:

  // Covers Rule: FR A_2_A_2_B_1: Low Pressure Oxidizer Turbopump speed limit exceeded
  // Output: Abort:IBB
  CaseNum 1;
  CaseLine in.stage_speed=3621.0;
  CaseTime 57.0-102.0;

  // Covers Rule: FR A_2_A_2_A: Fuel injector pressure limit exceeded
  // Output: Abort:IBB
  CaseNum 3;
  CaseLine in.stage_pres=4301.0;
  CaseTime 57.0-102.0;
  …

Constraints:

  // Rule: FR A_2_A_1_A: stage1 engine chamber pressure limit exceeded Abort:IA
  PC (~60 constraints):
  in.geod_alt(9000) < 120000 && in.geod_alt(9000) < 38000 && in.geod_alt(9000) < 10000 &&
  in.pres_rate(-2) >= -2 && in.pres_rate(-2) >= -15 &&
  in.roll_rate(40) <= 50 && in.yaw_rate(31) <= 41 && in.pitch_rate(70) <= 100 && …

Page 12: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Test-Sequence Generation for Multiple Statechart Models

Polyglot Framework [ISSTA’11]
  – Analysis for UML, Stateflow and Rhapsody interactive models
  – Automated test sequence generation
  – High degree of coverage (state, transition, path, MC/DC)
  – Pluggable semantics
  – Study discrepancies between multiple statechart formalisms

Demonstrations:
  – Orion’s Pad Abort-1
  – Ares-Orion communication
  – JPL’s MER Arbiter

Shown: Polyglot Framework for model-based analysis and test case generation; test cases used to test the generated code and to discover discrepancies between models and code.

Orion orbits the moon (Image Credit: Lockheed Martin).

Page 13: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Fujitsu applications

• Distributed symbolic execution over cloud
• Adaptive dynamic partitioning
  – Fujitsu’s technology uses heuristics to partition jobs on the fly based on system resources and job characteristics and history
  – Close to linear speed-up is possible in > 90% of the cases even if the symbolic tree is highly skewed

Diagram labels: Scheduler Node; Worker Nodes (N1, N2, N3, N4); Job Queue (J1, J2, J3, J4, J5); status, jobs; Available Resource List (N3, N4); Initialization Path Condition; New Jobs; Computation at this node; Termination Path Condition

Page 14: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Fujitsu applications

String solver
• Interactive hybrid constraints:
  – string operations can result in numeric values and vice versa, e.g. s.length(), s.indexOf(‘x’), a.toString()
  – some string operations have numeric inputs, e.g. s.substring(5)
• e.g.: string s, q; integer a, b;
  (s.equals(q)) && (s.startsWith(“uvw”)) && (q.endsWith(“xyz”)) && (s.length() < a) && ((a+b) < 6) && (b > 0)
  Unsatisfiable !!

• Fujitsu solution
  – Maintain separate constraint set for Integer/Boolean and Real, represented as equations
  – Maintain separate constraint set for string variables, represented as FSMs or regular expressions
  – Introduce length of each string variable as an integer variable in the numeric constraints
  – Introduce other String related numeric variables in numeric constraints, if any
  – Pass learned constraints from one domain to another and iterate to fixed point or time out

Fujitsu technology can handle symbolic execution and automatic test case generation for web applications which use String input variables extensively
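
Why the example constraint is unsatisfiable can be checked with a small Python sketch that passes one learned fact from the string domain to the numeric domain. This is not Fujitsu's solver: the helper names `min_length` and `numeric_sat`, the bounded brute-force search, and the single hand-off (rather than iteration to a fixed point) are assumptions made for illustration.

```python
def min_length(prefix, suffix):
    """Shortest length of a string that starts with prefix and ends with suffix."""
    for overlap in range(min(len(prefix), len(suffix)), 0, -1):
        # The tail of the prefix may double as the head of the suffix.
        if prefix[-overlap:] == suffix[:overlap]:
            return len(prefix) + len(suffix) - overlap
    return len(prefix) + len(suffix)

def numeric_sat(min_len, bound=20):
    """Bounded search for (length, a, b) with
    length >= min_len, length < a, a + b < 6 and b > 0."""
    for length in range(min_len, bound):
        for a in range(-bound, bound):
            for b in range(-bound, bound):
                if length < a and a + b < 6 and b > 0:
                    return (length, a, b)
    return None

# String domain: s equals q, so s starts with "uvw" and ends with "xyz".
learned = min_length("uvw", "xyz")
print("learned: s.length() >=", learned)

# Numeric domain, without and with the learned length constraint.
print("numeric only:", numeric_sat(0))
print("with string fact:", numeric_sat(learned))
```

On their own the numeric constraints have a solution, but once the string side contributes s.length() >= 6, the constraints force a >= 7 and b >= 1, so a + b < 6 cannot hold.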

Page 15: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Dynamic Techniques

• Classic symbolic execution is a static technique
• Dynamic techniques
  – Collect symbolic constraints during concrete executions
  – DART = Directed Automated Random Testing
  – Concolic (Concrete Symbolic) testing

P. Godefroid

Page 16: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

DART: Directed Automated Random Testing [PLDI’05]

1. Automated extraction of program interface from source code
2. Generation of test driver for random testing through the interface
3. Dynamic test generation to direct executions along alternative program paths

• Together: (1)+(2)+(3) = DART
• DART can detect program crashes and assertion violations.
• Any program that compiles can be run and tested this way:
  – No need to write any test driver or harness code!
  – (Pre- and post-conditions can be added to generated test driver)

Page 17: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Directed Search

• Dynamic test generation to direct executions along alternative program paths
  – collect symbolic constraints at branch points (whenever possible)
  – negate one constraint at a branch point to take other branch (say b)
  – call constraint solver with new path constraint to generate new test inputs
  – next execution driven by these new test inputs to take alternative branch b
  – check with dynamic instrumentation that branch b is indeed taken
• Repeat this process until all execution paths are covered
  – May never terminate!
• Significantly improves code coverage vs. pure random testing
• Dynamic test generation (Korel, Gupta-Mathur-Soffa, etc.)
  – Attempt to exercise a specific program path
  – DART attempts to cover all executable program paths instead (like Model Checking)
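
The directed search loop can be sketched in a few lines of Python. This is a toy under stated assumptions, not DART itself: the program under test, its two branch predicates and the brute-force `solve` helper are hypothetical, and "symbolic constraints" are represented simply as Python predicates over the inputs. The structure follows the steps listed for directed search: run concretely, record the branch outcomes, negate one constraint, solve, rerun, and check that the predicted branch was taken.

```python
from itertools import product

# Branch predicates of a hypothetical program under test.
def is_double(x, y):
    return x == 2 * y

def far_above(x, y):
    return x > y + 10

def program(x, y, trace):
    """Run concretely, recording (predicate, outcome) at each branch point."""
    def branch(pred):
        outcome = pred(x, y)
        trace.append((pred, outcome))
        return outcome
    if branch(is_double):
        if branch(far_above):
            return "bug"
    return "ok"

def solve(constraints, domain=range(-50, 51)):
    """Toy solver (assumption): search a small range for satisfying inputs."""
    for x, y in product(domain, repeat=2):
        if all(pred(x, y) == want for pred, want in constraints):
            return (x, y)
    return None

def directed_search(start=(0, 1)):
    results = {}

    def explore(inputs, bound, expected):
        trace = []
        verdict = program(*inputs, trace)
        path = tuple(outcome for _, outcome in trace)
        # Dynamic check: the run really followed the predicted branches.
        assert path[:len(expected)] == expected
        results[path] = (inputs, verdict)
        # Depth-first: negate the deepest not-yet-negated constraint first.
        for i in range(len(trace) - 1, bound - 1, -1):
            pred, outcome = trace[i]
            flipped = trace[:i] + [(pred, not outcome)]
            new_inputs = solve(flipped)
            if new_inputs is not None:
                explore(new_inputs, i + 1,
                        tuple(o for _, o in flipped))

    explore(start, 0, ())
    return results

for path, (inputs, verdict) in directed_search().items():
    print(path, inputs, verdict)
```

Starting from an input that misses the first branch, the search needs only three runs to reach the "bug" path, which pure random testing would hit with low probability because it requires x == 2 * y. The toy program has finitely many paths, so the search terminates; with loops or recursion it may not.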

Page 18: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

White-box Fuzzing [NDSS’08]

• White-box Fuzzing = “DART meets Fuzz”
  – Black-box Fuzzing = randomly “fuzz” (modify) a well-formed input; simple but effective
• Apply DART to large applications (not unit)
  – Binary level
  – Thousands of inputs, millions of instructions
• Start with a well-formed input (not random)
• Combine with a generational search (not DFS)
  – Negate 1-by-1 each constraint in a path constraint
  – Generate many children for each parent run
  – Challenge all the layers of the application sooner
  – Leverage expensive symbolic execution
• Search spaces are huge, the search is partial… yet effective at finding bugs!

Diagram labels: Gen 1, parent
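
The "negate 1-by-1" step of generational search can be sketched as a single expansion function in Python. This is a minimal sketch, not SAGE's code: constraints are represented as hypothetical (name, outcome) pairs, no solver is called, and the `bound` bookkeeping used to avoid re-negating constraints that an ancestor already flipped is this sketch's own choice.

```python
def expand(path_constraint, bound=0):
    """Return the children of one parent run.

    For every index i >= bound, keep the constraints before i unchanged and
    negate constraint i. Each child carries the bound for its own expansion.
    """
    children = []
    for i in range(bound, len(path_constraint)):
        name, outcome = path_constraint[i]
        child = path_constraint[:i] + [(name, not outcome)]
        children.append((child, i + 1))
    return children

parent = [("c1", True), ("c2", False), ("c3", True)]
for child, child_bound in expand(parent):
    print(child, "next bound:", child_bound)
```

One symbolic execution of the parent yields up to one child per constraint, which is how a single expensive run is leveraged into many new test inputs; a depth-first search would instead follow only one of them at a time.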

Page 19: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

SAGE

• SAGE found many new expensive security bugs in Windows applications
  – Cost of each Microsoft Security Bulletin: $Millions
  – Cost due to worms (Slammer, CodeRed, Blaster, etc.): $Billions
• Apps: image processors, media players, file decoders, …
• Many bugs triaged as “security critical, severity 1, priority 1” (would trigger Microsoft security bulletin if known outside MS)
• Bugs missed by black-box fuzzers or static analysis
• Used daily in various Microsoft groups

Page 20: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

CUTE, jCUTE, CREST, PEX

• CUTE (for C) and jCUTE (for Java)
  – Extend DART to handle multi-threaded programs with dynamic data structures
  – Pointer constraints and dynamic partial order reduction
• CREST is a new extensible open source tool that performs dynamic testing for C
• PEX is Microsoft’s dynamic testing tool for .NET code
• Many, many other tools …

Page 21: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

PEX

• Pex is a Visual Studio 2010 Power Tool
  – http://msdn.microsoft.com/en-us/vstudio/bb980963.aspx
  – Power Tools are a set of enhancements, tools and command-line utilities
• Used by several groups within Microsoft
• Externally, available under academic and commercial licenses
• Downloaded > 40,000 times
• Anyone can try out Pex in the browser
  – http://pexforfun.com
  – > 250,000 programs analyzed within the first 5 months of the launch of the website

Page 22: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

EXE and KLEE

Symbolic execution tools for C:
• Perform mixed symbolic/concrete execution
• Model memory with bit-level accuracy
  – Systems code often treats memory as untyped bytes and observes a single memory location in multiple ways
• Employ various constraint-solver optimizations, in addition to those implemented in the STP solver:
  – Irrelevant constraint elimination, counterexample (cex) caching, etc.
• Use search heuristics to get high coverage
• Can interact with the external environment (KLEE)
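
Irrelevant constraint elimination can be illustrated with a short Python sketch. It assumes the common formulation: before calling the solver on a branch query, drop every path constraint that shares no variable, directly or transitively, with the query. The function name and the (label, variable set) encoding are hypothetical and are not taken from EXE or KLEE.

```python
def relevant_constraints(constraints, query_vars):
    """Keep only constraints transitively connected to the query's variables.

    constraints: list of (label, set of variable names), in path order.
    query_vars:  variable names appearing in the branch query.
    """
    relevant_vars = set(query_vars)
    keep = set()
    changed = True
    while changed:
        changed = False
        for index, (_, variables) in enumerate(constraints):
            if index not in keep and variables & relevant_vars:
                keep.add(index)
                relevant_vars |= variables   # may pull in more constraints
                changed = True
    return [constraints[i][0] for i in sorted(keep)]

path_condition = [
    ("x + y > 10", {"x", "y"}),
    ("z < 5",      {"z"}),
    ("y == 2 * w", {"y", "w"}),
    ("z != v",     {"z", "v"}),
]
print(relevant_constraints(path_condition, {"x"}))
```

For a query on `x`, only the constraints over `x`, `y` and `w` reach the solver; the two constraints over `z` and `v` cannot affect the answer and are skipped, which keeps solver queries small on long paths.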

Page 23: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

EXE and KLEE

Targeted at low-level systems code. Found bugs (including security vulnerabilities) in:

  UNIX file systems      ext2, ext3, JFS
  UNIX utilities         Coreutils, Busybox, Minix
  MINIX device drivers   pci, lance, sb16
  Library code           PCRE, uClibc, Pintos
  Packet filters         FreeBSD BPF, Linux BPF
  Networking servers     udhcpd, Bonjour, Avahi, WsMp3
  Operating systems      HiStar kernel
  Computer vision code   OpenCV

Page 24: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

KLEE

• Open-sourced in June 2009
• Lots of different users, from both academia and industry
  – 170 members on the mailing list (May 2011)
• Extended in many interesting ways by several research groups in the areas of:
  – wireless sensor networks
  – schedule memoization in multithreaded code
  – automated debugging
  – online gaming
  – exploit generation, etc.

http://klee.llvm.org

Page 25: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Challenges

• Scalability
  – Compositional techniques [Godefroid, POPL’07]
  – Pruning redundant paths [Boonstoppel et al, TACAS’08]
  – Heuristic search [Burnim & Sen, ASE’08] [Majumdar & Sen, ICSE’07]
  – Parallel techniques [Siddiqui & Khurshid, ICSTE’10] [Staats & Pasareanu, ISSTA’10]
  – Incremental techniques [Person et al, PLDI’11]
• Complex non-linear mathematical constraints
  – Undecidable or hard to solve
  – Heuristic solving [Lakhotia et al., ICTSS’10] [Souza et al, NFM’11]
• Testing web applications and security problems
  – String constraints [Bjorner et al, 2009] …
  – Mixed numeric and string constraints
• Not covered: Symbolic execution for formal verification [Coen-Porisini et al, ESEC/FSE’01], [Dillon, ACM TOPLAS’90], [Harrison & Kemmerer’88] …

Page 26: Symbolic Execution for Software Testing in Practice – Preliminary Assessment

Thank you!

