Realization of Solver-Based Techniques for Dynamic Software Verification
Andreas S. Scherbakov, Intel Corporation
What’s program testing here?
• The problem: to test a program means
• Find at least one set of input values such that
  – a crash or an illegal operation occurs, or
  – some user-defined property is violated (unexpected results/behaviour)
or
• Prove correctness of the program
  – or at least demonstrate that it is correct with high probability
SW testing: basic approaches
• Random testing
  -> You execute your program repeatedly with random input values.
  + covers a lot of unpredictable cases
  ─ too many redundant iterations -> runs out of resources
• “Traditional” testing: custom test suites
  -> You know your code, so you can create the examples needed to test it?
  + targets known critical points
  ─ misses most unusual use cases
  ─ large effort, requires intimate knowledge of the code
• Directed testing
  -> Try to get a significantly different run on each attempt.
  + explores execution alternatives rapidly
  + effective for mixed whitebox/blackbox code
  ─ usually needs some collateral code
  ─ consumes large resources if poorly optimized
SW testing: basic approaches - 2
• Static Analysis
  Commercial tools: Coverity, Klocwork, …
  ─ Finds dumb bugs, not application logic errors
  ─ Finds some “false positive” bugs, misses many real bugs
  + Good performance
  + Little expertise required
• Model Checking
  Academic and competitor tools: BLAST, CBMC, SLAM/SDV
  + Finds application logic errors
  ─ Finds some “false positive” bugs, but doesn’t miss any real ones
  ─ Significant user expertise required
• Formal Verification
  Academic tools: HOL, Isabelle, …
  + Ultimate guarantee: proves conformance with specification
  ─ Scaling constraint is human effort, not machine time
  ─ Ultimate user expertise required: multiple FV PhDs
Directed Testing: as few runs as possible
• executes the program with two test cases: i=0 and i=5
• 100% branch coverage
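The code this slide refers to is not part of the text, so the following is only a hypothetical illustration of the claim. It assumes a function with a single branch on i (the name classify and the condition i == 5 are invented here) and shows that the two directed inputs 0 and 5 exercise both outcomes of that branch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function under test: one branch that a random int rarely hits. */
int classify(int i)
{
    if (i == 5) {
        return 1;   /* taken only for i == 5 */
    }
    return 0;       /* taken for every other input */
}

/* Runs the given test inputs and returns a bitmask of branch outcomes seen:
 * bit 0 = condition was false, bit 1 = condition was true. */
int covered_branches(const int *inputs, size_t n)
{
    assert(inputs != NULL);
    int mask = 0;
    for (size_t k = 0; k < n; ++k) {
        mask |= classify(inputs[k]) ? 2 : 1;
    }
    return mask;
}
```

Calling covered_branches with the inputs {0, 5} yields a mask with both bits set, whereas a random generator would need on the order of 2^32 attempts to hit i == 5 by chance.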
DART: Directed Automated Random Testing
• The main idea was proposed in: Patrice Godefroid, Nils Klarlund, and Koushik Sen. DART: Directed Automated Random Testing. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2005: 213-223.
• Depends on Satisfiability Modulo Theories (SMT) solvers
  -> SMT solvers are applications that can solve sets of equations and inequalities.
  A “theory” here means the methods related to some set of allowed data types and operations.
What does it check?
• It does not verify the correctness of the program unless you have expressed the meaning of correctness in the form of assertion checkers
  – It cannot infer what the ‘correct’ behavior of the program is
• What it does check
  – Users can add assumptions to limit the search space, and assertions (‘ensure’) to define the expected behavior.
  – Assertions are treated as (‘bad’) branches, so the test process will try to reach them, or formally verify that this is impossible.
  – ‘Built-in’ checks for crashes, division by zero, memory corruption
• Effective use requires some familiarity with the software under test.
Looking for a Bug in a Program
• A bug is like a snark
• A program is like a forest with many paths
• Source code is like a map of the forest

  Just the place for a Snark! I have said it twice:
  That alone should encourage the crew.
  Just the place for a Snark! I have said it thrice:
  What I tell you three times is true.
    The Hunting of the Snark, Lewis Carroll
Looking for a Snark in a Forest
Proof Rather than Snark Hunting
Forest searching can be a very effective way to show the presence of snarks, but is hopelessly inadequate for showing their absence.
The Humble Snark Hunter
• How can we prove there are no snarks in the forest?
  – Get a map of the forest
  – Find the area between trees
  – Assume a safe minimum diameter of a snark
  – If minimum snark diameter > maximum tree separation, there are no snarks in the forest
• The gold standard, but:
  – You need a formal model of the forest
  – A mathematician
  – Substantial effort
  – Only as good as your model of forests and snarks (are snarks really spherical?)
Snark Hunting Via Random Testing
• REPEAT
  – Walk through the forest with a coin.
  – On encountering a fork, toss the coin:
    • heads, go left
    • tails, go right
• UNTIL snark found or exhausted
• Easy to do: you don’t even need a map!
• But:
  – Very low probability of finding a snark
Traditional Snark Hunting
• Study the forest map and use your experience to choose the places where snarks are likely to hide.
• For each likely hiding place, write a sequence of “turn left”, “turn right” instructions that will take you there.
• REPEAT
  – Choose an unused instruction sequence
  – Walk through the forest following the instructions
• UNTIL snark found or all instructions used
• But…
  – Snarks are notoriously good at hiding where you don’t expect
Snark Hunting Via Static Coverage Analysis
• Get a map of the forest
• Have a computer calculate instruction sequences that go through all locations in the forest.
• REPEAT
  – Choose an unused instruction sequence
  – Walk through the forest following the instructions
• UNTIL snark found or enough of the forest covered
• But…
  – A lot of computing power is needed to calculate the paths
  – There will be a lot of paths
Effective Snark Hunting Without A Map
• Start with a blank map

  He had bought a large map representing the sea,
  Without the least vestige of land:
  And the crew were much pleased when they found it to be
  A map they could all understand.

• REPEAT
  – REPEAT
    • Walk through the forest with
      – a map (initially blank)
      – a sequence of instructions (initially blank)
    • Add each fork that you haven’t seen before to your map.
    • When encountering a fork:
      – If there is an unused instruction, follow it
      – Otherwise, toss a coin as in random testing
  – UNTIL you exit the forest
  – If there is a fork on your map with a branch not taken
    • Write a sequence of instructions that leads down such a branch
• UNTIL snark found, no untaken branches on map, or you’re tired
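The loop above can be sketched on a toy forest. This is a minimal illustration under strong assumptions, not the tool's algorithm: every walk passes exactly depth forks, every branch is feasible (so no solver is needed), the coin toss is replaced by "always go left" to keep the run reproducible, and the map is reduced to one flag per fork on the current path. The names hunt, snark_leaf and other_tried are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 16

/* Walks the forest repeatedly until the leaf snark_leaf is reached or no
 * untaken branch is left. A leaf is named by its left(0)/right(1) choices
 * read as a binary number. *runs receives the number of walks made. */
bool hunt(unsigned snark_leaf, int depth, int *runs)
{
    assert(depth > 0 && depth <= MAX_DEPTH && runs != NULL);
    int choice[MAX_DEPTH];        /* instruction sequence plus default choices */
    bool other_tried[MAX_DEPTH];  /* map: has the other branch of this fork been taken? */
    int n_instr = 0;              /* length of the prepared instruction sequence */
    *runs = 0;
    for (;;) {
        /* Beyond the prepared instructions, take the default branch (left). */
        for (int d = n_instr; d < depth; ++d) {
            choice[d] = 0;
            other_tried[d] = false;
        }
        unsigned leaf = 0;
        for (int d = 0; d < depth; ++d) {
            leaf = (leaf << 1) | (unsigned)choice[d];
        }
        ++*runs;
        if (leaf == snark_leaf) {
            return true;          /* snark found */
        }
        /* Find the deepest fork on this path with a branch not yet taken. */
        int d = depth - 1;
        while (d >= 0 && other_tried[d]) {
            --d;
        }
        if (d < 0) {
            return false;         /* no untaken branches on the map */
        }
        /* New instructions: same path up to fork d, then the other branch. */
        choice[d] ^= 1;
        other_tried[d] = true;
        n_instr = d + 1;
    }
}
```

In a real program the step "write a sequence of instructions" is where the constraint solver comes in: the instructions are input values, and a branch may turn out to be infeasible.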
Comparison of alternatives
[Figure: the approaches Static analysis, DART, Model checking, Formal Verification and Traditional testing placed on axes labelled Expertise/Effort and Accuracy]
How it Works
• f(x,y) run: 1
  – Arbitrary inputs:
    • x = 0
    • y = 9
  – (x > y) = false
  – x1 = x - 1 = -1
  – (x1 > y) = false

    void f (int x, int y) {
      if (x > y) {
        x = x + y;
        y = x - y - 3;
        x = x - y;
      }
      x = x - 1;
      if (x > y) {
        abort ();
      }
      return;
    }
How it Works
• f(x,y) run: 2
  – choose x, y so that
    • (x > y) = false
    • x1 = x - 1
    • (x1 > y) = true
  – no such x, y exist!
  (f is the function listed under run 1.)
How it Works
• f(x,y) run: 2
  – choose x, y so that
    • (x > y) = true
  – Inputs:
    • x = 9
    • y = 0
  – Symbolic trace of the taken branch, following the code of f:
    • x1 = x + y = 9
    • y1 = x1 - y - 3 = 6
    • x2 = x1 - y1 = 3
    • x3 = x2 - 1 = 2
  – (x3 > y1) = false
How it Works
• f(x,y) run: 3
  – choose x, y so that
    • (x > y) = true
    • x1 = x + y
    • y1 = x1 - y - 3
    • x2 = x1 - y1
    • x3 = x2 - 1
    • (x3 > y1) = true
  – Inputs:
    • x = 1
    • y = 0
  – Values on this run: x1 = 1, y1 = -2, x2 = 3, x3 = 2
  – (x3 > y1) = true: abort () is reached
The Program

    void snarky (int x, int y) {
      if (x > y) {
        x = x + y;
        y = x - y - 3;
        x = x - y;
      }
      x = x - 1;
      if (x > y) {
        abort ();
      }
    }

A Simple Test Harness

    int main () {
      const int x = choose_int ("x");
      const int y = choose_int ("y");
      snarky (x, y);
      return 0;
    }

• choose_int: instrumentation library routine
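To see which inputs the harness has to discover, the sketch below replaces the constraint solver with an exhaustive search over a small input range. It is only an illustration: snarky_would_abort and find_failing_input are names invented here, and the function reports the abort condition instead of calling abort () so that the search can continue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same arithmetic as snarky(), but returns whether abort() would be reached. */
bool snarky_would_abort(int x, int y)
{
    if (x > y) {
        x = x + y;
        y = x - y - 3;
        x = x - y;
    }
    x = x - 1;
    return x > y;
}

/* Hypothetical stand-in for the solver: tries every pair in [lo, hi] x [lo, hi]
 * and returns the first one that reaches the abort branch. */
bool find_failing_input(int lo, int hi, int *out_x, int *out_y)
{
    assert(lo <= hi && out_x != NULL && out_y != NULL);
    for (int x = lo; x <= hi; ++x) {
        for (int y = lo; y <= hi; ++y) {
            if (snarky_would_abort(x, y)) {
                *out_x = x;
                *out_y = y;
                return true;
            }
        }
    }
    return false;
}
```

Working the assignments through by hand, the abort is reached exactly when y < x < y + 5 (ignoring overflow). A random pair of ints rarely falls into that band, which is why the directed approach pays off here.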
Quick Example

    void
    string_copy (const char *s, char *t) {
      int i;
      for (i = 0; s[i] != '\0'; ++i) {
        t[i] = s[i];
      }
    }

    int
    string_equal (const char *s, const char *t) {
      int i = 0;
      while (s[i] != '\0' && s[i] == t[i]) {
        ++i;
      }
      return s[i] == t[i];  /* the listing is cut off after the loop; this final comparison is assumed */
    }

    int main () {
      const size_t source_length = choose_size_atmost (…);
      const char *source = choose_valid_string (…);
      const size_t target_size = choose_size_atleast (…);
      const char *target = choose_valid_char_array (…);

      string_copy (source, target);

      ensure (string_equal (source, target));

      return 0;
    }
Quick example: Bug found

Bug found with the parameters:
    target_size = 1
    target[0] = 1
    source_length = 0
(Killed by signal)
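One reading of this report: string_copy never writes the terminating '\0', so with an empty source the target keeps its old first byte (1) and the comparison fails. The sketch below shows a possible fix under that reading. The name string_copy_terminated is invented here, and string_equal is assumed to end by comparing the characters at the position where its loop stopped.

```c
#include <assert.h>
#include <stddef.h>

/* Copies s into t including the terminating '\0'.
 * The caller must provide room for strlen(s) + 1 bytes. */
void string_copy_terminated(const char *s, char *t)
{
    assert(s != NULL && t != NULL);
    size_t i = 0;
    for (; s[i] != '\0'; ++i) {
        t[i] = s[i];
    }
    t[i] = '\0';   /* the step missing from string_copy */
}

/* Returns nonzero if both strings have the same contents and length. */
int string_equal(const char *s, const char *t)
{
    assert(s != NULL && t != NULL);
    size_t i = 0;
    while (s[i] != '\0' && s[i] == t[i]) {
        ++i;
    }
    return s[i] == t[i];
}
```

With the reported parameters (empty source, one-byte target holding 1), the fixed copy overwrites target[0] with '\0' and the ensure condition holds.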
Overall Design
• Harness Library
  – Supply specified values for inputs, or arbitrary values
  – Check required/ensured constraints
• Instrumentation
  – Modify a C program to produce an execution trace while performing the required execution
• Observed Execution
  – Observe the path taken by a run and calculate a predicate describing a new path
• Constraint Solver
  – Solver used to discover, for a specified path condition,
    • whether the path is feasible
    • inputs that would cause it to be executed
Testing Time
[Figure: chart of testing time; one axis is marked from 0 to 90, the other from 1 to 8]
• Don’t expect to test all paths for realistically sized data
• You can, however, run many useful tests quickly
You Provide The Controllability
• For each “unit” you write
  – A harness to call the unit’s functions
  – Stubs for functions the unit calls
• The harness library provides functions to generate values
  – For harnesses to call with
  – For stubs to return with
  – Declarative specification of constraints on the values
• This provides
  – A model of the unit’s environment
  – Controllability over the unit
[Figure: Harness code, Code under test, Stub code]
Front End: Instrumentation

Why do we track symbolic data?

    if (x == y + 3) {
      /* branch A */
    } else {
      /* branch B */
    }

To choose a given branch, we need to solve: (x == y + 3) == false/true
We want to be able to choose another branch on the next run.
To pass it to the solver, we need the expression x == y + 3 in symbolic form at the if.
To know it at this point, we have to track the assignments of its constituent components.
Tracing symbolic data
• Solution: add special tracing statements to source statements

    x = y*z;
  becomes
    x = y*z;
    tmp = trace_multiplication(VAR_Y, VAR_Z);
    trace_assign(VAR_X, tmp);

    x = y[i];
  becomes
    x = y[i];
    tmp = trace_array_element(VAR_Y, VAR_I);
    trace_assign(VAR_X, tmp);
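What the tracing calls might do behind the scenes can be sketched as follows. This is a hypothetical implementation, not the tool's: it keeps one expression string per variable, the helper names trace_input and symbolic_value are invented here, and the real routines presumably build solver terms rather than text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_VARS 8
#define EXPR_LEN 128

enum { VAR_X, VAR_Y, VAR_Z };

/* Symbolic store: the current expression of each variable over the inputs. */
static char sym[MAX_VARS][EXPR_LEN];
static char tmp_expr[EXPR_LEN];

/* Marks a variable as a symbolic input with the given name. */
void trace_input(int var, const char *name)
{
    assert(var >= 0 && var < MAX_VARS && strlen(name) < EXPR_LEN);
    snprintf(sym[var], EXPR_LEN, "%s", name);
}

/* Builds the expression for a * b from the operands' current expressions. */
const char *trace_multiplication(int a, int b)
{
    assert(a >= 0 && a < MAX_VARS && b >= 0 && b < MAX_VARS);
    snprintf(tmp_expr, EXPR_LEN, "(%s * %s)", sym[a], sym[b]);
    return tmp_expr;
}

/* Records that dst now holds the given expression. */
void trace_assign(int dst, const char *expr)
{
    assert(dst >= 0 && dst < MAX_VARS && expr != sym[dst]);
    snprintf(sym[dst], EXPR_LEN, "%s", expr);
}

const char *symbolic_value(int var)
{
    assert(var >= 0 && var < MAX_VARS);
    return sym[var];
}
```

After x = y*z followed by y = x*z, the store holds y as an expression over the original inputs only, which is the form a later branch condition on y has to be in before it can be handed to the solver.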
CIL
• “CIL (C Intermediate Language) is a high-level representation along with a set of tools that permit easy analysis and source-to-source transformation of C programs.” http://www.cs.berkeley.edu/~necula/cil/
• CIL enables a user application to explore and refactor various types of C source constructs (functions, blocks, statements, instructions, expressions, variables, etc.) in a convenient way while keeping the remaining code structure.
Tool Framework
[Figure: tool flow with the blocks User written harness, Software under test, CIL Instrumentation, Instrumented Program, Run, Scoreboard (track coverage), Input Generator and SMT Solver; regions are labelled Frontend, Backend and User Input]
Problem: the CIL-based frontend does not support C++
Solution: replace the CIL-based frontend with LLVM to support C++
How CIL simplifies handling the code

• Automatically rewrites C expressions with side effects:
    a = b += --c;   --->   c = c-1; b = b+c; a = b;
• Uniformly represents memory references: (base+offset)
• Converts do, for, while loops to

    while (1) {
      if (cond1) break;     /* if needed */
      if (cond2) continue;  /* if needed */
      body;
    }

• Traces control flow
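The two rewrites can be checked by hand on small functions. The code below is a hand-written illustration of the normal forms described above, not actual CIL output; all function names are invented for the example.

```c
#include <assert.h>

/* Compound expression with side effects, as written by the programmer. */
int compound(int b, int c)
{
    int a;
    a = b += --c;
    return a;
}

/* The same computation split into one side effect per statement. */
int compound_rewritten(int b, int c)
{
    int a;
    c = c - 1;
    b = b + c;
    a = b;
    return a;
}

/* Original loop: sum of the odd numbers below n. */
int sum_odd_for(int n)
{
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        if (i % 2 == 0) continue;
        sum += i;
    }
    return sum;
}

/* Same loop in the while (1) form with explicit break and continue. */
int sum_odd_normalized(int n)
{
    int sum = 0;
    int i = 0;
    while (1) {
        if (!(i < n)) break;                /* loop condition as an explicit exit */
        if (i % 2 == 0) { ++i; continue; }  /* the increment must run before continue */
        sum += i;
        ++i;
    }
    return sum;
}
```

With a single loop shape and one side effect per statement, the instrumentation only has to handle a few statement kinds when it inserts tracing calls.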
What is LLVM?

• LLVM: Low Level Virtual Machine
• Modular and reusable collection of libraries
• Developed at UIUC and at Apple®
• LLVM Intermediate Representation (IR) is well designed and documented.
• Has a production-quality C++ frontend that is compatible with GCC
• Open source with an industry-friendly license.
• More info at www.llvm.org
LLVM Based Frontend
[Figure: blocks User written harness, Software under test, Clang C/C++ Parser, LLVM IR, Compiler Pass, Rest of Compile, Instrumented Program and Backend; one region is labelled LLVM frontend]
LLVM provides modular libraries and tool infrastructure to develop compiler passes
Using C++ overloads
• Idea: redefine operators in such a way that they output trace data:

    my_int operator + (my_int x, my_int y) {
      symbolic s = trace_addition(x.symbol(), y.symbol());
      int c = x.val() + y.val();
      return my_int (s, c);
    }

• Instrumentation is still needed (control tracing, types, ...)
Reducing branches
• This 2-branch control:
    if (x && y) action1; else action2;
  really produces 3 branches in C/C++:

    if (x) {
      if (y) action1; else action2;
    }
    else action2;

• x && y is not really a logical “and”.
  – We cannot simply supply (x && y) to an SMT solver.
Reducing branches: solution
• But sometimes it IS a logical “and”
  – Namely, if y can be safely evaluated when x == false, or y cannot be safely evaluated for any value of x
  which means
  – y has no side effects
  and
  – the crash conditions of y don’t depend on x
• If we can prove this statically, use the form:
    if (logical_and(x,y)) action1; else action2;
• Otherwise, use the 3-branch form
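The distinction can be made concrete with two conditions, one where the merge is safe and one where it is not. In this sketch logical_and is modelled as an ordinary function whose arguments are both evaluated before the call; that is an assumption about the construct, and the other names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Eager "and": both operands have already been evaluated by the caller. */
static bool logical_and(bool x, bool y)
{
    return x && y;
}

/* Safe to merge: b > 0 has no side effects and cannot crash, whatever a is. */
int both_positive_merged(int a, int b)
{
    if (logical_and(a > 0, b > 0)) return 1; else return 0;
}

/* The 3-branch form the compiler produces for a > 0 && b > 0. */
int both_positive_three_branch(int a, int b)
{
    if (a > 0) {
        if (b > 0) return 1; else return 0;
    }
    else return 0;
}

/* Not safe to merge: evaluating *p when p == NULL would crash, so the crash
 * condition of the second operand depends on the first. Keep the 3-branch form. */
int deref_positive(const int *p)
{
    if (p != NULL && *p > 0) return 1; else return 0;
}
```

For the first pair of functions the results agree on all inputs, so the solver can be given one condition instead of two nested ones. Passing p != NULL and *p > 0 to an eager logical_and would dereference a null pointer.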
Solver Theories
• Different solver theories
  – Linear Integer Arithmetic: (a*x + b*y + …) {>, <, =} C
  – Linear Real Arithmetic
  – BitVector Arithmetic
• Most conditions in C source code fit one of them, but some mixed or complex ones don’t
  – alas, we sometimes have to fall back on random alternation
  – luckily, theories are being developed actively
• Need to recognize theory patterns for better performance
  -> Sometimes the supported scope is wider than the declared theory scope
Path exploration strategy

• Usually we explore all paths in Depth First Search mode:
  – alternate the deeper branches first
  – when complete, return one level up and try again
• But the execution path count may turn out to be too high to explore all of them
Path exploration strategy - 2
• If we lack the resources to explore all paths, DFS is not the best strategy: some nodes are never visited while others are explored exhaustively
  - low coverage
  - most dumb bugs may be missed
• Principle of a good strategy: first visit new nodes, then explore new paths
  - Details are subject to research
[Figure: execution tree with explored and unexplored nodes]
An optimization: Get function properties
• Idea: take advantage of the code hierarchy by using I/O properties of function/procedure calls
  -> try to proceed with the assumptions only, rather than descending into the subroutine body
  Example: y = string_copy (x)
    require  valid_pointer(x)
    property valid_pointer(y)   /* assuming memory is still available */
    property length(x) == length(y)
    property i < length(x) => y[i] == x[i]
• For black box (external library) code, assumptions should be supplied as collaterals
• For available source code, they can also be extracted automatically
  -> but what to extract is an open question

    if (length(s) > 2) {
      p = string_copy(s);
      if (length(p) > 1) {
      } else {
        do_something();
      }
    }

  With the call replaced by its properties:

    if (length(s) > 2) {
      p = ???;
      assume length(p) == length(s);
      if (length(p) > 1) {
      } else {
        /* length(p) <= 1 && length(p) == length(s) && length(s) > 2: infeasible */
      }
    }
An optimization: Separate independent alternations

    if (z == 2) {
      x = b;
      do_something1();
    }
    if (y == x) {
      do_something2();
    }

Dependent choices
We should try 2*2 combinations:
• z=2, y=b
• z=2, y≠b
• z≠2, y=x
• z≠2, y≠x
(all variables are sampled at the beginning of the code fragment shown)
Separate independent alternations - 2

    if (z == 2) {
      q = b;
    }
    if (y == x) {
      p = c;
    }

Independent choices
We need to try only 2 combinations, for example:
• z=2, y=x
• z≠2, y≠x
(provided that the effects of the two branch bodies, such as do_something1() and do_something2(), don’t depend on each other)
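A small coverage count shows why two runs are enough when the choices are independent. The sketch records the outcome of each of the two ifs as a bit; the function names and the concrete input values are chosen only for this illustration.

```c
#include <assert.h>

/* One bit per branch outcome of the two independent ifs. */
enum {
    Z_TRUE = 1, Z_FALSE = 2,     /* outcomes of (z == 2) */
    YX_TRUE = 4, YX_FALSE = 8,   /* outcomes of (y == x) */
    ALL_OUTCOMES = 15
};

/* Returns the branch outcomes exercised by one run with the given inputs. */
int run_once(int z, int y, int x)
{
    int seen = 0;
    seen |= (z == 2) ? Z_TRUE : Z_FALSE;    /* first if: q = b */
    seen |= (y == x) ? YX_TRUE : YX_FALSE;  /* second if: p = c */
    return seen;
}

/* Two runs that flip both conditions at once cover all four outcomes. */
int coverage_of_two_runs(void)
{
    int seen = run_once(2, 7, 7);   /* z == 2, y == x */
    seen |= run_once(3, 7, 8);      /* z != 2, y != x */
    return seen;
}
```

In general, n independent two-way choices need 2 runs instead of 2^n, as long as only the outcome of each branch matters and not the combination.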
Separate independent alternations - 3

    if (z == 2) {
      q = b;
    }
    if (y == x) {
      p = c;
    }
    if (q == p) …

Dependent choices again!
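An optimization: re-using unsatisfied conditions

    if (a && b && c) {
      …
    }
    if (a && b && c) {
      …
    }
    if (a && b && c && d && e) {
      …
    }

• Suppose we have proved that a && b && c cannot be satisfied, so the first body cannot be reached
• Then we can be sure that the later bodies, guarded by the same or a stronger condition, cannot be reached either
• No need to call the solver again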
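One simple way to implement this reuse is a cache of conjunctions already proved unsatisfiable. The sketch below is an assumption about how such a cache could look, not the tool's data structure: each atomic condition gets one bit, a conjunction is a bitmask, and all names are invented here. It also assumes that the atoms mean the same thing at every program point where they are compared, that is, the variables involved are not modified in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_UNSAT 16

/* A conjunction is a bitmask: bit k set means "atomic condition k must hold". */
typedef struct {
    unsigned cores[MAX_UNSAT];
    int count;
} unsat_cache;

void unsat_cache_init(unsat_cache *c)
{
    assert(c != NULL);
    c->count = 0;
}

/* Remembers a conjunction the solver has proved unsatisfiable.
 * Returns false if the cache is full. */
bool unsat_cache_add(unsat_cache *c, unsigned conjunction)
{
    assert(c != NULL);
    if (c->count == MAX_UNSAT) {
        return false;
    }
    c->cores[c->count++] = conjunction;
    return true;
}

/* A conjunction containing every atom of a known-unsatisfiable one is
 * unsatisfiable too, so the solver call can be skipped. */
bool known_unsat(const unsat_cache *c, unsigned conjunction)
{
    assert(c != NULL);
    for (int i = 0; i < c->count; ++i) {
        if ((conjunction & c->cores[i]) == c->cores[i]) {
            return true;
        }
    }
    return false;
}
```

With a, b, c, d, e mapped to bits 0 to 4, storing a && b && c once answers the second and third if without another solver call, while a && b alone still has to be checked.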
Handling Black Boxed Code

Contents
• Motivation
• Losing control with black boxes
• Return Value Representation
• Randomization
• Characterizing
• Learning
• Stubbing/Wrapping
• Example: The encryption problem
• Selective/Dynamic Black-Boxing
• Embedded White-Boxes
• Afterwords
Motivation

• Testing a portion of code within a large system, e.g.:
  – Code over infrastructure/library functions
  – Firmware over hardware API/Virtual Platform
  – Binary infrastructure
• Hiding code the solver can’t cope with
  – Nonlinear arithmetic (a*b = C)
  – Assembly
• Handling deep paths/recursion
Losing Control with Black Boxes
• Black boxes impair our controllability when program paths are influenced by black-box outputs.
• We have no information to pick “a” such that it drives (b > 10) in both directions.

    int a = choose_int("a");
    int b = bb(a);
    if (b > 10) { … } else { … }
Return Value Representation
• The flow treats the return value of an uninstrumented function as concrete only (not symbolic).
• But it can be explicitly assigned a fresh symbolic variable with fresh_*
• The reverse can be done as well with concrete_* (see later).

    int a = blackboxed_func(…);  // a is concrete
    fresh_integer(a, "a");       // a is symbolic
Example: The Encryption Problem

    ulong x = choose_uint ("x%d", count);
    ulong y = choose_uint ("y%d", count);
    if (y == encrypt(x)) <…>;
    else <…>;

• This is a pathological case: we cannot guess x and y beforehand such that y == encrypt(x).
• We cope with it by:
  – Running once with x = y = 0: the condition fails.
  – “Seeing” the condition as (y == <concrete encrypt(0)>)
  – Choosing x = 0, y = encrypt(0) for the 2nd run.
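The two-run trick can be replayed with a toy black box. Everything in this sketch is illustrative: toy_encrypt is an arbitrary mixing function standing in for a real encrypt(), and run and reach_true_branch are invented names. The point is only that the second run reuses the concrete value observed in the first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an opaque encrypt(): not invertible by inspection. */
uint32_t toy_encrypt(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x45d9f3bu;
    x ^= x >> 16;
    return x + 0x9e3779b9u;
}

typedef struct {
    uint32_t x;
    uint32_t y;
} inputs;

/* One run of the program under test. Reports which branch was taken and
 * records the concrete value the black box returned. */
bool run(inputs in, uint32_t *observed)
{
    assert(observed != NULL);
    *observed = toy_encrypt(in.x);
    return in.y == *observed;
}

/* Run 1 uses x = y = 0. If the true branch is missed, run 2 keeps x and
 * sets y to the value observed in run 1. */
bool reach_true_branch(inputs *found)
{
    assert(found != NULL);
    inputs in = { 0, 0 };
    uint32_t observed = 0;
    if (!run(in, &observed)) {
        in.y = observed;    /* solve y == <concrete encrypt(0)> */
        if (!run(in, &observed)) {
            return false;
        }
    }
    *found = in;
    return true;
}
```

This works because y is a free input that appears on its own side of the comparison. If the condition were 42 == encrypt(x), the concrete value would not help and the other techniques for black boxes would be needed.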
Randomization
• We can increase our chances of gaining coverage by adding randomization

    int a = choose_random_int("a");
    int b = bb(a);
    if (b > 10) { … } else { … }
Characterizing: assert
• Our first step to regain control is having the user tell us something about the function.
• A new construct is added: assert(<cond>)
Reminder:
  – require(<cond>): Assume <cond> holds. If it doesn’t, ignore the current path and move on. This is actually a branch equivalent to
      if (!cond) exit(0);
  – ensure(<cond>): Make sure <cond> holds. If it doesn’t, stop execution and report it. If it does, try to make it fail. This is actually a branch equivalent to
      if (!cond) abort();
assert – 2
• E.g.: a strictly monotonic black-boxed function.
• assert(<cond>): Assume <cond> holds. If it doesn’t, stop execution and report. But don’t try to make it fail.
• Must be used with fresh_*
• Full characterization: solves the problem, but is impractical.
• Partial characterization:
  – May help the solver, depending on its internal heuristics.
  – The more assertions the better.

    int a = choose_int("a");
    int b = bb(a);
    fresh_integer(b, "b");
    assert(b > a);
    if (b > 10) { … } else { … }
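The difference between the three constructs can be summarised as a small decision table. The sketch below encodes the descriptions above. The type and function names are invented, and one point is an assumption: the slides do not say whether the tool tries to falsify a require condition, so the sketch treats only ensure as a target.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { CHECK_REQUIRE, CHECK_ENSURE, CHECK_ASSERT } check_kind;
typedef enum { RUN_CONTINUES, PATH_IGNORED, FAILURE_REPORTED } check_outcome;

/* What happens in the current run when the checked condition evaluates to cond. */
check_outcome check_result(check_kind kind, bool cond)
{
    /* C's own assert from <assert.h>, not the tool construct of the same name. */
    assert(kind == CHECK_REQUIRE || kind == CHECK_ENSURE || kind == CHECK_ASSERT);
    if (cond) {
        return RUN_CONTINUES;
    }
    /* require: the path is silently dropped; ensure and assert: stop and report. */
    return kind == CHECK_REQUIRE ? PATH_IGNORED : FAILURE_REPORTED;
}

/* Whether later runs should look for inputs that make the condition false.
 * Assumption: only ensure is a search target. */
bool tool_tries_to_falsify(check_kind kind)
{
    return kind == CHECK_ENSURE;
}
```

So the tool's assert reports like ensure when the condition fails, but it constrains the search like require: it tells the solver something about the black box instead of giving it a new goal.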
Learning
• Learn from concrete inputs and outputs of a black-boxed function, and use them in future runs.
• But what to learn? A function is not always deterministic: it may have implicit “inputs” and “outputs” or internal state.
• Instead of learning functions, we learn a “subject” in many lessons. Each lesson can have multiple inputs and outputs.
Learning - 2

    int a = choose_int("a");
    lesson l = begin_lesson ();
    learn_integer(l, a, LEARN_INPUT);
    int b = bb(a);
    fresh_integer(b, "b");
    learn_integer(l, b, LEARN_OUTPUT);
    end_lesson(l);
    if (b > 10) { … } else { … }
Learning - 3.
• When it misses a path, it can retry it several times and learn new concrete values on each try.
• Previous lessons add constraints to the solver, so previous inputs will not be reused when trying to get different outputs.
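The retry loop can be sketched with a table of learned input/output pairs. This is a simplified model, not the tool's implementation: the black box bb is a toy function, the "solver" is a linear scan for an input that has not been tried yet, and all type and function names are invented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LESSONS 32

typedef struct {
    int input;
    int output;
} lesson_record;

/* Everything learned so far about one black-boxed function. */
typedef struct {
    lesson_record seen[MAX_LESSONS];
    int count;
} subject;

/* Toy black box: opaque to the tool, only its concrete results are visible. */
static int bb(int a)
{
    return a * a - 20;
}

/* Stores one concrete input/output pair. Returns false if the table is full. */
bool learn(subject *s, int input, int output)
{
    assert(s != NULL);
    if (s->count == MAX_LESSONS) {
        return false;
    }
    s->seen[s->count].input = input;
    s->seen[s->count].output = output;
    ++s->count;
    return true;
}

bool already_tried(const subject *s, int input)
{
    assert(s != NULL);
    for (int i = 0; i < s->count; ++i) {
        if (s->seen[i].input == input) {
            return true;
        }
    }
    return false;
}

/* Retries with inputs not used before until bb's output exceeds threshold. */
bool drive_above(subject *s, int threshold, int max_tries, int *found_input)
{
    assert(s != NULL && found_input != NULL);
    int candidate = 0;
    for (int t = 0; t < max_tries; ++t) {
        while (already_tried(s, candidate)) {
            ++candidate;            /* learned constraint: input differs from earlier ones */
        }
        int out = bb(candidate);
        if (!learn(s, candidate, out)) {
            return false;
        }
        if (out > threshold) {
            *found_input = candidate;
            return true;
        }
    }
    return false;
}
```

Each failed try adds one lesson, so the search never repeats an input. A real solver could also use the learned outputs, for example to guess a trend, which this sketch does not attempt.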
Learning – 4.
• The “subject” is supposed to be common to all invocations of a black-boxed function, but differs from function to function.
• To make this easier, begin_lesson() implicitly creates a unique subject from the code location.
• Problem: the function is invoked from different places in the code, and we want to write the learning code only once.
• Solution: we shall see later how to write wrappers that divert all calls to one place.
Stubbing / Wrapping
• The user writes a wrapper that adds characterization and learning sugar to all invocations of a function:

    int bb_wrap(int a)
    {
      lesson l = begin_lesson ();
      learn_integer(l, a, LEARN_INPUT);
      int b = bb(a);
      fresh_integer(b, "b");
      learn_integer(l, b, LEARN_OUTPUT);
      end_lesson(l);
      assert(b > a);
      return b;   /* return the black box's result, not the input */
    }
Stubbing / Wrapping – 2.
• Now we want to call the wrapper bb_wrap instead of bb.
• But we don’t want to manually change all invocations in our code.
• instrument does it for us:
    instrument -stub bb:bb_wrap -stub … -stub …
• Conveniently, it won’t replace calls within the stub code itself.
Selective Blackboxing
• We can selectively blackbox instrumented code:

    begin_blackbox(true);
    int x = 10;
    end_box();
    concrete_integer(x);

• We must use fresh_* or concrete_* on values that were defined/modified inside the blackbox and are visible outside of it.
• Otherwise the tool might consider them uninitialized, or miscalculate paths.
Dynamic Blackboxing - 1
• Sometimes we just have too many paths:

    char* s = choose_valid_string("str", 100);
    int count_a = 0;
    for (int i = 0; i < 100; i++) {
      if (s[i] == 'a') {
        count_a++;
      }
    }

• Even with the string length fixed at 100, it will see 2^100 paths…
Dynamic Black-boxing – 2
• We can dynamically hide code:

    char* s = choose_valid_string("str", 100);
    int count_a = 0;
    for (int i = 0; i < 100; i++) {
      begin_blackbox(i > 2);
      if (s[i] == 'a') {
        count_a++;
      }
      end_box();
    }

• Now XXX sees only 2^3 paths (the branches for i = 0, 1, 2 stay visible).
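The effect on the path count can be computed directly from the predicate passed to begin_blackbox. In this sketch the helper names are invented, and it assumes that each visible iteration contributes one independent two-way branch. With the predicate i > 2, the iterations i = 0, 1, 2 stay visible, which gives 2^3 = 8 paths.

```c
#include <assert.h>
#include <stdbool.h>

/* Predicate passed to begin_blackbox() in the loop: true means "hide". */
static bool hidden(int i)
{
    return i > 2;
}

/* Number of loop iterations whose branch stays visible to the tool. */
int visible_branches(int iterations)
{
    assert(iterations >= 0);
    int n = 0;
    for (int i = 0; i < iterations; ++i) {
        if (!hidden(i)) {
            ++n;
        }
    }
    return n;
}

/* Each visible two-way branch doubles the number of paths. */
unsigned long long path_count(int visible)
{
    assert(visible >= 0 && visible < 64);
    return 1ULL << visible;
}
```

Changing the predicate moves the cut-off: hiding everything from i > 9 on would leave 2^10 paths, which is still easy to enumerate.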
Embedded White-boxes
• What if we want to look into some code that is called from black-boxed code? E.g. a callback:

    void callback(int x) { … }  // we want to see this code

    int main ()
    {
      int x = choose_uint("x");
      bb_foo(x, callback);  // bb_foo is blackboxed
      return 0;
    }
Embedded White-boxes - 2.
• On the “inside” we “whitebox” it explicitly, and freshen (or concretize) the inputs:

    void callback(int x)
    {
      static int count = 0;
      begin_whitebox(true);
      fresh_integer(x, "new_x%d", count);
      end_box();
      ++count;
    }  // we want to see this code
Afterwords
• We have established a “Swiss army knife” of features to support future black-box challenges.
• This helps overcome some simple synthetic examples.
• Since loss of controllability is a hard problem in general, we expect to have to refine this set of features as we hit real-life test cases.