Realization of solver based techniques for Dynamic Software Verification

Andreas S Scherbakov, Intel Corporation

What’s program testing here?
• The problem: to test a program means
• Find at least one set of input values such that
– a crash or an illegal operation occurs, or
– some user-defined property is violated (unexpected results/behaviour)
or
• Prove correctness of the program
– or at least demonstrate that it is correct with some high probability

SW testing: basic approaches
• Random testing
-> You execute your program repeatedly with random input values.
+ covers a lot of unpredictable cases
─ too many redundant iterations -> out of resources

• “Traditional” testing: custom test suites
-> You know your code and therefore you can create the necessary examples to test it?..
+ targets known critical points
─ misses most unusual use cases
─ large effort, requires intimate knowledge of the code

• Directed testing
-> Try to get a significantly different run on each attempt.
+ explores execution alternatives rapidly
+ effective for mixed whitebox/blackbox code
─ usually needs some collateral code
─ takes large resources if poorly optimized

SW testing: basic approaches - 2
• Static Analysis
Commercial tools: Coverity, Klocwork, …
─ Finds dumb bugs, not application logic errors
─ Finds some “false positive” bugs, misses many real bugs
+ Good performance
+ Little expertise required

• Model Checking
Academic and competitor tools: BLAST, CBMC, SLAM/SDV
+ Finds application logic errors
─ Finds some “false positive” bugs, but doesn’t miss any real ones
─ Significant user expertise required

• Formal Verification
Academic tools: HOL, Isabelle, …
+ Ultimate guarantee: proves conformance with specification
─ Scaling constraint is human effort, not machine time
─ Ultimate user expertise required: multiple FV PhDs

Directed Testing: as few runs as possible

• executes the program with two test cases: i=0 and i=5

• 100% branch coverage

DART: Directed Automated Random Testing

• The main idea was proposed in: Patrice Godefroid, Nils Klarlund, and Koushik Sen. DART: Directed Automated Random Testing. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2005: 213-223. http://doi.acm.org/10.1145/1065010.1065036

• Depends on Satisfiability Modulo Theories (SMT) solvers
-> SMT solvers are applications able to solve sets of equations.
-> A theory here means the methods related to some set of allowed data types/operands

What does it check?

• Does not verify the correctness of the program UNLESS you have expressed the meaning of CORRECTNESS in the form of ASSERTION CHECKERS
– Cannot infer what the ‘correct’ behavior of the program is

• What does it check
– Allows users to add assumptions to limit the search space, and assertions (‘ensure’) to define the expected behavior.
– Assertions are treated as (‘bad’) branches, so the test process will try to reach them, or formally verify that this is impossible.
– ‘Built in’ checks for crashes, division by 0, memory corruption

• Requires some familiarity with the software under test to be effective.

Looking for a Bug in a Program
• A bug is like a snark
• A program is like a forest with many paths
• Source code is like a map of the forest

Just the place for a Snark! I have said it twice:
That alone should encourage the crew.
Just the place for a Snark! I have said it thrice:
What I tell you three times is true.

The Hunting of the Snark, Lewis Carroll

Looking for a Snark in a Forest

Proof Rather than Snark Hunting

Forest searching can be a very effective way to show the presence of snarks, but is hopelessly inadequate for showing their absence.

The Humble Snark Hunter

• How can we prove there are no snarks in the forest?
– Get a map of the forest
– Find the area between trees
– Assume a safe minimum diameter of a snark
– If minimum snark diameter > minimum tree separation -> no snarks in forest

• The gold standard, but:
– You need a formal model of the forest
– A mathematician
– Substantial effort
– As good as your model of forests and snarks (are snarks really spherical?)

Snark Hunting Via Random Testing

• REPEAT
– Walk through the forest with a coin.
– On encountering a fork, toss the coin:
• heads, go left
• tails, go right
• UNTIL snark found or exhausted

• Easy to do: You don’t even need a map!
• But:
– Very low probability of finding a snark

Traditional Snark Hunting
• Study the forest map and use your experience to choose the places where snarks are likely to hide.
• For each likely hiding place, write a sequence of “turn left”, “turn right” instructions that will take you there.
• REPEAT
– Choose an unused instruction sequence
– Walk through the forest following the instructions
• UNTIL snark found or all instructions used

• But…
– Snarks are notoriously good at hiding where you don’t expect

Snark Hunting Via Static Coverage Analysis

• Get a map of the forest
• Have a computer calculate instruction sequences that go through all locations in the forest.
• REPEAT
– Choose an unused instruction sequence
– Walk through the forest following the instructions
• UNTIL snark found or enough of the forest covered

• But…
– A lot of computing power is needed to calculate the paths
– There will be a lot of paths

Effective Snark Hunting Without A Map
• Start with a blank map

He had bought a large map representing the sea,
Without the least vestige of land:
And the crew were much pleased when they found it to be
A map they could all understand.

• REPEAT
– REPEAT
• Walk through the forest with
– a map (initially blank)
– a sequence of instructions (initially blank)
• Add each fork that you haven’t seen before to your map.
• When encountering a fork:
– If there is an unused instruction, follow it
– Otherwise, toss a coin as in random testing
– UNTIL you exit the forest
• If there is a fork on your map with a branch not taken
– Write a sequence of instructions that leads down such a branch
• UNTIL snark found, no untaken branches on map, or you’re tired
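
The mapless hunt can be sketched in C on a toy forest. Everything here is an illustrative assumption rather than the tool's code: the forest has exactly three forks per walk, the hypothetical function hunt() keeps the map as an array of forks, and the "coin" always lands on left so that the result is reproducible. In the real flow a solver has to find program inputs that force a branch; here the instructions set the branch directly.

```c
#include <stdbool.h>

#define DEPTH 3  /* hypothetical forest: three forks on every walk */

struct fork {
    int branch;       /* 0 = left, 1 = right: branch taken on the last walk */
    bool both_tried;  /* true once the other branch has been scheduled too */
};

/* Hunt for a snark hidden at the leaf `snark`. Returns true if it is found
   and stores the number of walks in *walks. */
bool hunt(const int snark[DEPTH], int *walks) {
    struct fork map[DEPTH];
    int n_instr = 0;                /* length of the instruction sequence */
    *walks = 0;
    for (;;) {
        bool found = true;
        /* One walk: follow the instructions, then "toss the coin". */
        for (int d = 0; d < DEPTH; ++d) {
            if (d >= n_instr) {     /* fork not covered by an instruction */
                map[d].branch = 0;  /* fixed "coin": always left */
                map[d].both_tried = false;
            }
            if (map[d].branch != snark[d]) found = false;
        }
        ++*walks;
        if (found) return true;
        /* Find the deepest fork on the map with a branch not taken. */
        int j = DEPTH - 1;
        while (j >= 0 && map[j].both_tried) --j;
        if (j < 0) return false;    /* no untaken branches left on the map */
        map[j].branch = !map[j].branch;
        map[j].both_tried = true;
        n_instr = j + 1;            /* instructions now lead down that branch */
    }
}
```

With the snark at right, left, right the hunt needs six walks; a snark that is nowhere in the forest costs all eight walks before the map runs out of untaken branches.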

Comparison of alternatives

[Chart with axes Expertise/Effort and Accuracy, comparing static analysis, DART, model checking, formal verification and traditional testing]

How it Works
• f(x,y) run: 1
– Arbitrary inputs:
• x = 0
• y = 9
– (x > y) = false
– x1 = x - 1 = -1
– (x1 > y) = false

void f (int x, int y) {
  if (x > y) {
    x = x + y;
    y = x - y - 3;
    x = x - y;
  }
  x = x - 1;
  if (x > y) {
    abort ();
  }
  return;
}

How it Works
• f(x,y) run: 2
– choose x, y so that
– (x > y) = false
– x1 = x - 1
– (x1 > y) = true
– no such x, y!

Symbolic values on the other path, where (x > y) = true:
x1 = x + y;
y1 = x1 - y - 3;
x2 = x1 - y1;
x3 = x2 - 1;

How it Works
• f(x,y) run: 2
– choose x, y so that
• (x > y) = true
– Inputs:
• x = 9
• y = 0
– (x > y) = true
– x1 = x + y = 9
– y1 = x1 - y - 3 = 6
– x2 = x1 - y1 = 3
– x3 = x2 - 1 = 2
– (x3 > y1) = false


How it Works
• f(x,y) run: 3
– choose x, y so that
– (x > y) = true
– x1 = x + y
– y1 = x1 - y - 3
– x2 = x1 - y1
– x3 = x2 - 1
– (x3 > y1) = true
– Inputs:
• x = 1
• y = 0
– (x > y) = true
– x1 = 1, y1 = -2, x2 = 3, x3 = 2
– (x3 > y1) = true -> abort
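
The three runs can be replayed with a small C sketch. It is only an illustration: reaches_abort() copies the control flow of f() but returns a flag instead of aborting, and the hypothetical find_inputs() stands in for the SMT solver by searching a small box of values. A search over a box can show that a path is feasible, but unlike a solver it cannot prove that a path is infeasible in general.

```c
#include <stdbool.h>

/* Same control flow as f(); returns true where f() would call abort(). */
bool reaches_abort(int x, int y) {
    if (x > y) {
        x = x + y;
        y = x - y - 3;
        x = x - y;
    }
    x = x - 1;
    return x > y;
}

/* Brute-force stand-in for the solver (an assumption of this sketch):
   look for inputs in [-10, 10] whose first branch outcome is want_first
   and whose second branch outcome is want_second. */
bool find_inputs(bool want_first, bool want_second, int *x_out, int *y_out) {
    for (int x = -10; x <= 10; ++x) {
        for (int y = -10; y <= 10; ++y) {
            if ((x > y) == want_first && reaches_abort(x, y) == want_second) {
                *x_out = x;
                *y_out = y;
                return true;
            }
        }
    }
    return false;
}
```

Asking for (false, true) finds nothing, which matches "no such x, y!", while (true, true) returns inputs that reach abort(), as x = 1, y = 0 does.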


The Program

void snarky (int x, int y) {
  if (x > y) {
    x = x + y;
    y = x - y - 3;
    x = x - y;
  }
  x = x - 1;
  if (x > y) {
    abort ();
  }
}

A Simple Test Harness

int main () {
  const int x = choose_int ("x");
  const int y = choose_int ("y");
  snarky (x, y);
  return 0;
}

• choose_int: instrumentation library routine

Quick Example

void
string_copy (const char *s, char *t) {
  int i;
  for (i=0; s[i] != '\0'; ++i) {
    t[i] = s[i];
  }
}

int
string_equal (const char *s, const char *t) {
  int i = 0;
  while (s[i] != '\0' && s[i] == t[i]) {
    ++i;
  }
  return s[i] == t[i]; /* the end of this function is cut off in the source; this return is assumed */
}

int main () {
  const size_t source_length = choose_size_atmost (…);
  const char *source = choose_valid_string (…);
  const size_t target_size = choose_size_atleast (…);
  char *target = choose_valid_char_array (…); /* not const: string_copy writes to it */

  string_copy (source, target);

  ensure (string_equal (source, target));

  return 0;
}

Quick example: Bug found

Bug found with the parameters:
target_size = 1
target[0] = 1
source_length = 0
(Killed by signal)
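
The reported parameters can be replayed by hand. The sketch below is an assumption-laden illustration: string_equal() ends with a reconstructed return statement, and replay() and string_copy_fixed() are hypothetical helpers that do not appear in the slides. It shows why the check fails: string_copy() never writes the terminating '\0', so target[0] keeps its old value 1.

```c
#include <stddef.h>

/* Copy as on the slide: the terminating '\0' is not copied. */
void string_copy(const char *s, char *t) {
    int i;
    for (i = 0; s[i] != '\0'; ++i) {
        t[i] = s[i];
    }
}

/* Hypothetical repaired copy: also writes the terminator. */
void string_copy_fixed(const char *s, char *t) {
    int i;
    for (i = 0; s[i] != '\0'; ++i) {
        t[i] = s[i];
    }
    t[i] = '\0';
}

/* Comparison with an assumed final return (cut off in the slides). */
int string_equal(const char *s, const char *t) {
    int i = 0;
    while (s[i] != '\0' && s[i] == t[i]) {
        ++i;
    }
    return s[i] == t[i];
}

/* Replays source_length = 0, target_size = 1, target[0] = 1 and returns
   the value that ensure() would have checked. */
int replay(void (*copy)(const char *, char *)) {
    const char source[1] = { '\0' };
    char target[1] = { 1 };
    copy(source, target);
    return string_equal(source, target);
}
```

replay(string_copy) yields 0, so the ensure() fails, while replay(string_copy_fixed) yields 1.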

Overall Design
• Harness Library
– Supply specified values for inputs, or arbitrary values
– Check required/ensured constraints
• Instrumentation
– Modify a C program so that it produces an execution trace
• Observed Execution
– Observe the path taken by a run and calculate a predicate describing a new path
• Constraint Solver
– Solver used to discover, for a specified path condition:
• If the path is feasible
• Inputs that would cause it to be executed

Testing Time

[Chart: testing time, axis ticks 0 to 90 and 1 to 8]
• Don’t expect to test all paths for realistically sized data
• You can, however, run many useful tests quickly

You Provide The Controllability
• For each “unit” you write
– A harness to call the unit’s functions
– Stubs for functions the unit calls
• Provides functions to generate values
– For harnesses to call with
– For stubs to return with
– Declarative specification of constraints on the values
• This provides
– A model of the unit’s environment
– Controllability over the unit

[Diagram: harness code, stub code, code under test]

Front End: Instrumentation

Why do we track symbolic data?

if (x==y+3) {
  /* branch A */
} else {
  /* branch B */
}

To choose a given branch, we need to solve: ( x==y+3 ) == false/true

We want to be able to choose another branch on the next run.

To pass it to the solver, we need to have the x==y+3 expression in symbolic form at the if.

In order to know it at this point, we should track the assignments of its constituent components.

Tracing symbolic data
• Solution: add special tracing statements to source statements

x = y*z;
  becomes
x = y*z;
tmp=trace_multiplication(VAR_Y,VAR_Z);
trace_assign(VAR_X,tmp);

x = y[i];
  becomes
x = y[i];
tmp=trace_array_element(VAR_Y,VAR_I);
trace_assign(VAR_X,tmp);
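
To make the idea concrete, here is a minimal sketch of what such tracing routines might do. It is not the tool's implementation: the bodies, the string representation of symbolic values, the type of tmp and the helper trace_input() are all assumptions made for illustration. Only the names trace_multiplication, trace_assign, VAR_X, VAR_Y and VAR_Z come from the slide.

```c
#include <stdio.h>
#include <string.h>

enum { VAR_X, VAR_Y, VAR_Z, NUM_VARS };
#define EXPR_MAX 128

/* Current symbolic expression of every program variable (assumed layout). */
static char symbolic[NUM_VARS][EXPR_MAX];
static char tmp_expr[EXPR_MAX];

/* Hypothetical helper: mark a variable as a symbolic input. */
void trace_input(int var, const char *name) {
    snprintf(symbolic[var], EXPR_MAX, "%s", name);
}

/* Build the symbolic product of two variables. */
const char *trace_multiplication(int lhs, int rhs) {
    snprintf(tmp_expr, EXPR_MAX, "(%s * %s)", symbolic[lhs], symbolic[rhs]);
    return tmp_expr;
}

/* Record that a variable now holds the given symbolic expression. */
void trace_assign(int var, const char *expr) {
    snprintf(symbolic[var], EXPR_MAX, "%s", expr);
}

/* Two instrumented statements; returns the symbolic value of y. */
const char *demo_trace(void) {
    int y = 3, z = 4, x;
    const char *tmp;
    trace_input(VAR_Y, "y0");
    trace_input(VAR_Z, "z0");

    x = y * z;
    tmp = trace_multiplication(VAR_Y, VAR_Z);
    trace_assign(VAR_X, tmp);

    y = x * z;
    tmp = trace_multiplication(VAR_X, VAR_Z);
    trace_assign(VAR_Y, tmp);

    (void)y;
    return symbolic[VAR_Y];
}
```

After the two statements, y is known symbolically as ((y0 * z0) * z0), which is the form a later condition on y would need when it is handed to the solver.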

CIL
• “CIL (C Intermediate Language) is a high-level representation along with a set of tools that permit easy analysis and source-to-source transformation of C programs.”
http://www.cs.berkeley.edu/~necula/cil/
• CIL enables a user application to explore and refactor various types of C source constructs (functions, blocks, statements, instructions, expressions, variables, etc.) in a convenient way while keeping the remaining code structure.

Tool Framework

Problem: the CIL-based frontend does not support C++
Solution: replace the CIL-based frontend with LLVM to support C++

[Diagram with three stages. User Input: user-written harness, software under test. Frontend: CIL instrumentation, instrumented program. Backend: run, scoreboard (tracks coverage), input generator, SMT solver]

How CIL simplifies handling the code

• Automatically rewrites C expressions with side effects:
  a = b += --c;   --->   c = c-1; b = b+c; a = b;
• Uniformly represents memory references: (base+offset)
• Converts do, for, while loops to
  while (1) {
    if (cond1) break;    /* if needed */
    if (cond2) continue; /* if needed */
    body;
  }
• Traces control flow
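
A small hand-written sketch of the two rewrites may help. The functions below are illustrative only and are not CIL output: each pair computes the same result, once in the original form and once in the simplified form that an instrumentation pass would find easier to trace.

```c
/* Chained expression with side effects, as written by a programmer. */
void chained(int *a, int *b, int *c) {
    *a = *b += --*c;
}

/* The same effect as three simple assignments. */
void lowered(int *a, int *b, int *c) {
    *c = *c - 1;
    *b = *b + *c;
    *a = *b;
}

/* An ordinary for loop. */
int sum_for(int n) {
    int s = 0;
    for (int i = 0; i < n; ++i) {
        s += i;
    }
    return s;
}

/* The same loop in the while (1) form with an explicit break. */
int sum_while1(int n) {
    int s = 0;
    int i = 0;
    while (1) {
        if (!(i < n)) break;
        s += i;
        ++i;
    }
    return s;
}
```

In the lowered forms every statement has at most one side effect and every loop exit is an explicit branch, so one tracing call per statement is enough.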

What is LLVM?

• LLVM – Low Level Virtual Machine
• Modular and reusable collection of libraries
• Developed at UIUC and at Apple®
• LLVM Intermediate Representation (IR) is well designed and documented.
• Has a production quality C++ frontend that is compatible with GCC
• Open-source with an industry friendly license.
• More info at http://www.llvm.org/

LLVM Based Frontend

LLVM provides modular libraries and tool infrastructure to develop compiler passes

[Diagram: user-written harness and software under test -> LLVM frontend (Clang C/C++ parser -> LLVM IR -> compiler pass -> rest of compile) -> instrumented program -> backend]

Using C++ overloads
• Idea: redefine operators in such a way that they output trace data:

my_int operator + (my_int x, my_int y) {
  symbolic s = trace_addition(x.symbol(), y.symbol());
  int c = x.val() + y.val();
  return my_int (s, c);
}

• Instrumentation is still needed (control tracing, types..)

Reducing branches
• This 2-branch control:
  if (x && y) action1; else action2;
really produces 3 branches in C/C++:
  if (x) {
    if (y) action1; else action2;
  }
  else action2;
• x && y is not really a logical and.
– We cannot simply supply (x && y) to an SMT solver.

Reducing branches: solution
• But.. sometimes it IS a logical and
– Namely, if y may be safely evaluated at x==false, or y cannot be safely evaluated at any x value
which means
– y has no side effects
and
– y crash conditions don’t depend on x
• If we can prove this statically, use the form:
  if (logical_and(x,y)) action1; else action2;
• Else use the 3-branch form
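
A possible shape for logical_and is sketched below. The body is an assumption, since the slides only show the call: both operands are evaluated before the call, so a single branch remains at the if. The two helper functions are hypothetical and return 1 for action1 and 2 for action2 so that the forms can be compared.

```c
/* Assumed body: a non-short-circuit "and" over already evaluated operands. */
int logical_and(int x, int y) {
    return (x != 0) & (y != 0);
}

/* What if (x && y) really means: three branch outcomes to explore. */
int three_branch(int x, int y) {
    if (x) {
        if (y) return 1;
        else return 2;
    } else {
        return 2;
    }
}

/* Single branch on one symbolic condition. */
int single_branch(int x, int y) {
    if (logical_and(x, y)) return 1;
    else return 2;
}
```

The two forms agree whenever evaluating y is safe and free of side effects, which is exactly the condition that has to be proved statically before the rewrite is allowed.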

Solver Theories
• Different solver theories
– Linear Integer Arithmetic: (a*x + b*y + ….) {>, <, =} C
– Linear Real Arithmetic
– BitVector Arithmetic

• Most conditions in C source code fit one of them. But some mixed/complex ones don’t
– alas, we sometimes have to use random alternation
– luckily, theories are being developed actively

• Need to recognize theory patterns for better performance
-> Sometimes the supported scope is wider than the declared theory scope

Path exploration strategy

• Usually we explore all paths in Depth First Search mode:
– alternate deeper ones first
– when complete, return one level and try again
• But the execution path count may turn out to be too high to explore all of them

Path exploration strategy - 2
• If we have no resources to explore all paths, DFS is not the best strategy: some nodes are never visited while some others are carefully explored
- low coverage
- most dumb bugs may be missed

• Good strategy principle: first visit new nodes, next explore new paths
- Details are subject to research

[Figure legend: explored, unexplored]

An optimization: Get function properties
• Idea: take advantage of the code hierarchy by using I/O properties for function/procedure calls
-> try to go with the assumptions only rather than descending into the subroutine body
Example: y = string_copy (x)
  require valid_pointer(x)
  property valid_pointer(y) /* assuming we still have memory */
  property length(x) == length(y)
  property i < length(x) -> y[i] == x[i]

• For black box (external library) code, assumptions should be supplied as collateral
• For available source code, they can also be extracted automatically
-> but it’s a question what to extract

if (length(s) > 2) {
  p = string_copy(s);
  if (length(p) > 1) {
  } else {
    do_something();
  }
}

if (length(s) > 2) {
  p = ???;
  assume length(p) == length(s);
  if (length(p) > 1) {
  } else {
    /* length(p) <= 1 && length(p) == length(s) && length(s) > 2 ---- Infeasible */
  }
}

An optimization: Separate independent alternations

if (z == 2) {
  x=b;
  do_something1();
}
if (y == x) {
  do_something2();
}

Dependent choices

We should try 2*2 combinations:
• z=2, y=b
• z=2, y≠b
• z≠2, y=x
• z≠2, y≠x

(all variables are sampled at the beginning of the code piece presented)

Separate independent alternations -2

if (z == 2) {
  q = b;
}
if (y == x) {
  p = c;
}

Independent choices

We can try only 2 combinations, for example:
• z=2, y=x
• z≠2, y≠x

(provided that do_something1() and do_something2() effects don’t interdepend)

Separate independent alternations -3

if (z == 2) {
  q = b;
}
if (y == x) {
  p = c;
}
if (q == p) …

Dependent choices again!

An optimization: re-using unsatisfied conditions

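if (a && b && c) { … }
if (a && b && c) { … }
if (a && b && c && d && e) { … }

Suppose we have proved that we cannot get here (into one of these branches).

Then we can be sure that we cannot get there either (into a branch whose condition contains the same conjuncts). No need to call the solver again.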
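
One way to reuse such results is a small cache of conjunctions already proved unsatisfiable. The sketch below is an assumption about how this could look, not the tool's data structure: each atomic condition gets one bit, a conjunction is a bit mask, and the hypothetical known_unsat() reports a hit when a cached unsatisfiable mask is a subset of the query.

```c
#include <stdbool.h>

#define MAX_CORES 32

/* One bit per atomic condition (illustrative numbering). */
enum { COND_A = 1u << 0, COND_B = 1u << 1, COND_C = 1u << 2,
       COND_D = 1u << 3, COND_E = 1u << 4 };

static unsigned unsat_cores[MAX_CORES];
static int n_cores = 0;

/* Remember a conjunction that the solver proved unsatisfiable. */
void remember_unsat(unsigned conjuncts) {
    if (conjuncts != 0 && n_cores < MAX_CORES) {
        unsat_cores[n_cores++] = conjuncts;
    }
}

/* True if the query contains a cached unsatisfiable conjunction, so the
   query is unsatisfiable too and the solver call can be skipped. */
bool known_unsat(unsigned conjuncts) {
    for (int i = 0; i < n_cores; ++i) {
        if ((unsat_cores[i] & conjuncts) == unsat_cores[i]) {
            return true;
        }
    }
    return false;
}
```

After remember_unsat(COND_A | COND_B | COND_C), a query for a && b && c && d && e hits the cache, while a query for a && b alone still has to go to the solver.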

Handling Black Boxed Code

Contents
• Motivation
• Losing control with black boxes
• Return Value Representation
• Randomization
• Characterizing
• Learning
• Stubbing/Wrapping
• Example: The encryption problem
• Selective/Dynamic Black-Boxing
• Embedded White-Boxes
• Afterwords

Motivation

• Testing a portion of code within a large system. E.g.:
– Code over infrastructure/library functions
– Firmware over hardware API/Virtual Platform
– Binary infrastructure

• Hiding code the solver can’t cope with
– Non-linear arithmetic (a*b = C)
– Assembly

• Handling deep paths/recursion

Losing Control with Black Boxes

• Black-boxes impair our controllability when program paths are influenced by black-box outputs.

int a = choose_int("a");
int b = bb(a);
if (b > 10) { … } else { … }

• We have no information to pick “a” such that it drives (b > 10) in both directions.

Return Value Representation

• The flow treats the return value of an uninstrumented function as concrete only (not symbolic).

• But it can be explicitly assigned a fresh symbolic variable with fresh_*

• The reverse could be done as well with concrete_* (later).

int a = blackboxed_func(…); // a is concrete
fresh_integer(a, "a");      // a is symbolic

Example: The Encryption Problem

ulong x = choose_uint ("x%d", count);
ulong y = choose_uint ("y%d", count);
if (y == encrypt(x)) <…>;
else <…>;

• We pathologically can’t guess x and y beforehand such that y == encrypt(x).

Coping with it by:
• Running once with y=x=0; the condition fails.
• “See”: (y == <concrete encrypt(0)>)
• Choose x=0, y = encrypt(0) for the 2nd run.
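
The two runs can be imitated with a toy stand-in. In this sketch toy_encrypt(), run_once() and cover_both_branches() are hypothetical names, and the "encryption" is just an arbitrary mixing function; the point is only that the concrete value observed in the first run is reused as y in the second run, so nobody has to invert the function.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an opaque encrypt(); it is never inverted. */
unsigned toy_encrypt(unsigned x) {
    return (x * 2654435761u) ^ 0xDEADBEEFu;
}

struct run_result {
    bool branch_taken;  /* outcome of (y == encrypt(x)) */
    unsigned observed;  /* concrete encrypt(x) seen during the run */
};

/* One concrete execution of the code under test. */
struct run_result run_once(unsigned x, unsigned y) {
    struct run_result r;
    r.observed = toy_encrypt(x);
    r.branch_taken = (y == r.observed);
    return r;
}

/* Run 1 with x = y = 0, then keep x and set y to the observed value. */
bool cover_both_branches(void) {
    struct run_result first = run_once(0u, 0u);
    struct run_result second = run_once(0u, first.observed);
    return !first.branch_taken && second.branch_taken;
}
```

The first run takes the else branch and the second run takes the then branch, so both directions are covered in two executions.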

Randomization

• We can increase our chances of gaining coverage by adding randomization

int a = choose_random_int("a");
int b = bb(a);
if (b > 10) { … } else { … }

Characterizing: assert
• Our first step to gain back control is having the user tell us something about the function.
• A new construct is added: assert(<cond>)

Reminder:
– require(<cond>): Assume <cond> holds. If it doesn’t, ignore the current path and move on. This is actually a branch equivalent to
  if (!cond) exit(0);
– ensure(<cond>): Make sure <cond> holds. If it doesn’t, stop execution and report it. If it does, try to make it fail. This is actually a branch equivalent to
  if (!cond) abort();

assert – 2.
• E.g.: a strictly monotonic black boxed function.

int a = choose_int("a");
int b = bb(a);
fresh_integer(b, "b");
assert(b > a);
if (b > 10) { … } else { … }

• assert(<cond>): Assume <cond> holds. If it doesn’t, stop execution and report. But don’t try to make it fail.
• Must be used with fresh_*
• Full characterization: solves the problem, but impractical.
• Partial characterization:
– May help the solver, depending on its internal heuristics.
– The more assertions the better.

Learning

• Learn from concrete inputs and outputs of a black-boxed function, and use them in future runs.

• But what to learn? A function is not always deterministic: it has implicit “inputs” and “outputs” / internal state.

• Instead of learning functions, we learn a “subject” in many lessons. Each lesson can have multiple inputs and outputs.

Learning - 2.

int a = choose_int("a");
lesson l = begin_lesson ();
learn_integer(l, a, LEARN_INPUT);
int b = bb(a);
fresh_integer(b, "b");
learn_integer(l, b, LEARN_OUTPUT);
end_lesson(l);
if (b > 10) { … } else { … }

Learning - 3.

• When it misses a path it can retry it several times, and learn new concrete values in each try.

• Previous learning adds constraints to the solver, so previous inputs will not be reused when trying to get a different output.
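
The retry behaviour can be pictured with a table of learned pairs. This is a deliberately simplified sketch and not the tool's mechanism: the table, bb() and drive_above() are hypothetical, and where the real flow adds constraints to the solver, the sketch simply skips inputs it has already tried.

```c
#include <stdbool.h>

#define MAX_PAIRS 64

struct pair { int input; int output; };
static struct pair learned[MAX_PAIRS];
static int n_learned = 0;

/* Hypothetical black box; the search below never looks inside it. */
int bb(int a) {
    return a * a - 20;
}

/* Record one concrete input/output observation. */
void learn_pair(int input, int output) {
    if (n_learned < MAX_PAIRS) {
        learned[n_learned].input = input;
        learned[n_learned].output = output;
        ++n_learned;
    }
}

/* True if this input was already used in an earlier try. */
bool already_tried(int input) {
    for (int i = 0; i < n_learned; ++i) {
        if (learned[i].input == input) return true;
    }
    return false;
}

int lessons_learned(void) {
    return n_learned;
}

/* Retry with inputs not used before until bb() returns more than
   `threshold`. Stores the successful input and returns true on success. */
bool drive_above(int threshold, int lo, int hi, int *input_out) {
    for (int a = lo; a <= hi; ++a) {
        if (already_tried(a)) continue;
        int b = bb(a);
        learn_pair(a, b);
        if (b > threshold) {
            *input_out = a;
            return true;
        }
    }
    return false;
}
```

Driving (b > 10) to true from the range 0 to 100 takes seven tries here; the seven learned pairs stay in the table, so a later call does not spend runs on those inputs again.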

Learning – 4.

• The “subject” is supposed to be common to all invocations of a black-boxed function, but is different from function to function.

• An easier function: begin_lesson() implicitly creates a unique subject from the code location.

• Problem: the function is invoked from different places in the code. We want to write learning once.

• Solution: We shall later see how we can easily write wrappers to divert all calls to one place.

Stubbing / Wrapping
• The user writes a wrapper to add characterization and learning sugar to all invocations of a function:

int bb_wrap(int a)
{
  lesson l = begin_lesson ();
  learn_integer(l, a, LEARN_INPUT);
  int b = bb(a);
  fresh_integer(b, "b");
  learn_integer(l, b, LEARN_OUTPUT);
  end_lesson(l);
  assert(b > a);
  return(b); /* return the black box result, not the input */
}

Stubbing / Wrapping – 2.

• Now we want to call the wrapper bb_wrap, instead of bb.

• But we don’t want to manually change all invocations in our code.

• instrument does it for us:
  instrument -stub bb:bb_wrap -stub … -stub …

• Conveniently, it won’t replace calls within the stub code itself.

Selective Blackboxing

• We can selectively blackbox instrumented code:

begin_blackbox(true);
int x = 10;
end_box();
concrete_integer(x);

• We must use fresh_* or concrete_* on values that were defined/modified inside the blackbox and are visible outside of it.

• Otherwise it might think the value is uninitialized, or miscalculate paths.

Dynamic Blackboxing - 1.
• Sometimes we just have too many paths:

char* s = choose_valid_string("str", 100);
int count_a = 0;
for(int i=0; i<100; i++){
  if (s[i] == 'a') {
    count_a++;
  }
}

• If we can’t change the string length it will see 2^100 paths…

Dynamic Black-boxing – 2.
• We can dynamically hide code:

char* s = choose_valid_string("str", 100);
int count_a = 0;
for(int i=0; i<100; i++){
  begin_blackbox(i>2);
  if (s[i] == 'a') {
    count_a++;
  }
  end_box();
}

• Now XXX sees only 2^3 paths (iterations i = 0, 1, 2 stay visible).

Embedded White-boxes
• What if we want to look into some code that is called from black-boxed code? E.g. a callback:

void callback(int x) { … } // we want to see this code

int main ()
{
  int x = choose_uint("x");
  bb_foo(x, callback); // bb_foo is blackboxed
  return 0;
}

Embedded White-boxes - 2.

• On the “inside” we “whitebox” it explicitly, and freshen (or concretize) the inputs:

void callback(int x) {
  static int count = 0;
  begin_whitebox(true);
  fresh_integer(x, "new_x%d", count);
  end_box();
  ++count;
} // we want to see this code

Afterwords

• We have established a “Swiss army knife” of features to support future black-box challenges.

• This helps overcome some simple synthetic examples.

• Since the loss of controllability is generally hard, we expect we’d need to refine this set of features as we hit real life test-cases.

Date post:	23-Feb-2016