Regression Verification for Multi-Threaded Programs

Sagar Chaki, SEI-Pittsburgh

Arie Gurfinkel, SEI-Pittsburgh

Ofer Strichman, Technion-Haifa

Originally presented at VMCAI 2012

VSSE 2012

Regression Verification

Formal equivalence checking of two similar programs

Three selling points:

• Specification: not needed
• Complexity: dominated by the change, not by the size of the programs
• Invariants: high-quality loop/recursion invariants are easy to derive automatically

Tool support:

• RVT (Technion)
• SymDiff (MSR)

Possible applications of regression verification

• Understand how changes propagate through the API
• Validate refactoring
• Semantic impact analysis
• Generic translation validation
• Proving determinism: P is equivalent to itself ⇔ P is deterministic
• many more…

Definition: Partial Equivalence

Let f1 and f2 be functions with equivalent I/O signatures.

f1 and f2 are partially equivalent ⇔ executions of f1 and f2 on equal inputs that terminate result in equal outputs.

Undecidable

Multi-Threaded Programs (MTPs)

Finite set of root functions:

P = f1 || … || fn

• Each executed in a separate thread
• Communicate via shared variables

Is P partially equivalent to itself?

…no!

int base = 1;                  // shared variable

void f1(int n, int *r) {       // thread
    if (n < 1)
        *r = base;
    else {
        int t;
        f1(n-1, &t);
        *r = n * t;
    }
}

void f2() { base = 2; }        // thread

Computes n! or 2*n!
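The two outcomes can be reproduced without real threads. The sketch below is an illustration only: it models the scheduler with a hypothetical flag `f2_before_read` (not part of the slides) that decides whether f2's write lands before or after f1 reads `base`. Since `base` is read exactly once, these two schedules cover every observable result.

```c
#include <assert.h>
#include <stdbool.h>

static int base;                /* shared variable */
static bool f2_before_read;     /* hypothetical scheduling choice, not in the slides */

static void f2(void) { base = 2; }

static void f1(int n, int *r)
{
    if (n < 1) {
        if (f2_before_read)
            f2();               /* model: thread f2 is scheduled just before the read */
        *r = base;
    } else {
        int t;
        f1(n - 1, &t);
        *r = n * t;
    }
}

/* Run P = f1 || f2 under one of the two relevant schedules. */
static int run(int n, bool f2_first)
{
    int r;
    base = 1;
    f2_before_read = f2_first;
    f1(n, &r);
    if (!f2_first)
        f2();                   /* f2 runs after the read: too late to affect r */
    return r;
}
```

Calling `run(5, false)` and `run(5, true)` gives two different results on the same input, which is why P is not partially equivalent to itself under the sequential definition.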

Towards partial equivalence of MTPs

P is a multi-threaded program

Π(P): the set of terminating computations of P

R(P) = { (in, out) | ∃π ∈ Π(P). π begins in in and ends in out }

Example: R(P) = { (n, n!), (n, 2·n!) | n ∈ ℤ }

Partial equivalence of MTPs

MTPs P1, P2 are partially equivalent if R(P1) = R(P2).

Denoted p.e.n.(P1, P2): Partial Equivalence of Nondeterministic programs

Claim: ∀P. ⊨ p.e.n.(P, P)

Towards a sound proof rule for partial equivalence of MTPs

- The sequential case

- Challenges in extending it to p.e.n.

- A proof rule

First: The sequential case [GS’08]

Loops/recursion are not a big problem in practice:

Recursive calls are replaced with calls to the same uninterpreted function:

• Abstracts the callees
• By construction assumes the callees are partially equivalent

Sound by induction

[Figure: the pair f1, f2 and its isolated counterpart ("= isolation")]

First: The sequential case [GS’08]

Now we have two isolated functions f1, f2

Checking partial equivalence is decidable:

in1 = in2 = *;
o1 = f1(in1);
o2 = f2(in2);
assert(o1 == o2);

The algorithm traverses up the call graphs while abstracting equal callees.
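A minimal sketch of the isolated check, under stated assumptions: the recursive call is replaced by a stand-in `UF` (any deterministic function, here a hypothetical bit mixer), and the nondeterministic input `*` is approximated by a bounded loop. A tool such as CBMC would treat the input symbolically, so this is only a bounded test. The names `f1_iso`, `f2_iso`, `f3_iso` and `equal_outputs_up_to` are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the uninterpreted function: its body is arbitrary,
   all that matters is that equal arguments give equal results. */
static uint32_t UF(uint32_t n)
{
    n ^= 0x9e3779b9u;
    n ^= n >> 16;
    n *= 0x45d9f3bu;
    n ^= n >> 16;
    return n;
}

/* Two versions of factorial with the recursive call abstracted by UF. */
static uint32_t f1_iso(uint32_t n) { return n == 0 ? 1u : n * UF(n - 1); }

static uint32_t f2_iso(uint32_t n)
{
    if (n == 0)
        return 1u;
    uint32_t t = UF(n - 1);
    return t * n;               /* operands commuted: still equivalent */
}

/* A changed version that is not equivalent to f1_iso. */
static uint32_t f3_iso(uint32_t n) { return n == 0 ? 1u : (n + 1) * UF(n - 1); }

typedef uint32_t (*fn)(uint32_t);

/* in1 = in2 = in; o1 = a(in); o2 = b(in); check o1 == o2, for in < limit. */
static bool equal_outputs_up_to(fn a, fn b, uint32_t limit)
{
    for (uint32_t in = 0; in < limit; in++)
        if (a(in) != b(in))
            return false;
    return true;
}
```

Unsigned arithmetic is used so that overflow wraps instead of being undefined.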

How can this method be extended to proving p.e.n.(f1, f2)?

What affects partial equivalence of MTPs?

Swap write order

Initially x1 = x2 = 0

f1() { o1 = x1; o2 = x2; }

f2(int in) { x1 = in; x2 = in; }

Before: in = 1, o1 = 1, o2 = 0; (1, ⟨1, 0⟩) ∈ R(p)

After swapping the writes in f2 (x2 = in; x1 = in;): in = 1, o1 = 1 ⇒ o2 = 1; (1, ⟨1, 0⟩) ∉ R(p)

Swap R/W order

Initially x1 = x2 = 0

f1(int in1) { x1 = in1; t1 = x2; o1 = t1; }

f2(int in2) { t2 = x1; x2 = in2; o2 = t2; }

Before: in1 = 1, in2 = 2; x1 = 1; t2 = x1 = 1; x2 = in2 = 2; t1 = x2 = 2; o1 = t1 = 2; o2 = t2 = 1; (⟨1, 2⟩, ⟨2, 1⟩) ∈ R(p)

After swapping the write and the read in f1 (t1 = x2; x1 = in1;): in1 = 1, in2 = 2; o2 = 1 ⇒ x1 = 1 < t2 = x1 ⇒ t1 = 0 ⇒ o1 = 0; (⟨1, 2⟩, ⟨2, 1⟩) ∉ R(p) (here < means "executes before")

Swap read order

Initially x1 = x2 = 0

f1() { o1 = x1; o2 = x2; }

f2(int in) { x1 = in; x2 = in; }

Before: in = 1, o1 = 0, o2 = 1; (1, ⟨0, 1⟩) ∈ R(p)

After swapping the reads in f1 (o2 = x2; o1 = x1;): in = 1, o1 = 0 ⇒ o2 = 0; (1, ⟨0, 1⟩) ∉ R(p)
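The write-swap and read-swap examples can be checked by brute force. The sketch below is an illustration, not the method of the talk: it enumerates all six interleavings of the two reads in f1 and the two writes in f2 (with in = 1) and collects the reachable output pairs ⟨o1, o2⟩. The helper names `outcomes` and `reachable` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Set of reachable output pairs <o1,o2>, one bit each: bit index = 2*o1 + o2. */
typedef unsigned outcome_set;

static int bits_set(unsigned v)
{
    int count = 0;
    for (; v != 0; v >>= 1)
        count += (int)(v & 1u);
    return count;
}

/*
 * f1 reads x1 then x2 (or x2 then x1 if swap_reads);
 * f2 writes 1 to x1 then x2 (or x2 then x1 if swap_writes).
 * A schedule is a 4-bit mask: bit i set means step i is taken by f1.
 */
static outcome_set outcomes(bool swap_reads, bool swap_writes)
{
    outcome_set seen = 0;
    for (unsigned schedule = 0; schedule < 16; schedule++) {
        if (bits_set(schedule) != 2)
            continue;                       /* each thread takes exactly two steps */
        int x[2] = {0, 0};                  /* shared x1, x2, initially 0 */
        int o[2] = {0, 0};                  /* outputs o1, o2 */
        int reads = 0, writes = 0;          /* per-thread program counters */
        for (int step = 0; step < 4; step++) {
            if ((schedule >> step) & 1u) {  /* f1: next read */
                int var = swap_reads ? 1 - reads : reads;
                o[var] = x[var];
                reads++;
            } else {                        /* f2: next write */
                int var = swap_writes ? 1 - writes : writes;
                x[var] = 1;
                writes++;
            }
        }
        seen |= 1u << (2 * o[0] + o[1]);
    }
    return seen;
}

static bool reachable(outcome_set s, int o1, int o2)
{
    return (s >> (2 * o1 + o2)) & 1u;
}
```

With this encoding, ⟨1, 0⟩ disappears from the outcome set once the writes are swapped, and ⟨0, 1⟩ disappears once the reads are swapped.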

Preprocessing

1. Loops ⇒ recursive functions
2. Mutual recursion ⇒ simple recursion
3. Non-recursive functions ⇒ inlined
4. Function's return value ⇒ parameter
5. Each shared variable x only appears in t = x or x = exp (add auxiliaries)
6. Each auxiliary variable is read once

Mapping

Assume each function is used in a single thread. Otherwise, duplicate it

Find a mapping between the non-basic types

Find a bijective map between:

• threads
• shared variables
• functions (same prototype) and, in mapped functions, the globals read and the globals written to

Without such a mapping: goto end-of-talk.

The Observable Stream

Consider a function f and input in

The observable stream of f(in)'s run is its sequence of:

• function calls
• reads/writes of shared variables

Example: let x be a shared variable (here in = 1):

Statement        Observable stream
x = in;          W(x,1)
t1 = t;
t = x;           R(x,1)
g(t+1);          Call(g, <2>)
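A minimal sketch of how such a stream could be recorded, assuming a single-threaded run and a text buffer as the stream. The helpers `record_access`, `record_call` and `observe` are hypothetical names for this illustration, and the local statement of the example is omitted because it produces no event.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char stream[128];        /* the recorded observable stream, as text */
static int x;                   /* shared variable */

/* Append a read ('R') or write ('W') event on a shared variable. */
static void record_access(char kind, const char *var, int value)
{
    size_t used = strlen(stream);
    snprintf(stream + used, sizeof stream - used, "%c(%s,%d) ", kind, var, value);
}

/* Append a function-call event with its argument. */
static void record_call(const char *fn, int arg)
{
    size_t used = strlen(stream);
    snprintf(stream + used, sizeof stream - used, "Call(%s,<%d>) ", fn, arg);
}

static void g(int arg) { (void)arg; }   /* callee body is irrelevant: only the call is observed */

static void f(int in)
{
    int t;
    x = in;
    record_access('W', "x", in);
    t = x;
    record_access('R', "x", t);
    record_call("g", t + 1);
    g(t + 1);
}

/* Run f on a fresh stream and return what was recorded. */
static const char *observe(int in)
{
    stream[0] = '\0';
    x = 0;
    f(in);
    return stream;
}
```

For in = 1 the recorded text matches the three events listed in the example.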

Observable Equivalence of Functions

f, f' are observably equivalent ⇔ ∀in. f(in), f'(in) have equal sets of finite observable streams

Denoted observe-equiv(f, f')

Assume: outputs are defined via shared variables. Hence: observable equivalence ⇒ partial equivalence

We prove observe-equiv(f, f') by:

• isolating f, f'
• proving observable equivalence of all (isolated) pairs

Given f, we build an isolated version [f], in which:

• Function calls are replaced with calls to uninterpreted functions
• Shared variables are read from an "input" stream
• The observable stream is recorded

Observable Equivalence of Functions

Reading from an input stream…

For each shared variable x: R(x) ← R(*) // read

[Figure: two observable streams, R(x) local W(x) g(…) W(x) R(x) R(x) and R(x) W(x) W(x) local g(…) R(x) local R(x). Their reads of x are replaced by nondeterministic values, labelled R(*1), R(*1), R(*2), R(*2), R(*3).]

Enforce: equal locations ⇒ same '*' value… if their streams up to that point were equal.
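One way to picture "equal locations ⇒ same value" is a table of lazily chosen values indexed by stream location. The sketch below is a simplification under stated assumptions: `rand()` stands in for the nondeterministic choice, only reads advance the location counter, and the condition that the streams were equal up to that point is not modelled. `UF_x`, `f_iso` and `f_prime_iso` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LOC 64

/* Values "read" from shared variable x, indexed by location in the stream. */
static int x_value[MAX_LOC];
static int x_chosen[MAX_LOC];   /* 1 once a value has been picked for that location */

/* First use of a location picks an arbitrary value; later uses return the same one. */
static int UF_x(int loc)
{
    assert(loc >= 0 && loc < MAX_LOC);
    if (!x_chosen[loc]) {
        x_value[loc] = rand();  /* stand-in for a nondeterministic value '*' */
        x_chosen[loc] = 1;
    }
    return x_value[loc];
}

/* Two isolated functions that read x twice; the second has an extra local step. */
static int f_iso(void)
{
    int loc = 0;
    int a = UF_x(loc++);
    int b = UF_x(loc++);
    return a - b;
}

static int f_prime_iso(void)
{
    int loc = 0;
    int a = UF_x(loc++);
    int local = 7;              /* local computation: not part of the stream */
    (void)local;
    int b = UF_x(loc++);
    return a - b;
}
```

Because both functions consult the same table, their first reads agree and their second reads agree, even though the values themselves are arbitrary.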

Summary: from f to [f]

Transform functions f and f' into [f] and [f'] as follows:

• Function calls are replaced with calls to UFs
  – t = g() ⇒ t = UFg()
  – For (g, g') ∈ map, UFg = UFg'
• Shared variables are read from an "input" stream
  – t = x ⇒ t = UFx(loc) // loc is the location in the stream
  – For (x, x') ∈ map, UFx = UFx'
• The observable stream is recorded

Example: from f to [f]

int base = 1;

void f (int n, int *r) {
    int t;
    if (n < 1) {
        t = base;
        *r = t;
    } else {
        f(n-1, &t);                      // function call
        *r = n * t;
    }
}

list out;                                // observable stream

void [f] (int n, int *r) {
    int t, loc = 0;
    if (n < 1) {
        t = UFbase(loc);
        out += (R, "base"); loc++;
        out += (W, "r", t); loc++;       // output treated as shared
    } else {
        t = UFf,t(n-1);
        out += (C, f, n-1);
        out += (W, "r", n * t); loc++;
    }
}

Checking observable equivalence of [f], [f']

Generate sequential program S:

in1 = in2 = *;
[f(in1)];
[f'(in2)];
rename([f'].out);   // according to map
assert([f].out == [f'].out);

Validity of S is decidable

[Figure: [f] and [f'] are combined into S, which is passed to CBMC]
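The shape of S can be sketched in plain C for the factorial example, with loud caveats: the uninterpreted functions are replaced by a hypothetical deterministic mixer (`mix`, `UF_base`, `UF_f`), the input `*` is replaced by a bounded loop, and the renaming step is skipped because both versions use the same names. A model checker would cover all inputs; this sketch only tests a range. `f_new_iso` and `f_bad_iso` are invented variants for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { char kind; uint32_t value; } event;    /* kind: 'R', 'W' or 'C' */
typedef struct { event ev[4]; int len; } stream;

static void emit(stream *s, char kind, uint32_t value)
{
    s->ev[s->len].kind = kind;
    s->ev[s->len].value = value;
    s->len++;
}

/* Stand-ins for uninterpreted functions: arbitrary but deterministic. */
static uint32_t mix(uint32_t v) { v ^= v >> 16; v *= 0x45d9f3bu; v ^= v >> 16; return v; }
static uint32_t UF_base(uint32_t loc) { return mix(loc + 1u); }
static uint32_t UF_f(uint32_t n) { return mix(n ^ 0x9e3779b9u); }

/* Isolated f: reads of base and the recursive call go through the stand-ins. */
static void f_iso(int32_t n, stream *out)
{
    if (n < 1) {
        uint32_t t = UF_base(0);
        emit(out, 'R', t);
        emit(out, 'W', t);
    } else {
        uint32_t t = UF_f((uint32_t)(n - 1));
        emit(out, 'C', (uint32_t)(n - 1));
        emit(out, 'W', (uint32_t)n * t);
    }
}

/* A refactored version: multiplication operands commuted. */
static void f_new_iso(int32_t n, stream *out)
{
    if (n < 1) {
        uint32_t t = UF_base(0);
        emit(out, 'R', t);
        emit(out, 'W', t);
    } else {
        uint32_t t = UF_f((uint32_t)(n - 1));
        emit(out, 'C', (uint32_t)(n - 1));
        emit(out, 'W', t * (uint32_t)n);
    }
}

/* A changed version that writes a different value in the recursive case. */
static void f_bad_iso(int32_t n, stream *out)
{
    if (n < 1) {
        uint32_t t = UF_base(0);
        emit(out, 'R', t);
        emit(out, 'W', t);
    } else {
        uint32_t t = UF_f((uint32_t)(n - 1));
        emit(out, 'C', (uint32_t)(n - 1));
        emit(out, 'W', ((uint32_t)n + 1u) * t);
    }
}

typedef void (*iso_fn)(int32_t, stream *);

/* Bounded stand-in for S: run both on the same input, compare recorded streams. */
static bool streams_equal_on(iso_fn f, iso_fn g, int32_t lo, int32_t hi)
{
    for (int32_t in = lo; in <= hi; in++) {
        stream a = { .len = 0 }, b = { .len = 0 };
        f(in, &a);
        g(in, &b);
        if (a.len != b.len)
            return false;
        for (int i = 0; i < a.len; i++)
            if (a.ev[i].kind != b.ev[i].kind || a.ev[i].value != b.ev[i].value)
                return false;
    }
    return true;
}
```

Streams are compared field by field rather than with `memcmp`, since struct padding bytes are unspecified.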

A Proof Rule

Compositional: check one function pair at a time

Sequential: no thread composition

General: supports loops/recursion + arbitrary # of threads

∀(f, f') ∈ map. observe-equiv([f], [f'])
----------------------------------------
p.e.n.(P1, P2)

Summary and Future Thoughts

We suggested foundations of regression verification for MTPs

• Notion of partial equivalence of multi-threaded programs
• A proof rule

Challenges ahead:

• Synchronization primitives: locks, semaphores, atomic blocks
• Dynamic thread creation
• Make the method more complete
