IBM Research, IBM Haifa Labs
© 2007 IBM Corporation
Cross-Entropy Based Testing
Hana Chockler, Benny Godlin, Eitan Farchi, Sergey Novikov
IBM Research, Haifa, Israel
The problem: how to test for rare problems in large programs?
o Testing involves running the program many times, hoping to find the problem.
o If a problem appears only in a small fraction of the runs, it is unlikely to be found during random executions.
o It is like searching for a needle in a haystack.
The main idea: use the cross-entropy method!
The cross-entropy method is a widely used approach to estimating probabilities of rare events (Rubinstein).
The cross-entropy method - motivation
The problem:
o There is a probability space S with probability distribution f and a performance function P defined on it.
o A rare event e is that P(s) > r, for some s ∈ S and some threshold r.
o How can we estimate the probability of e?
(Illustration: a point s in the space S at which the rare event e occurs; this happens very rarely under f.)
The naïve idea
Generate a big enough sample from the probability space and compute the probability of the rare event from the inputs in the sample.
This won't work: for very rare events, even a very large sample does not reflect the probability correctly.
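To make this concrete, here is a minimal sketch of the naive estimator; the event probability (10^-6) and sample size (10,000) are made-up illustrative numbers, not from the slides.

```java
import java.util.Random;

// Minimal sketch of the naive approach: estimate the probability of a rare
// event from a finite sample. With p = 1e-6, a sample of 10,000 runs almost
// always contains zero occurrences, so the estimate comes out as 0.
public class NaiveEstimate {
    static double estimate(double trueP, int sampleSize, long seed) {
        Random rng = new Random(seed);      // fixed seed for reproducibility
        int hits = 0;
        for (int i = 0; i < sampleSize; i++) {
            if (rng.nextDouble() < trueP) hits++;   // did the rare event occur?
        }
        return (double) hits / sampleSize;  // empirical probability estimate
    }

    public static void main(String[] args) {
        // Expected number of hits is only 0.01, so the estimate is almost
        // certainly 0, far from the true value of 1e-6.
        System.out.println(estimate(1e-6, 10_000, 42));
    }
}
```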
The cross-entropy method
Wishful thinking: if we had a distribution that gives the good inputs (w.r.t. the performance function) probability 1, we would be all set. But we don't have such a distribution.
So we try to approximate it in iterations, every time trying to come a little closer:
o In each iteration, we generate a sample of some (large) size.
o We update the parameters (the probability distribution) so that we get a better sample in the next iteration.
Formal definition of cross-entropy
In information theory, the cross entropy of two probability distributions p and q (closely related to the Kullback-Leibler divergence) measures the average number of bits needed to identify an event from a set of possibilities if a coding scheme based on a given distribution q is used, rather than the "true" distribution p.
The cross entropy for two distributions p and q over the same discrete probability space is defined as follows:
H(p,q) = - Σ_x p(x) log(q(x))
Note that this is not really a distance, because it is not symmetric.
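As a quick illustration, the definition can be computed directly; evaluating it in both directions shows the asymmetry. The distributions p and q below are made-up example values.

```java
// Direct computation of H(p, q) = -sum_x p(x) log q(x) for two discrete
// distributions given as probability arrays. Example values are made up.
public class CrossEntropy {
    static double crossEntropy(double[] p, double[] q) {
        double h = 0.0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] > 0) {               // terms with p(x) = 0 contribute 0
                h -= p[i] * Math.log(q[i]);
            }
        }
        return h;
    }

    public static void main(String[] args) {
        double[] p = {0.5, 0.5};
        double[] q = {0.9, 0.1};
        System.out.println(crossEntropy(p, q));  // ~1.204 nats
        System.out.println(crossEntropy(q, p));  // ~0.693 nats: not symmetric
    }
}
```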
The cross-entropy method for optimization problems
In optimization problems, we are looking for inputs that maximize the performance function. The main problem is that this maximum is unknown beforehand. The stopping point is when the sample has a small relative standard deviation.
The method was successfully applied to a variety of graph optimization problems [Rubinstein]:
o MAX-CUT
o Traveling salesman
o ...
Illustration
(Figure: the performance function plotted twice: under the uniform distribution at the starting point, and under the updated distribution after an iteration.)
The setting in graphs
In graph problems, we have the following:
o The space is all paths in the graph G.
o A performance function f gives each path a value.
o We are looking for a path that maximizes f.
In each iteration, we choose the best part Q of the sample. The probability update formula for an edge e = (v,w) is:
f'(e) = (#paths in Q that use e) / (#paths in Q that go via v)
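The update formula can be sketched directly. Paths here are arrays of node ids; this representation is illustrative, not the tool's actual one.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the edge-probability update: for each edge e = (v, w),
// f'(e) = (#paths in Q that use e) / (#paths in Q that go via v),
// where Q is the elite (best) part of the sample.
public class EdgeUpdate {
    static Map<String, Double> update(List<int[]> elite) {
        Map<String, Integer> edgeUse = new HashMap<>();
        Map<Integer, Integer> nodeUse = new HashMap<>();
        for (int[] path : elite) {
            for (int i = 0; i + 1 < path.length; i++) {
                edgeUse.merge(path[i] + "->" + path[i + 1], 1, Integer::sum);
                nodeUse.merge(path[i], 1, Integer::sum);  // paths going via v
            }
        }
        Map<String, Double> prob = new HashMap<>();
        for (Map.Entry<String, Integer> e : edgeUse.entrySet()) {
            int v = Integer.parseInt(e.getKey().split("->")[0]);
            prob.put(e.getKey(), (double) e.getValue() / nodeUse.get(v));
        }
        return prob;
    }

    public static void main(String[] args) {
        // Elite sample Q: two paths through a small DAG.
        Map<String, Double> p = update(List.of(new int[]{0, 1, 2}, new int[]{0, 2}));
        System.out.println(p.get("0->1"));  // 0.5: one of two paths via node 0
        System.out.println(p.get("1->2"));  // 1.0: the only path via node 1
    }
}
```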
Cross-entropy for testing
A program is viewed as a graph. Each decision point is a node in the graph. Decision points can result from any non-deterministic or otherwise not predetermined decisions: concurrency, coin tossing, inputs.
The performance function is defined according to the bug that we want to find.
o More on that later ...
Our implementation
We focus on concurrent programs.
o A program under test is represented as a graph, with nodes being the synchronization points.
o Edges are possible transitions between nodes.
o The graph is assumed to be a DAG: all loops are unwound.
o The graph is constructed on-the-fly during the executions.
o The initial probability distribution is uniform among edges.
o We collect a sample of several hundred executions.
o We adjust the probabilities of edges according to the formula.
o We repeat the process until the sample has a very small relative standard deviation (1-5%).
Note: this works only if there is a correct locking policy.
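The stopping test at the end of the loop can be sketched as a relative-standard-deviation check. The 1-5% threshold is the one mentioned above; the rest of the names are illustrative.

```java
// Sketch of the stopping criterion: iterate until the relative standard
// deviation (std dev / mean) of the sample's performance values falls
// below a small threshold, e.g. the 1-5% range mentioned above.
public class StoppingCriterion {
    static double relativeStdDev(double[] values) {
        double mean = 0;
        for (double v : values) mean += v;
        mean /= values.length;
        double var = 0;
        for (double v : values) var += (v - mean) * (v - mean);
        var /= values.length;                 // population variance
        return Math.sqrt(var) / mean;
    }

    static boolean converged(double[] values, double threshold) {
        return relativeStdDev(values) < threshold;
    }

    public static void main(String[] args) {
        // Nearly identical performance values: converged at a 5% threshold.
        System.out.println(converged(new double[]{100, 101, 99, 100}, 0.05));
        // Widely spread values: keep iterating.
        System.out.println(converged(new double[]{1, 100, 3, 50}, 0.05));
    }
}
```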
Dealing with loops
Unwinding all loops creates a huge graph. Problems with huge graphs:
o Takes more space to represent.
o Takes more time to converge.
We assume that most of the time, we are doing the same thing on subsequent iterations of the loop.
We introduce a modulo parameter. It reduces the size of the graph dramatically, but also loses information. There is a balance between a too-small and a too-large modulo parameter that is found empirically.
For instance, modulo 2 creates two nodes for each location inside the loop, one for even and one for odd iterations (i mod 2):

for i = 1 to 100 do
  sync node;
end for

becomes two graph nodes: "sync node (odd)" and "sync node (even)".
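Under this scheme, a synchronization point inside a loop maps to one of m graph nodes according to the iteration counter. A minimal sketch with illustrative naming:

```java
// Sketch: with modulo parameter m, a synchronization point inside a loop
// becomes m graph nodes, one per residue class of the iteration counter.
// For m = 2 this yields the odd/even split shown above.
public class ModuloNode {
    static String nodeId(String location, int iteration, int m) {
        return location + "#" + (iteration % m);
    }

    public static void main(String[] args) {
        System.out.println(nodeId("sync", 3, 2));  // sync#1 (odd iterations)
        System.out.println(nodeId("sync", 4, 2));  // sync#0 (even iterations)
        // Iterations 3 and 5 collapse to the same node, losing the
        // distinction between them: this is the information trade-off.
        System.out.println(nodeId("sync", 5, 2));  // sync#1
    }
}
```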
Bugs and performance functions

bug                  | performance function
---------------------|-------------------------------------
buffer overflow      | number of elements in the buffer
deadlock             | number of locks
data race            | number of accessed shared resources
testing error paths  | number of error paths taken

Note that we can also test for patterns, not necessarily bugs.
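For instance, the buffer-overflow row of the table could be realized by scoring an execution by the maximum buffer occupancy it reached. A minimal sketch; the trace representation is made up.

```java
// Illustrative performance function for the buffer-overflow case: score an
// execution trace by the maximum number of elements the buffer held at any
// point. Higher-scoring executions steer sampling toward overflow.
public class BufferPerformance {
    static int performance(int[] occupancyTrace) {
        int max = 0;
        for (int v : occupancyTrace) max = Math.max(max, v);
        return max;
    }

    public static void main(String[] args) {
        // A trace of buffer sizes observed during one execution.
        System.out.println(performance(new int[]{0, 1, 2, 1, 3, 2}));  // 3
    }
}
```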
Implementation – in Java for Java
(Architecture diagram: the instrumented program under test; Stopper, Decider, Evaluator, and Updater components; a probability distribution table stored on disk.)
Experimental results
o We ran ConCEnter on several examples with buffer overflow and with deadlocks.
o The bugs were very rare and did not manifest themselves in random testing. ConCEnter found the bugs successfully.
o The method requires significant tuning: the modulo parameter, the smoothing parameter, correct definition of the performance function, etc.

Example: A-B-push-pop. Each thread has one of two types, A or B; under random scheduling, the probability of stack overflow is exponentially small.

myName = A  // or B, there are two types
loop:
  if (top_of_stack = myName) pop;
  else push(myName);
end loop;
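The schedule dependence can be seen in a deterministic simulation of the example (a sketch: each step runs one thread, which pops if its own letter is on top of the stack and pushes it otherwise).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Deterministic simulation of A-B-push-pop: at each step the scheduled
// thread pops if its own letter is on top of the stack, else pushes it.
// A strictly alternating schedule grows the stack on every step, but such
// schedules are exponentially unlikely under random scheduling, which is
// why the overflow is so hard to hit by chance.
public class PushPopSim {
    static int depthAfter(String schedule) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char t : schedule.toCharArray()) {
            if (!stack.isEmpty() && stack.peek() == t) stack.pop();
            else stack.push(t);
        }
        return stack.size();
    }

    public static void main(String[] args) {
        System.out.println(depthAfter("ABABABAB"));  // 8: stack grows each step
        System.out.println(depthAfter("AABB"));      // 0: pushes cancel out
    }
}
```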
Future work
o Automatic tuning.
o Making ConCEnter plug-and-play for some predefined bugs.
o Replay: can we use distance from a predefined execution as a performance function? (This works already.)
o Second best: what if there are several areas in the graph where the maximum is reached?
o What are the restrictions on the performance function in order for this method to work properly? (It seems that the function should be smooth enough.)
Related work
Testing (nothing specifically targeted to rare bugs):
o Random testing
o Stress testing
o Noise makers
o Coverage estimation
o Bug-specific heuristics
o Genetic algorithms
o ...
Cross-entropy applications (cross-entropy is useful in many areas):
o Buffer allocation, neural computation, DNA sequence alignment, scheduling, graph problems, ...