A Randomized Scheduler with Probabilistic Guarantees of Finding Bugs

Sebastian Burckhardt, Microsoft Research

Pravesh Kothari, Indian Institute of Technology, Kanpur

Santosh Nagarakatte, University of Pennsylvania

Madanlal Musuvathi, Microsoft Research

What is Concurrency Testing?

Whether a test finds a bug depends on
◦ the configuration
◦ the inputs
◦ the schedule

Concurrency bugs are bugs that surface only for some schedules.

The Concurrency Testing Problem
◦ How to cover buggy schedules as best we can?
◦ Testing all schedules is infeasible!
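To give a feel for why testing all schedules is infeasible, here is a minimal sketch that counts interleavings. It assumes a simplified model that is not on the slides: n independent threads, each running the same number of atomic steps, with no synchronization that would rule out any ordering. The function name `count_schedules` is made up for this illustration.

```python
from math import factorial


def count_schedules(n_threads: int, steps_per_thread: int) -> int:
    """Number of distinct interleavings of n independent threads,
    each executing `steps_per_thread` atomic steps (multinomial coefficient)."""
    total_steps = n_threads * steps_per_thread
    return factorial(total_steps) // factorial(steps_per_thread) ** n_threads


print(count_schedules(2, 2))   # 6
print(count_schedules(2, 10))  # 184756
# Even a small program is out of reach: print only the number of decimal digits.
print(len(str(count_schedules(10, 100))))
```

Real programs have fewer legal schedules because of synchronization, but the growth is still far too fast to enumerate.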

Idea: Randomize the Schedule

void* p = 0;

Parent                  Child
CreateThd(child);       Init();
p = malloc(…);          DoMoreWork();
                        p->f ++;

1. Instrument code with calls to insert random delays

2. If we are lucky, delay exposes bugs

3. But: how long to delay? where not to delay?

void* p = 0;

Parent                  Child
RandDelay();            Init();
CreateThd(child);       RandDelay();
RandDelay();            DoMoreWork();
p = malloc(…);          RandDelay();
                        p->f ++;
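How lucky do we have to be? The sketch below is a coarse model, not the instrumentation from the slides: it replaces random delays with a scheduler that picks the next step uniformly at random among runnable threads, and computes the exact chance that the child reaches `p->f ++` before the parent's `malloc`. The step names and the function `bug_probability` are assumptions made for this illustration.

```python
from fractions import Fraction
from functools import lru_cache

PARENT = ("CreateThd", "malloc")
CHILD = ("Init", "DoMoreWork", "deref")  # "deref" stands for p->f ++


@lru_cache(maxsize=None)
def bug_probability(parent_pc: int = 0, child_pc: int = 0) -> Fraction:
    """Probability that deref runs before malloc when every step is taken
    by a uniformly chosen runnable thread."""
    runnable = []
    if parent_pc < len(PARENT):
        runnable.append("parent")
    # The child only exists once CreateThd (parent step 0) has executed.
    if parent_pc >= 1 and child_pc < len(CHILD):
        runnable.append("child")
    if not runnable:
        return Fraction(0)
    total = Fraction(0)
    for thread in runnable:
        if thread == "parent":
            total += bug_probability(parent_pc + 1, child_pc)
        elif CHILD[child_pc] == "deref":
            # Null dereference exactly when malloc has not run yet.
            total += Fraction(1) if parent_pc < len(PARENT) else Fraction(0)
        else:
            total += bug_probability(parent_pc, child_pc + 1)
    return total / len(runnable)


print(bug_probability())  # 1/8
```

In this model the child has to win three coin flips in a row, and every extra step before `p->f ++` halves the chance again.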

What is a Randomized Algorithm?

A randomized algorithm:
◦ “An algorithm that makes nondeterministic choices”
◦ An algorithm using a random source with a precisely defined distribution

A probabilistic guarantee:
◦ “A guarantee that doesn’t always hold”
◦ A lower bound on the probability of success

What we did / Talk Outline

1. Define bug depth in such a way that common bugs have low depth

2. Develop PCT algorithm (probabilistic concurrency testing), a randomized scheduling algorithm with a good probabilistic guarantee to find bugs of low depth

3. Build it into Cuzz, a concurrency fuzzing tool that improves the efficiency of stress testing

BUG DEPTH

Part I

Bug Depth

Bug Depth = the number of ordering constraints a schedule has to satisfy to find the bug.

More constraints means more things have to go “just right” to find the bug.

Conjecture: many typical bugs have low depth.

Let’s look at 3 examples.

Ordering Violation Example: A Bug of Depth 1

Bug depth = the number of ordering constraints sufficient to find the bug.

All schedules that satisfy the one ordering constraint (the child’s p->f ++ runs before the parent’s p = malloc()) find the bug.

Parent Thread           Child Thread
…                       …
start(child);           do_init();
p = malloc();           p->f ++;
…                       …
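The depth-1 claim can be checked by brute force on a toy model. The sketch below keeps only the four statements shown on the slide (the elided `…` parts are left out) and assumes the child can run only after `start(child)`. The helper `interleavings` is a name invented for this illustration.

```python
def interleavings(parent, child, spawn_label):
    """Yield every schedule in which the child runs only after `spawn_label`."""

    def go(pi, ci, started, prefix):
        if pi == len(parent) and ci == len(child):
            yield tuple(prefix)
            return
        if pi < len(parent):
            step = parent[pi]
            yield from go(pi + 1, ci, started or step == spawn_label, prefix + [step])
        if started and ci < len(child):
            yield from go(pi, ci + 1, started, prefix + [child[ci]])

    yield from go(0, 0, False, [])


PARENT = ["start(child)", "p = malloc()"]
CHILD = ["do_init()", "p->f++"]

schedules = list(interleavings(PARENT, CHILD, "start(child)"))
# One ordering constraint: the child's dereference precedes the parent's malloc.
buggy = [s for s in schedules if s.index("p->f++") < s.index("p = malloc()")]
print(len(buggy), "of", len(schedules))  # 1 of 3
```

A single constraint is enough to single out the failing schedule, which is what makes this a bug of depth 1.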

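Atomicity Violation Example: A Bug of Depth 2

Bug depth = the number of ordering constraints sufficient to find the bug.

All schedules that satisfy both ordering constraints (the parent’s p != null check runs before the child’s p = null, which in turn runs before the parent’s p->f++) find the bug.

Parent Thread           Child Thread
p = malloc();           …
start(child);           p = null;
…                       …
if (p != null)
    p->f++;
…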
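A small simulation of this example, under the assumption that the null check and the dereference are separate atomic steps and that the elided `…` parts do nothing relevant. The step labels and the function `crashes` are invented for this sketch.

```python
PARENT = ["malloc", "start", "check", "deref"]
CHILD = ["null"]


def crashes(schedule):
    """Execute one schedule; return True if it dereferences a null p."""
    p = None
    passed_check = False
    for step in schedule:
        if step == "malloc":
            p = object()
        elif step == "null":
            p = None
        elif step == "check":
            passed_check = p is not None
        elif step == "deref" and passed_check and p is None:
            return True
    return False


# The child's single step can land anywhere after start(child).
first_slot = PARENT.index("start") + 1
schedules = [
    PARENT[:pos] + CHILD + PARENT[pos:]
    for pos in range(first_slot, len(PARENT) + 1)
]

for s in schedules:
    both_constraints = s.index("check") < s.index("null") < s.index("deref")
    print(s, "crash" if crashes(s) else "ok", both_constraints)
```

Only the schedule that places `p = null` between the check and the dereference crashes, so both constraints are needed and the bug has depth 2.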

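Deadlock Example: A Bug of Depth 2

Bug depth = the number of ordering constraints sufficient to find the bug.

All schedules that satisfy both ordering constraints (the parent’s Lock(A) runs before the child’s Lock(A), and the child’s Lock(B) runs before the parent’s Lock(B)) find the bug.

Parent Thread           Child Thread
…                       …
Lock(A);                Lock(B);
…                       …
Lock(B);                Lock(A);
…                       …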
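The deadlock can be found by exploring every schedule of a toy model. The slide shows only the Lock calls, so the unlock steps below are an assumption added so that non-deadlocking schedules can run to completion. `explore` and the step encoding are invented for this sketch.

```python
# Unlock steps are assumed; the slide shows only the Lock calls.
PROGRAM = {
    "parent": [("lock", "A"), ("lock", "B"), ("unlock", "B"), ("unlock", "A")],
    "child": [("lock", "B"), ("lock", "A"), ("unlock", "A"), ("unlock", "B")],
}


def explore(pc, owner, trace, results):
    """Depth-first search over all schedules; record (trace, deadlocked)."""
    enabled = []
    for thread, steps in PROGRAM.items():
        if pc[thread] < len(steps):
            op, lock = steps[pc[thread]]
            if op == "unlock" or owner[lock] is None:
                enabled.append(thread)
    if not enabled:
        # Nothing can move: a deadlock if some thread still has steps left.
        stuck = any(pc[t] < len(PROGRAM[t]) for t in PROGRAM)
        results.append((tuple(trace), stuck))
        return
    for thread in enabled:
        op, lock = PROGRAM[thread][pc[thread]]
        new_owner = dict(owner)
        new_owner[lock] = thread if op == "lock" else None
        new_pc = dict(pc)
        new_pc[thread] += 1
        explore(new_pc, new_owner, trace + [(thread, op, lock)], results)


results = []
explore({"parent": 0, "child": 0}, {"A": None, "B": None}, [], results)
deadlocks = [trace for trace, stuck in results if stuck]
print(len(deadlocks), "of", len(results))  # 2 of 6
```

Both deadlocking schedules have each thread take its first lock before either takes its second, which is exactly the pair of ordering constraints.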

THE PCT ALGORITHM

Part II

PCT Algorithm: Randomly Assign & Change Thread Priorities

Input:
  int k;         // no. of steps - guessed from previous runs
  int d;         // target bug depth - randomly chosen

State:
  int pri[];     // thread priorities
  int change[];  // when to change priorities
  int stepCnt;   // current step count

PCT::Init() {
  stepCnt = 0;
  foreach tid
    pri[tid] = rand() + d;
  for( i=0; i<d-1; i++ )
    change[i] = rand() % k;
}

PCT::RandDelay( tid ) {
  stepCnt ++;
  if stepCnt == change[i] for some i
    pri[tid] = i;
  if (tid is not highest pri enabled thread)
    spin;
}
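The pseudocode above can be tried out on a simulated program. The sketch below is not the Cuzz implementation: it replaces real threads with lists of labelled steps, draws distinct initial priorities d, d+1, … by shuffling, picks change points in 1..k, and lowers the running thread's priority to i after the step that hits change point i. `pct_run`, `finds_bug` and the toy `PROGRAM` (the depth-1 example) are names invented here.

```python
import random

# Toy program: (label, thread spawned by this step or None).
PROGRAM = {
    "parent": [("start(child)", "child"), ("p = malloc()", None)],
    "child": [("do_init()", None), ("p->f++", None)],
}


def pct_run(program, initially_enabled, k, d, rng):
    """Run one priority-based random schedule and return the executed steps."""
    order = list(program)
    rng.shuffle(order)
    # Distinct initial priorities, all >= d (larger number = higher priority).
    pri = {t: d + i for i, t in enumerate(order)}
    # d-1 priority change points, each a step count in 1..k.
    change = [rng.randint(1, k) for _ in range(d - 1)]
    pc = {t: 0 for t in program}
    enabled = set(initially_enabled)
    step_cnt = 0
    trace = []
    while True:
        runnable = [t for t in enabled if pc[t] < len(program[t])]
        if not runnable:
            return trace
        t = max(runnable, key=lambda name: pri[name])  # highest priority runs
        label, spawned = program[t][pc[t]]
        pc[t] += 1
        step_cnt += 1
        trace.append((t, label))
        if spawned is not None:
            enabled.add(spawned)
        for i, point in enumerate(change):
            if step_cnt == point:
                pri[t] = i  # below every initial priority, since i < d


def finds_bug(trace):
    labels = [label for _, label in trace]
    return labels.index("p->f++") < labels.index("p = malloc()")


rng = random.Random(0)
runs = 2000
hits = sum(finds_bug(pct_run(PROGRAM, ["parent"], k=4, d=1, rng=rng)) for _ in range(runs))
print(hits / runs)
```

With d = 1 there are no change points, so the bug shows up exactly when the child draws the higher priority, and the printed frequency should be close to 1/2.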

The PCT Guarantee

Given a program with
◦ n threads (~tens)
◦ k steps (~millions)
◦ a bug of depth d (1,2)

Each run PCT finds the bug with a probability p of at least

$p \geq \frac{1}{n \, k^{d-1}}$

(this is a worst-case guarantee)
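To get a feel for the numbers, here is a short sketch that evaluates the bound 1/(n·k^(d-1)) and, assuming independent runs, estimates how many runs are needed to hit the bug at least once with a chosen confidence. The function names and the sample values of n, k and d are illustrative assumptions, not figures from the talk.

```python
from fractions import Fraction
import math


def pct_bound(n: int, k: int, d: int) -> Fraction:
    """Worst-case per-run probability of finding a depth-d bug: 1 / (n * k^(d-1))."""
    return Fraction(1, n * k ** (d - 1))


def runs_for_confidence(p, confidence: float) -> int:
    """Independent runs needed so that at least one run finds the bug
    with probability >= confidence, given per-run success probability p."""
    if p >= 1:
        return 1
    return math.ceil(math.log(1 - confidence) / math.log(1 - float(p)))


print(pct_bound(2, 4, 1))        # 1/2
print(pct_bound(10, 1000, 2))    # 1/10000
print(runs_for_confidence(pct_bound(2, 4, 1), 0.95))  # 5
```

For depth 1 the bound does not depend on k at all, which is why ordering violations are found quickly even in long executions.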

THE CUZZ TOOL & RESULTS

Part III

How it Works

Intercept at synchronization points
◦ Detour Win32 synchronization calls
◦ Optionally instrument data accesses
◦ No manual instrumentation required

[Diagram: Program; Cuzz Randomized Algorithm; Win32 API; Kernel Scheduler; binary instrumentation for data accesses (optional)]
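The interception idea can be illustrated with a wrapper that calls a hook before every synchronization operation. This is only an analogy in Python: Cuzz detours Win32 calls in the binary, whereas the hypothetical `InstrumentedLock` class below wraps `threading.Lock` by hand, and its `on_sync_point` callback stands in for the point where a scheduler such as PCT would decide whether the calling thread may proceed.

```python
import threading


class InstrumentedLock:
    """Wraps threading.Lock and reports every acquire/release to a hook."""

    def __init__(self, name, on_sync_point):
        self._lock = threading.Lock()
        self._name = name
        self._on_sync_point = on_sync_point

    def acquire(self):
        # A real tool would let its scheduler delay the thread here.
        self._on_sync_point("acquire", self._name)
        self._lock.acquire()

    def release(self):
        self._on_sync_point("release", self._name)
        self._lock.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()


events = []
lock = InstrumentedLock("A", lambda op, name: events.append((op, name)))
with lock:
    pass
print(events)  # [('acquire', 'A'), ('release', 'A')]
```

Because the hook sits at the synchronization call itself, the code under test does not need to change, which matches the "no manual instrumentation" point above.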

Some Results

Practice Beats Worst-Case

Measured probability is often significantly better than the worst-case guaranteed probability

Why Does Practice Beat Worst-Case?

The worst-case guarantee applies to the hardest-to-find bug of a given depth.

If bugs can be found in multiple ways, probabilities add up!

Example: Increasing the number of threads helps:

[Figure: Measured Probability (y-axis, 0 to 0.02) versus Number of Threads (x-axis: 2, 3, 5, 9, 17, 33, 65)]

Internal Tool Status

The Cuzz tool is available internally at Microsoft

We are working with several product groups that actively use Cuzz to improve their stress testing

DEMO

Demo Conclusion

Measure probabilities on cluster

◦ Without Cuzz: 1 fail in 238,820 runs, ratio = 0.000004817

◦ With Cuzz: 12 fails in 320 runs, ratio = 0.0375

◦ Resource savings: factor 7,800

1 day of stress testing = 11 seconds of Cuzz testing
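The savings factor and the "11 seconds" figure follow from the two ratios. The sketch below redoes that arithmetic, taking the failure ratio without Cuzz exactly as quoted on the slide and assuming that a run costs about the same with and without Cuzz.

```python
ratio_without = 0.000004817  # failure ratio without Cuzz, as quoted on the slide
ratio_with = 12 / 320        # 12 fails in 320 runs with Cuzz

savings = ratio_with / ratio_without
seconds_per_day = 24 * 60 * 60

print(ratio_with)                         # 0.0375
print(round(savings, -2))                 # 7800.0
print(round(seconds_per_day / savings))   # 11
```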

Conclusions

Bug depth is a useful metric to focus testing efforts

Systematic randomization improves concurrency testing

No reason not to use Cuzz

Thank You For Your Attention.

Date post:	21-Feb-2016