Static Deadlock Detection in the Linux Kernel (RST 2004)

Peter T. Breuer & Marisol Garcia Valls

Universidad Carlos III de Madrid

Reliable Software Technologies – Ada Europe 2004

Formal Methods & Open Source

Do they mix? Not very much so far!

How can formal methods help open source?

Write simple static analysis tools, which use C, perl, sh, make and nothing else. Open!

Frightening naivete of the theorists

"For example in p. 13 you claim that the kernel does not treat an infinite number of user request[s] between DiskTQ events. This is not completely true ...

The challenge of quantity in the Linux kernel

3.5-4 million LOC
100-1000s of people working on it
10-100 patches a day
LKML: 200-500 messages a day
15 different architectures
SMP/NUMA/HiMEM flavours are different
Maybe 600 compilation options
Open development model

Chances are high someone just ruined your assumptions!

Code Fitness

Source code is invented to fill a niche but survives only if it can adapt to an ever-evolving context

Corollary: open code must have been legible, if it has attracted coders

Corollary: code must have evolved good practices that protect it from its ever-changing environment

Codes of practice

Thou shalt not

sleep in an interrupt
take the i/o lock in a request handler
take the same spinlock twice
take two different spinlocks in different orders, ever
request memory in a request handler
use low-level memory flags and structs

Bad news - sleep under spinlock

Common coding error

code takes spinlock, then ...

... code schedules itself or is scheduled out of cpu, ...

... another thread comes in and spins against the spinlock

the only thread that will release the spinlock is not running!
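The pattern above can be illustrated with a minimal userspace sketch. The `spinlock_t`, `spin_lock()` and `spin_unlock()` below are stand-ins that only track state, not the kernel primitives, and `may_sleep()`, `dev_lock`, `buggy_update()`, `fixed_update()` and `violations()` are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Userspace stand-ins, NOT the kernel API: they only track state. */
typedef struct { int held; } spinlock_t;

static int locks_held;        /* spinlocks currently held by this thread   */
static int sleep_violations;  /* sleepy calls made while a lock was held   */

static void spin_lock(spinlock_t *l)   { l->held = 1; locks_held++; }
static void spin_unlock(spinlock_t *l) { l->held = 0; locks_held--; }

/* Stand-in for any function that may sleep, such as schedule(). */
static void may_sleep(void)
{
    if (locks_held > 0)
        sleep_violations++;   /* in a real kernel this risks deadlock */
}

static spinlock_t dev_lock;   /* hypothetical driver lock */

/* Buggy: the sleepy call sits inside the spinlocked region. */
void buggy_update(void)
{
    spin_lock(&dev_lock);
    may_sleep();
    spin_unlock(&dev_lock);
}

/* Fixed: do the sleepy work first, then take the lock. */
void fixed_update(void)
{
    may_sleep();
    spin_lock(&dev_lock);
    spin_unlock(&dev_lock);
}

int violations(void) { return sleep_violations; }
```

Calling `buggy_update()` raises the violation count by one, while `fixed_update()` leaves it unchanged. The two functions differ only in the order of two lines.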

Authors commit the mistake by mistake

through using an opaque macro

ignorance

Prototype Static Analysis Tool

detect "sleep under spinlock"

perform analysis compositionally

determine when a piece of code is spinlocked
determine when a function may sleep
is a sleepy function called in a spinlocked region?

Runtime tests are available, but

kernel needs to be specially compiled

people who observe a problem rarely are the author

observers rarely know what they are observing

This is simple: Sleepy functions

A function may sleep if it calls

another function that has already been determined to sleep, or

one of a set of basically sleepy functions: schedule(), kmalloc(), wait_on_buffer(), ...

The transitive closure is complicated by resolving local vs global references
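The closure described above can be sketched as a fixed-point iteration over a call graph. This is a minimal illustration, not the tool's code: `struct call_graph`, `cg_add()`, `cg_call()` and `cg_propagate()` are hypothetical names, the graph size is fixed, and the local-versus-global name resolution mentioned above is ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FUNCS 16   /* arbitrary bound for this sketch */

struct call_graph {
    int n;                              /* number of functions known     */
    const char *name[MAX_FUNCS];
    bool calls[MAX_FUNCS][MAX_FUNCS];   /* calls[i][j]: i calls j        */
    bool sleepy[MAX_FUNCS];             /* true if the function may sleep */
};

/* Register a function; basically_sleepy seeds the set (e.g. schedule()). */
int cg_add(struct call_graph *g, const char *name, bool basically_sleepy)
{
    assert(g->n < MAX_FUNCS);
    g->name[g->n] = name;
    g->sleepy[g->n] = basically_sleepy;
    return g->n++;
}

/* Record that caller contains a call to callee. */
void cg_call(struct call_graph *g, int caller, int callee)
{
    g->calls[caller][callee] = true;
}

/* Mark every function that calls a sleepy function, until nothing changes. */
void cg_propagate(struct call_graph *g)
{
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < g->n; i++) {
            if (g->sleepy[i])
                continue;
            for (int j = 0; j < g->n; j++) {
                if (g->calls[i][j] && g->sleepy[j]) {
                    g->sleepy[i] = true;
                    changed = true;
                    break;
                }
            }
        }
    }
}
```

After `cg_propagate()`, a driver function that reaches `schedule()` only through a helper is marked sleepy, and a function with no path to a sleepy function stays unmarked.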

This is difficult: Spinlock Analysis

Based on "3-phase" logic

code a ; b has both normal and exceptional flows

[Figure: a followed by b; the standard (normal) flow passes from a into b, and each of a and b also has an exception exit]

precondition/postcondition {p} a {n, e}

{p} a {i, e} ; {i} b {n, e} => {p} a;b {n, e}

2 (or 3?) programming semantics

normal flow N: "falling off the end of a piece of code"
exceptional flow to a return R
exceptional flow to a break B

{p} a {n, e} iff {p} N a {n} & {p} Ex a {e}

[Figure: a and b in sequence, each with an exception exit (Ex) and a normal exit (N)]

Semantics of sequence

Normal flow: the 1st terminates, the 2nd begins

{p} N a {i} & {i} N b {q} => {p} N(a;b) {q}

Exceptional flow: exception in the 1st, or the 1st terminates normally and exception in the 2nd

{p} Ex a {e} & {p} N a {i} & {i} Ex b {e} => {p} Ex(a;b) {e}

Application - counting spinlocks

{p} == {spinlock count = n}

Counting the spinlock imbalance

Normal flow

N(spin_lock(x)) = +1

N(spin_unlock(x)) = -1

cannot break or return from a spin_[un]lock call

N(a ; b) = N(a) + N(b)

conservative overestimate

careful accounting keeps it accurate

Simplifying Assumption

There is at most one spinlock imbalance in any code considered, and it is positive

effectively every loop is traversed only once

    spin_lock();
    while (1) {
        ...
        if (...) {
            spin_unlock();
            break;
        }
        ...
    }

balanced

if the imbalance is in the loop, only need 1 traversal to see it

if it is outside the loop, the loop counts 0 imbalance

Imbalance in a subroutine

Ra (the flow to a return) is the imbalance in a routine

R(a;b) = max(Ra, Na + Rb)
R(if a else b) = max(Ra, Rb)
R(while a b) = Rb
R(return x) = 0

Normal flow across infinite loop is break flow

N(while (1) b) = B(b)
N(if a then b) = max(Na, Nb)

B(a;b) = max(Ba, Na + Bb)
B(if a then b) = max(Ba, Bb)
B(break) = 0
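The rules above can be turned into a small recursive evaluator. The sketch below is one possible reading, not the analyser's implementation. It assumes a hypothetical syntax tree (`struct node`), treats both children of an `IF` node as alternative branches, and introduces a `NONE` value for "this flow cannot occur" (for example, the normal flow out of a `return`), which the slides leave implicit.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: NONE marks a flow that cannot happen; it is the identity for max. */
#define NONE INT_MIN

enum kind { SKIP, LOCK, UNLOCK, RETURN, BREAK, SEQ, IF, LOOP };

struct node {
    enum kind kind;
    const struct node *a, *b;  /* SEQ: a;b   IF: two branches   LOOP: body in a */
};

static int max2(int x, int y) { return x > y ? x : y; }
static int add(int x, int y)  { return (x == NONE || y == NONE) ? NONE : x + y; }

int N(const struct node *s);
int B(const struct node *s);
int R(const struct node *s);

/* Imbalance along the normal flow ("falling off the end"). */
int N(const struct node *s)
{
    assert(s != NULL);
    switch (s->kind) {
    case SKIP:   return 0;
    case LOCK:   return +1;
    case UNLOCK: return -1;
    case SEQ:    return add(N(s->a), N(s->b));
    case IF:     return max2(N(s->a), N(s->b));
    case LOOP:   return B(s->a);      /* N(while (1) b) = B(b) */
    default:     return NONE;         /* RETURN and BREAK never fall through */
    }
}

/* Imbalance along the flow to a break. */
int B(const struct node *s)
{
    assert(s != NULL);
    switch (s->kind) {
    case BREAK:  return 0;
    case SEQ:    return max2(B(s->a), add(N(s->a), B(s->b)));
    case IF:     return max2(B(s->a), B(s->b));
    default:     return NONE;         /* a loop absorbs its own breaks */
    }
}

/* Imbalance along the flow to a return. */
int R(const struct node *s)
{
    assert(s != NULL);
    switch (s->kind) {
    case RETURN: return 0;
    case SEQ:    return max2(R(s->a), add(N(s->a), R(s->b)));
    case IF:     return max2(R(s->a), R(s->b));
    case LOOP:   return R(s->a);      /* loop traversed once */
    default:     return NONE;
    }
}

/* spin_lock(); while (1) { if (...) { spin_unlock(); break; } } */
static const struct node n_lock   = { LOCK,   NULL, NULL };
static const struct node n_unlock = { UNLOCK, NULL, NULL };
static const struct node n_break  = { BREAK,  NULL, NULL };
static const struct node n_skip   = { SKIP,   NULL, NULL };
static const struct node n_exit   = { SEQ,  &n_unlock, &n_break };
static const struct node n_if     = { IF,   &n_exit,   &n_skip };
static const struct node n_loop   = { LOOP, &n_if,     NULL };
const struct node slide_example   = { SEQ,  &n_lock,   &n_loop };
```

For `slide_example`, the break flow of the loop body carries -1, so the normal flow of the whole fragment is +1 - 1 = 0, matching the "balanced" verdict for that loop.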

Spotting sleep under spinlock

As the calculation of Ra proceeds, the spinlock imbalance from the function entry point to every point in the code is calculated

if a call to a sleepy function occurs when the imbalance is positive ...

... sleep under spinlock is detected
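For straight-line code the check reduces to a running count, sketched below. The `struct step` representation and the function `first_sleep_under_spinlock()` are hypothetical; branches and loops would need max-based rules for combining flows rather than this simple walk, and the `sleepy` flag is assumed to come from a separate sleepy-function analysis.

```c
#include <stdbool.h>
#include <stddef.h>

enum op { OP_LOCK, OP_UNLOCK, OP_CALL };

struct step {
    enum op op;
    bool sleepy;   /* for OP_CALL: is the callee known to sleep? */
    int line;      /* source line, for reporting                 */
};

/*
 * Walk a straight-line sequence, tracking the spinlock imbalance since entry.
 * Returns the line of the first sleepy call made while the imbalance is
 * positive, or 0 if there is none.
 */
int first_sleep_under_spinlock(const struct step *code, size_t n)
{
    int imbalance = 0;

    for (size_t i = 0; i < n; i++) {
        switch (code[i].op) {
        case OP_LOCK:
            imbalance++;
            break;
        case OP_UNLOCK:
            imbalance--;
            break;
        case OP_CALL:
            if (code[i].sleepy && imbalance > 0)
                return code[i].line;
            break;
        }
    }
    return 0;
}
```

A sequence that makes its sleepy call before taking the lock yields 0. Swapping the call and the lock so the call falls inside the locked region yields the line of the call.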

In Practice ...

592 line driver expands to 28165 lines with includes, macro expansions

analyser detects 10000 refs to functions and local variables

finds 6 sleepy functions in include chain

finds 3 sleepy driver functions

swap 2 lines in the driver and it flags sleep under spinlock

doesn't spot that one sleepy driver function would be called from the kernel under spinlock (not given the caller to analyse)

would miss calls made through dynamically assigned methods

Future tests

Spinlock under spinlock

spinlock A, spinlock B

must always be taken in the same order

different authors may not do so!

need extra type information on spinlocks

spinlock A is used under spinlock B
spinlock B may not be used under spinlock A
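One way such ordering information might be recorded is a table of "taken under" pairs, sketched below. This is an illustration of the proposed future test, not something the slides say the tool does: `struct lock_order` and `record_nesting()` are hypothetical, locks are identified by small integers, and only direct pairwise inversions are caught (longer cycles such as A under B, B under C, C under A are not).

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LOCKS 8   /* arbitrary bound for this sketch */

struct lock_order {
    /* under[a][b]: lock a has been seen taken while lock b was held */
    bool under[MAX_LOCKS][MAX_LOCKS];
};

/*
 * Record that `inner` is taken while `outer` is held.
 * Returns false if the opposite nesting has been recorded, i.e. the two
 * locks are not always taken in the same order.
 */
bool record_nesting(struct lock_order *o, int inner, int outer)
{
    assert(inner >= 0 && inner < MAX_LOCKS);
    assert(outer >= 0 && outer < MAX_LOCKS);

    o->under[inner][outer] = true;
    return !o->under[outer][inner];
}
```

Recording "A under B" repeatedly is accepted. A later "B under A" returns false, flagging the order inversion.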

Summary

C code analyser detects spinlock abuses in the Linux kernel

cheap - needs C compiler + Makefile

low barrier to use

Better 1% useful to 1000 authors than 100% useful to 10

"real C" is significantly difficult to handle!

Postscript

Putting formal methods into open source does not require the use of economic arguments - no money!

The formal methods must use cheap, available tools and platforms.

They must be open, so that people can get interested in them and improve them.

They must be open so that people can maintain them as the world changes.

They must save debugging time, not development...

Date post:	18-Nov-2014
