Beyond Reachability: Shape Abstraction in the Presence of Pointer Arithmetic
Hongseok Yang (Queen Mary, University of London)
(Joint work with Dino Distefano, Cristiano Calcagno and Peter O’Hearn)
Dream
Automatically verify the memory safety of systems code, such as device drivers and memory managers.
Challenges:
1. Pointer arithmetic.
2. Scalability.
3. Concurrency.
Our Analyzer
Handles programs for dynamic memory management.
Experimental results (Pentium 1.6GHz, 512MB)
Proved memory safety and even partial correctness.
Found a hidden assumption of the K&R memory manager. These are “fixed” versions.
Hidden Assumption in K&R Malloc/Free
[Figure: memory layout with address markers 0 and 220, divided into Global Vars, Stack, and Heap]
Sample Analysis Result
Program: ans = malloc_bestfit_acyclic(n);
Precondition: n ≥ 2 ∧ mls(freep,0)
Postcondition:
(ans=0 ∧ n ≥ 2 ∧ mls(freep,0))
∨ (n ≥ 2 ∧ nd(ans,q’,n) * mls(freep,0))
∨ (n ≥ 2 ∧ nd(ans,q’,n) * mls(freep,q’) * mls(q’,0))
Rice's Theorem
Any nontrivial semantic property of programs is undecidable.
Multiword Lists
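[Figure: a multiword list starting at lp. The first word of each node is the Link Field and the second word is the Size Field. Nodes, written as address: (link, size), are 5: (15, 3); 15: (18, 3); 18: (24, 5); 24: (nil, 2).]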
Coalescing
p = lp;
while (p != 0) {
  local q = *p;
  if (p + *(p+1) == q) {
    *(p+1) = *(p+1) + *(q+1);
    *p = *q;
  } else {
    p = q;
  }
}
[Animation: the loop runs on the list 5: (15, 3); 15: (18, 3); 18: (24, 5); 24: (nil, 2).]
1. p = 5, q = 15: 5 + 3 ≠ 15, so p moves to 15.
2. p = 15, q = 18: 15 + 3 = 18, so the two nodes are coalesced. The size field at 15 becomes 3 + 5 = 8, then the link field at 15 becomes 24.
3. p = 15, q = 24: 15 + 8 ≠ 24, so p moves to 24.
4. p = 24, q = nil: p moves to nil and the loop ends with p = 0.
Resulting list: 5: (15, 3); 15: (24, 8); 24: (nil, 2).
Nodeful High-level View
Nodeless Low-level View
Complex numerical relationships are used only for reconstructing a high-level view.
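Separation Logic
blk(p+2,p+5) [Figure: a block of cells running from p+2 up to p+5]
nd(p,q,5) =def (p ↦ q) * (p+1 ↦ 5) * blk(p+2,p+5) [Figure: a node at p with link field q and size field 5, ending at p+5]
mls(p,q) [Figure: a multiword list segment from p to q, with nodes of sizes 3, 4 and 2]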
Program Analysis
A collecting, approximate execution of programs that always terminates.
while(x < y) {
x = x+1;
}
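[Figure: two concrete executions, from x:1,y:2 to x:2,y:2 and from x:2,y:4 to x:4,y:4, next to one approximate execution that covers both: x:+,y:+ at the loop head, x:+,y:+,x<y inside the loop body, and x:+,y:+,x>=y at the loop exit.]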
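Our Analysis
while(B) {
  C;
}
Nodeful View, P(CanSymH): {T1,T2,…,Tn} before the loop body, {T’1,T’2,…,T’m} after it.
Nodeless View, P(SymH): {Q1,Q2,…,Qn} before the loop body, {Q’1,Q’2,…,Q’m} after it.
Rearrangement: from the nodeful {T1,T2,…,Tn} to the nodeless {Q1,Q2,…,Qn}.
Sym. Execution: from {Q1,Q2,…,Qn} to {Q’1,Q’2,…,Q’m}.
Abstraction: from the nodeless {Q’1,Q’2,…,Q’m} to the nodeful {T’1,T’2,…,T’m}.
Example: Abstraction turns
y=x+z ∧ x ↦ y * x+1 ↦ z * blk(x+2,y) * mls(y,0)
into
nd(x,y,z) * mls(y,0)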
Abstraction Function Abs
Abs : SymH → CanSymH
1. Package all nodes.
2. Drop numerical relationships.
3. Combine two connected multiword lists.
Example:
Input: (5 ≤ x+x ∧ p+3=z’) ∧ (p ↦ q’ * p+1 ↦ 3 * blk(p+2,z’) * mls(q’,0))
After step 1: (5 ≤ x+x ∧ p+3=z’) ∧ (nd(p,q’,3) * mls(q’,0))
After step 2: nd(p,q’,3) * mls(q’,0)
After step 3: mls(p,0)
Coalescing
while (p!=0) {
  local q = *p;
  if (p + *(p+1) == q) {
    *(p+1) = *(p+1) + *(q+1);
    *p = *q;
  } else {
    p = *p;
  }
}
Symbolic execution of the loop body (then-branch):
At the loop head: mls(lp,p) * mls(p,0) …
After local q = *p: p!=0 ∧ mls(lp,p) * p ↦ q,s’ * blk(p+2,p+s’) * mls(q,0)
After the test p + *(p+1) == q: p!=0 ∧ p+s’=q ∧ mls(lp,p) * p ↦ q,s’ * blk(p+2,p+s’) * mls(q,0)
After *(p+1) = *(p+1) + *(q+1): p!=0 ∧ p+s’=q ∧ mls(lp,p) * p ↦ q,s’+t’ * blk(p+2,p+s’) * q ↦ r’,t’ * blk(q+2,q+t’) * mls(r’,0)
After *p = *q: p!=0 ∧ p+s’=q ∧ mls(lp,p) * p ↦ r’,s’+t’ * blk(p+2,p+s’) * q ↦ r’,t’ * blk(q+2,q+t’) * mls(r’,0)
Abstraction at the end of the loop body:
q goes out of scope and becomes q’: p!=0 ∧ p+s’=q’ ∧ mls(lp,p) * p ↦ r’,s’+t’ * blk(p+2,p+s’) * q’ ↦ r’,t’ * blk(q’+2,q’+t’) * mls(r’,0)
Package the node at q’: p!=0 ∧ p+s’=q’ ∧ mls(lp,p) * p ↦ r’,s’+t’ * blk(p+2,p+s’) * nd(q’,r’,t’) * mls(r’,0)
Package the node at p, absorbing the node at q’: p!=0 ∧ p+s’=q’ ∧ mls(lp,p) * nd(p,r’,s’+t’) * mls(r’,0)
Drop numerical relationships: mls(lp,p) * nd(p,r’,s’+t’) * mls(r’,0)
Combine connected multiword lists: mls(lp,p) * mls(p,0)
Soundness
Analysis results can be compiled into separation-logic proofs.
Conclusion: Analysis Design
1. Pick a class of target programs.
2. Observe properties of target programs.
3. Design a computable approximate semantics.