
Mobility, Security, and Proof-Carrying Code Peter Lee Carnegie Mellon University

Page 1: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Mobility, Security, and Proof-Carrying Code
Peter Lee
Carnegie Mellon University

Lecture 4
July 13, 2001

LF, Oracle Strings, and Proof Tools

Lipari School on Foundations of Wide Area Network Programming

Page 2: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

“Applets, Not Craplets”

A Demo

Page 3: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Cedilla Systems Architecture

Code producer Host

Ginseng

Native code

Proof

Special J

Java binary

~52KB, written in C

Written in OCaml

Page 4: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Annotations

Cedilla Systems Architecture

Code producer Host

Proof checker

VCGen

Axioms

Native code

Proof

VC

Special J

Java binary

Page 5: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Annotations

Cedilla Systems Architecture

Code producer Host

Java binary

Proof generator

Proof checker

VCGen

Axioms

Axioms

Certifying compiler

VCGen

VC

Native code

Proof

VC

Page 6: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Java Virtual Machine

JVM

Java Verifier

JNI

Class file Class file

Native code

Proof-carrying code

Checker

Page 7: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Show either the Mandelbrot or NBody3D demo.

Page 8: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Crypto Test Suite Results [Cedilla Systems]

[Bar chart: running time in seconds (axis from 0 to 0.9) for Cedilla, Java, and JIT on the benchmarks IDEA, IJCEInstall, SAFER, Base64Stream, LOKI91, SHA0, MD2, SHA1, MD4, SPEED, DES, MD5, DES_EDE3, RC2, Square, RC4, UnixCrypt, HAVAL, RIPEMD128, HMAC, RIPEMD160.]

On average, 158% faster than Java, 72.8% faster than Java with a JIT.

Page 9: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Java Grande Suite v2.0 [Cedilla Systems]

[Bar chart: running time in seconds (axis from 0 to 700) for Cedilla, Java, and JIT on the benchmarks crypt, fft, heapsort, lufact, search, series, sor, sparse.]

Page 10: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Java Grande Bench Suite [Cedilla Systems]

[Bar chart: axis labeled "ops" (from 0 to 5,000,000) for Cedilla, Java, and JIT on the benchmarks arith, assign, method.]

Page 11: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Ginseng

VCGen: ~15KB, roughly similar to a KVM verifier (but with floating-point).

Checker: ~4KB, generic.

Safety Policy: ~19KB, declarative and machine-generated.

Dynamic loading and cross-platform support: ~22KB, some optional.

Page 12: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Practical Considerations

Page 13: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Trusted Computing Base

The trusted computing base is the software infrastructure that is responsible for ensuring that only safe execution is possible. Obviously, any bugs in the TCB can lead to unsafe execution. Thus, we want the TCB to be simple, as well as fast and small.

Page 14: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

VCGen’s Complexity

Fortunately, we shall see that proofchecking can be quite simple, small, and fast.

VCGen, at core, is also simple and fast.

But in practice it gets to be quite complicated.

Page 15: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

VCGen’s Complexity

Some complications:

If dealing with machine code, then VCGen must parse machine code.

Maintaining the assumptions and current context in a memory-efficient manner is not easy.

Note that Sun’s kVM does verification in a single pass and only 8KB RAM!

Page 16: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

VC Explosion

if a == b then a := x else c := x
if a == c then a := y else c := y
f(a,c)

(a=b => (x=c => safef(y,c)) /\ (x<>c => safef(x,y))) /\
(a<>b => (a=x => safef(y,x)) /\ (a<>x => safef(a,y)))

Exponential growth in the size of the VC is possible.

Page 17: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

VC Explosion

if a == b then a := x else c := x
INV: P(a,b,c,x)
if a == c then a := y else c := y
f(a,c)

(a=b => P(x,b,c,x)) /\
(a<>b => P(a,b,x,x)) /\
(forall a',c'. P(a',b,c',x) =>
    (a'=c' => safef(y,c')) /\ (a'<>c' => safef(a',y)))

Growth can usually be controlled by careful placement of just the right “join-point” invariants.
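The contrast between the VC Explosion slides can be sketched in Python. This is an illustration only: the nested-tuple formula encoding and the names `size`, `substitute`, `diamond`, `naive_vc`, and `vcs_with_invariants` are assumptions made for this example and are not part of VCGen. The sketch repeats the two-branch program from the slides and compares the total VC size with and without an invariant at every join point.

```python
def size(formula):
    """Count nodes in a formula written as nested tuples (operator first)."""
    if isinstance(formula, tuple):
        return 1 + sum(size(part) for part in formula[1:])
    return 1


def substitute(formula, var, value):
    """Replace each occurrence of the variable `var` by `value`."""
    if isinstance(formula, tuple):
        return (formula[0],) + tuple(substitute(p, var, value) for p in formula[1:])
    return value if formula == var else formula


def diamond(cond, then_assign, else_assign, target):
    """VC of `if cond then v1 := e1 else v2 := e2`, with `target` required afterwards."""
    (v1, e1), (v2, e2) = then_assign, else_assign
    return ("and",
            ("implies", cond, substitute(target, v1, e1)),
            ("implies", ("not", cond), substitute(target, v2, e2)))


def naive_vc(diamonds, post):
    """No invariants: the whole downstream VC is copied into both arms."""
    vc = post
    for cond, then_assign, else_assign in reversed(diamonds):
        vc = diamond(cond, then_assign, else_assign, vc)
    return vc


def vcs_with_invariants(diamonds, invariant, post):
    """One obligation per diamond when `invariant` is asserted at every join.

    The universal quantification over the variables modified before the
    join (a', c' on the slide) is left implicit in this sketch.
    """
    obligations = []
    for index, (cond, then_assign, else_assign) in enumerate(diamonds):
        is_last = index == len(diamonds) - 1
        body = diamond(cond, then_assign, else_assign, post if is_last else invariant)
        obligations.append(body if index == 0 else ("implies", invariant, body))
    return obligations


# The two-branch program from the slides, ending in the call f(a, c).
two_diamonds = [
    (("eq", "a", "b"), ("a", "x"), ("c", "x")),
    (("eq", "a", "c"), ("a", "y"), ("c", "y")),
]
post = ("safef", "a", "c")
inv = ("P", "a", "b", "c", "x")

naive_sizes = [size(naive_vc(two_diamonds * n, post)) for n in (1, 2, 3)]
cut_sizes = [sum(size(o) for o in vcs_with_invariants(two_diamonds * n, inv, post))
             for n in (1, 2, 3)]
```

With this encoding `naive_sizes` is `[42, 198, 822]`, more than doubling with each added branch, while `cut_sizes` is `[42, 94, 146]`, growing by a constant amount per branch.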

Page 18: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Stack Slots

Each procedure will want to use the stack for local storage. This raises a serious problem because a lot of information is lost by VCGen (such as the value) when data is stored into memory. We avoid this problem by assuming that procedures use up to 256 words of stack as registers.

Page 19: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Exercise

8. Just as with loop invariants, our actual join-point invariants include a specification of the registers that might be modified since the dominating block.

Why might this be a useful thing to do? Why might it be a bad thing to do?

Page 20: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Callee-save Registers

Standard calling conventions dictate that the contents of some registers be preserved.

These callee-save registers are specified along with the pre/post-conditions for each procedure.

The preservation of their values must be verified at every return instruction.

Page 21: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Introduction to Efficient Representation and Validation of Proofs

Page 22: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

High-Level Architecture

Explanation

Code

Verification condition generator

Checker

Safety policy

Agent

Host

Page 23: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Goals

We would like a representation for proofs that is
compact,
fast to check,
requires very little memory to check, and
is “canonical” (in the sense of accommodating many different logics without requiring a total reimplementation of the checker).

Page 24: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Three Approaches

1. Direct representation of a logic.

2. Use of a Logical Framework.

3. Oracle strings.

We will reject (1). Today we introduce (2) and (3).

Page 25: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Logical Framework

For representation of proofs we use the Edinburgh Logical Framework (LF).

Page 26: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Reynolds’ Example

Skip?

Page 27: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Formal Proofs

Write “x is a proof of P” as x:P.

Examples of predicates P:
(for all A, B) A and B => B and A
(for all x, y, z) x < y and y < z => x < z

What do the proofs look like?

Page 28: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Inference Rules

We can write proofs by stitching together inference rules.

An example inference rule: If we have a proof x of P and a proof y of Q, then x and y together constitute a proof of P /\ Q.

Or, more compactly: if x:P, y:Q then (x,y):P*Q.

A proof, written in our compact notation.

Page 29: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

More Inference Rules

Another inference rule: Assume we have a proof x of P. If we can then obtain a proof b of Q, then we have a proof of P => Q.

if [x:P] b:Q then fn (x:P) => b : P -> Q.

More rules:
if x:P*Q then fst(x):P
if y:P*Q then snd(y):Q

Page 30: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Types and Proofs

So, for example:
fn (x:P*Q) => (snd(x), fst(x)) : P*Q -> Q*P

We can develop a full metalanguage based on this principle of “proofs as programs”.

Typechecking gives us proofchecking! Codified in the LF language.
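To make "typechecking gives us proofchecking" concrete, here is a minimal Python sketch of a checker for the four rules introduced on the preceding slides (pairing, fst, snd, and fn). The tuple encodings of propositions and proof terms and the names `infer` and `check` are assumptions made for this illustration. It is not LF, and it assumes proof terms are well-formed tuples.

```python
# Propositions: an atom is a string; ("*", P, Q) is conjunction;
# ("->", P, Q) is implication.
# Proof terms: a variable is a string; ("pair", x, y); ("fst", x);
# ("snd", x); ("fn", var, P, body).

def infer(term, context=None):
    """Return the proposition proved by `term`, or raise TypeError."""
    context = context or {}
    if isinstance(term, str):
        if term not in context:
            raise TypeError(f"unbound proof variable {term!r}")
        return context[term]
    tag = term[0]
    if tag == "pair":            # if x:P and y:Q then (x,y) : P*Q
        return ("*", infer(term[1], context), infer(term[2], context))
    if tag in ("fst", "snd"):    # if x:P*Q then fst(x):P and snd(x):Q
        ty = infer(term[1], context)
        if not (isinstance(ty, tuple) and ty[0] == "*"):
            raise TypeError(f"{tag} applied to a non-conjunction: {ty!r}")
        return ty[1] if tag == "fst" else ty[2]
    if tag == "fn":              # if [x:P] b:Q then fn (x:P) => b : P -> Q
        _, var, hyp, body = term
        return ("->", hyp, infer(body, {**context, var: hyp}))
    raise TypeError(f"unknown proof constructor {tag!r}")


def check(term, proposition):
    """A proof is accepted exactly when its inferred type is the claim."""
    try:
        return infer(term) == proposition
    except TypeError:
        return False


# fn (x:P*Q) => (snd(x), fst(x)) : P*Q -> Q*P
swap = ("fn", "x", ("*", "P", "Q"), ("pair", ("snd", "x"), ("fst", "x")))
swap_claim = ("->", ("*", "P", "Q"), ("*", "Q", "P"))
accepted = check(swap, swap_claim)
```

The checker never searches for a proof; it only recomputes the type of the term it is given, which is why the trusted component can stay small.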

Page 31: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

LFi

Skip?

Page 32: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

LF Example

This classic example illustrates how LF is used to represent the terms and rules of a logical system.

Page 33: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

LF Example in Elf Syntax

exp : type
pred : type
pf : pred -> type

true : pred
/\ : pred -> pred -> pred
=> : pred -> pred -> pred
all : (exp -> pred) -> pred

truei : pf true
andi : {P:pred} {R:pred} pf P -> pf R -> pf (/\ P R)
andel : {P:pred} {R:pred} pf (/\ P R) -> pf P
impi : {P:pred} {R:pred} (pf P -> pf R) -> pf (=> P R)
alli : {P:exp -> pred} ({X:exp} pf (P X)) -> pf (all P)
alle : {P:exp -> pred} {E:exp} pf (all P) -> pf (P E)

The same example, but using Pfenning’s Elf syntax.

Page 34: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

LF as a Proof Representation

LF is canonical, in that a single typechecker for LF can serve as a proofchecker for many different logics specified in LF. [See Avron, et al. ‘92]

But the efficiency of the representation is poor.

Page 35: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Size of LF Representation

Proofs in LF are extremely large, due to large amounts of repetition.

Consider the representation of P => (P /\ P) for some predicate P:

(=> P (/\ P P))

The proof of this predicate has the following LF representation:

(impi P (/\ P P) ([X:pf P] andi P P X X))

Page 36: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Checking LF

The nice thing is that typechecking is enough for proofchecking. [The theorem is in the LF paper.]

But the proofs are extremely large.

(impi P (/\ P P) ([X:pf P] andi P P X X)) : pf (=> P (/\ P P))

Page 37: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Implicit LF

A dramatic improvement can be achieved by using a variant of LF, called Implicit LF, or LFi.

In LFi, parts of the proof can be replaced by placeholders.

(impi * * ([X:*] andi * * X X)) : pf (=> P (/\ P P))

Page 38: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Soundness of LFi

The soundness of the LFi type system is given by a theorem that states:

If, in context Γ, a term M has type A in LFi (and Γ and A are placeholder-free), then there is a term M’ such that M’ has type A in LF.

Page 39: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Typechecking LFi

The typechecking algorithm for LFi is given in [Necula & Lee, LICS98].

A key aspect of the algorithm is that it avoids repeated typechecking of reconstructed terms.

Hence, the placeholders save not only space, but also time.

Page 40: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Effectiveness of LFi

In experiments with PCC, LFi leads to substantial reductions in proof size and checking time.

Improvements increase nonlinearly with proof size.

Experiment   Proof size (bytes)      Checking time (ms)
             LF           LFi        LF       LFi
unpack       >10 x 10^6   23728      8256     42
simplex      >2 x 10^6    23888      1656     42
sharpen      183444       4816       136      7
qsort        92412        3098       74       6
kmp          77246        2092       60       3
bcopy        12466        796        11       1
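The reduction factors implied by these measurements can be computed directly. The following Python sketch uses only the four rows with exact LF sizes (the unpack and simplex rows give lower bounds only); the variable names are illustrative.

```python
# (experiment, LF bytes, LFi bytes, LF ms, LFi ms), copied from the table.
rows = [
    ("sharpen", 183444, 4816, 136, 7),
    ("qsort", 92412, 3098, 74, 6),
    ("kmp", 77246, 2092, 60, 3),
    ("bcopy", 12466, 796, 11, 1),
]

# Map each experiment to (size reduction factor, time reduction factor).
ratios = {
    name: (lf_bytes / lfi_bytes, lf_ms / lfi_ms)
    for name, lf_bytes, lfi_bytes, lf_ms, lfi_ms in rows
}

for name, (size_ratio, time_ratio) in ratios.items():
    print(f"{name:8s} size x{size_ratio:5.1f}  time x{time_ratio:5.1f}")
```

Among these four rows the largest proof (sharpen) shrinks by the largest factor and the smallest (bcopy) by the smallest, although the factors in between do not increase strictly with proof size.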

Page 41: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

The Need for Improvement

Despite the great improvement of LFi, in our experiments we observe that LFi proofs are 10%-200% the size of the code.

Page 42: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

How Big is a Proof?

A basic question is: how much essential information is in a proof?

In this proof,

(impi * * ([X:*] andi * * X X)) : pf (=> P (/\ P P))

there are only 2 uses of rules, and in each case the rule used was the only one that could have been used.

Page 43: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Improving the Representation

We will now improve on the compactness of proof representation by making use of the observation that large parts of proofs are deterministically generated from the inference rules.

Page 44: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Additional References

For LF:

Harper, Honsell, & Plotkin. A framework for defining logics. Journal of the ACM, 40(1), 143-184, Jan. 1993.

Avron, Honsell, Mason, & Pollack. Using typed lambda calculus to implement formal systems on a machine. Journal of Automated Reasoning, 9(3), 309-354, 1992.

Page 45: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Additional References

For Elf:

Pfenning. Logic programming in the LF logical framework. Logical Frameworks, Huet & Plotkin (Eds.), 149-181, Cambridge Univ. Press, 1991.

Pfenning. Elf: A meta-language for deductive systems (system description). 12th International Conference on Automated Deduction, LNAI 814, 811-815, 1994.

Page 46: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Oracle-Based Checking

Page 47: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Necula’s Example
Syntax of Girard’s System F

ty : type
int : ty
arr : ty -> ty -> ty
all : (ty -> ty) -> ty

exp : type
z : exp
s : exp -> exp
lam : (exp -> exp) -> exp
app : exp -> exp -> exp

of : exp -> ty -> type

Page 48: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Necula’s Example
Typing Rules for System F

tz : of z int

ts : {E:exp} of E int -> of (s E) int

tlam : {E:exp->exp} {T1:ty} {T2:ty} ({X:exp} of X T1 -> of (E X) T2) -> of (lam E) (arr T1 T2)

tapp : {E1:exp} {E2:exp} {T:ty} {T2:ty} of E1 (arr T2 T) -> of E2 T2 -> of (app E1 E2) T

tgen : {E:exp} {T:ty->ty} ({T1:ty} of E (T T1)) -> of E (all T)

tins : {E:exp} {T:ty->ty} {T1:ty} of E (all T) -> of E (T T1)

Page 49: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

LF Representation

Consider the lambda expression

(λf. (f (λx.x)) (f 0)) (λy.y)

It is represented in LF as follows:

app (lam [F:exp] app (app F (lam [X:exp] X)) (app F 0)) (lam [Y:exp] Y)

Page 50: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Necula’s Example

Now suppose that this term is an applet, with the safety policy that all applets must be well-typed in System F.

One way to make a PCC is to attach a typing derivation to the term.

Page 51: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Typing Derivation in LF

(tapp (lam [F:exp] (app (app F (lam [X:exp] X)) (app F 0))) (lam ([X:exp] X)) (all ([T:ty] arr T T)) int (tlam (all ([T:ty] arr T T)) int ([F:exp] (app (app F (lam [X:exp] X)) (app F 0))) ([F:exp][FT:of F (all ([T:ty] arr T T))] (tapp (app F (lam [X:exp] X)) (app F 0) int int (tapp F (lam [X:exp] X) (arr int int) (arr int int) (tins F ([T:ty] arr T T) (arr int int) FT) (tlam int int ([X:exp] X) ([X:exp][XT:of X int] XT))) (tapp F 0 int int (tins F ([T:ty] arr T T) int FT) t0)))) (tgen (lam [Y:exp] Y) ([T:ty] arr T T) ([T:ty] (tlam T T ([Y:exp] Y) ([Y:exp] [YT:of Y T] YT)))))

Page 52: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Typing Derivation in LFi

(tapp * * (all ([T:*] arr T T)) int (tlam * * * ([F:*][FT:of F (all ([T:ty] arr T T))] (tapp * * int (tapp * * (arr int int) (arr int int) (tins * * * FT) (tlam * * * ([X:*][XT:*] XT))) (tapp * * int int (tins * * * FT) t0)))) (tgen * * ([T:*] (tlam * * * ([Y:*] [YT:*] YT)))))

I think. I did this by hand!

Page 53: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

LF Representation

Using 16 bits per token, the LF representation of the typing derivation requires over 2,200 bits.

The LFi representation requires about 700 bits.

(The term itself requires only about 360 bits.)


Page 54: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

A Bit More about LFi

To convert an LF term into an LFi term, a representation algorithm is used. [Necula&Lee, LICS98]

Intuition: When typechecking a term c M1 M2 … Mn : A (in a context Γ)

we know, if A has no placeholders, that some of the M1…Mn may appear in A.

Page 55: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

A Bit More about LFi, cont’d

For example, when the rule

tapp : {E1:exp} {E2:exp} {T:ty} {T2:ty} of E1 (arr T2 T) -> of E2 T2 -> of (app E1 E2) T

is applied at top level, the first two arguments are present in the term

app (lam [F:exp] app (app F (lam [X:exp] X)) (app F 0)) (lam [Y:exp] Y)

and thus can be elided.

Page 56: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

A Bit More about LFi, cont’d

A similar trick works at lower levels by relying on the fact that typing constraints are solved in a certain order (e.g., right-to-left).

See the paper for complete details.

Page 57: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Can We Do Better?

tz : of z int

ts : {E:exp} of E int -> of (s E) int

tlam : {E:exp->exp} {T1:ty} {T2:ty} ({X:exp} of X T1 -> of (E X) T2) -> of (lam E) (arr T1 T2)

tapp : {E1:exp} {E2:exp} {T:ty} {T2:ty} of E1 (arr T2 T) -> of E2 T2 -> of (app E1 E2) T

tgen : {E:exp} {T:ty->ty} ({T1:ty} of E (T T1)) -> of E (all T)

tins : {E:exp} {T:ty->ty} {T1:ty} of E (all T) -> of E (T T1)

Page 58: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Determinism

Looking carefully at the typing rules, we observe:

For any typing goal where the term is known but the type is not:

3 possibilities: tgen, tins, other.

If type structure is also known, only 2 choices, tapp or other.

Page 59: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

How Much Essential Information?

(tapp (lam [F:exp] (app (app F (lam [X:exp] X)) (app F 0))) (lam ([X:exp] X)) (all ([T:ty] arr T T)) int (tlam (all ([T:ty] arr T T)) int ([F:exp] (app (app F (lam [X:exp] X)) (app F 0))) ([F:exp][FT:of F (all ([T:ty] arr T T))] (tapp (app F (lam [X:exp] X)) (app F 0) int int (tapp F (lam [X:exp] X) (arr int int) (arr int int) (tins F ([T:ty] arr T T) (arr int int) FT) (tlam int int ([X:exp] X) ([X:exp][XT:of X int] XT))) (tapp F 0 int int (tins F ([T:ty] arr T T) int FT) t0)))) (tgen (lam [Y:exp] Y) ([T:ty] arr T T) ([T:ty] (tlam T T ([Y:exp] Y) ([Y:exp] [YT:of Y T] YT)))))

Page 60: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

How Much Essential Information?

There are 15 applications of rules in this derivation.

So, conservatively: ceil(log2 3) x 15 = 2 x 15 = 30 bits

In other words, 30 bits should be enough to encode the choices made by a type inference engine for this term.
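The arithmetic behind this estimate can be checked with a few lines of Python. The variable names are illustrative; the figures 3 and 15 come from the Determinism slide and the count of rule applications above.

```python
import math

choices_per_step = 3      # at most 3 candidate rules at any typing goal
rule_applications = 15    # rule applications in the derivation

# Fixed-width encoding: a whole number of bits per choice.
bits_per_step = math.ceil(math.log2(choices_per_step))
oracle_bits = bits_per_step * rule_applications

# Information-theoretic lower bound if every step really had 3 options.
lower_bound_bits = rule_applications * math.log2(choices_per_step)
```

Here `oracle_bits` is 30, while `lower_bound_bits` is a little under 24, which is why the 30-bit figure is described as conservative.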

Page 61: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Oracle-based Checking

Idea: Implement the proofchecker as a nondeterministic logic interpreter whose

program consists of the derivation rules, and

initial goal is the judgment to be verified.

We will avoid backtracking by relying on the oracle string.


Page 62: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Why Higher-Order?

The syntax of VCs for the Java type-safety policy is as follows:

E ::= x | c E1 … En

F ::= true | F1 /\ F2 | forall x.F | E | E => F

The LF encodings are simple Horn clauses (requiring only first-order unification). Higher-order features are needed only for implication and universal quantification.

Page 63: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Why Higher-Order?

Perhaps first-order Horn logic (or perhaps first-order hereditary Harrop formulas) is enough.

Indeed, first-order expressions and formulas seem to be enough for the VCs in type-safety policies.

However, higher-order and modal logics would require higher-order features.

Page 64: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

A Simplification
A Fragment of LF

Level-0 types. A ::= a | A1 -> A2

Level-1 types (β-normal form). B ::= a M1 … Mn | B1 -> B2 | Πx:A.B

Level-0 kinds. K ::= Type | A -> K

Level-0 terms (β-normal form). M ::= λx:A.M | c M1 … Mn | x M1 … Mn

Page 65: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

LF Fragment

This fragment simplifies matters considerably, without restricting the application to PCC.

Level-0 types to encode syntax. Level-1 types to encode derivations.

No level-1 terms since we never reconstruct a derivation, only verify that one exists!

Page 66: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

LF Fragment, cont’d

ty : type
exp : type
Level-0 types.

of : exp -> ty -> type
Level-1 type family.

Disallowing level-2 and higher type families seems not to have any practical impact.

Page 67: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Logic Interpreter
Goals

G ::= B | M = M’ | x:B.G | x:A.G

| T | G1 G2

.

For Necula’s example, the interpreter will be started with the goal

t:ty. of E t

Page 68: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Naïve Interpreter

solve(B1 B2) = x:B1. solve(B2)

solve(x:A.B) = x:A. solve(B)

solve(a M1 … Mn) = subgoals(B, a M1 … Mn)
where B is the type of a level-1 constant or a level-1 quantified variable (in scope), as selected by the oracle.

subgoals(B1 B2, B) = x:B1. solve(B2)

subgoals(x:A.B’, B) = x:A. solve(B)

subgoals(a M1’ … Mn’, a M1 … Mn) = M1 = M1’ … Mn = Mn’

Page 69: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Necula’s Example

Consider solve(of E t)

This consults the oracle.

Since there are 3 level-1 constants that could be used at this point, 2 bits are fetched from the oracle string (to select tapp).
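A minimal Python sketch of how an interpreter might consume the oracle string: at each choice point with k candidates it reads ceil(log2 k) bits and uses them as an index into the candidate list. The class `Oracle`, its method `choose`, and the ordering of the candidates are assumptions made for illustration; the slides do not say which bit pattern selects tapp.

```python
import math


class Oracle:
    """Reads fixed-width choices from a string of '0' and '1' characters."""

    def __init__(self, bits):
        self.bits = bits
        self.position = 0

    def choose(self, candidates):
        """Pick one candidate, consuming ceil(log2(len(candidates))) bits."""
        if not candidates:
            raise ValueError("no applicable rule")
        if len(candidates) == 1:
            return candidates[0]          # deterministic step: no bits needed
        width = math.ceil(math.log2(len(candidates)))
        chunk = self.bits[self.position:self.position + width]
        if len(chunk) < width:
            raise ValueError("oracle string exhausted")
        self.position += width
        index = int(chunk, 2)
        if index >= len(candidates):
            raise ValueError("oracle selects a nonexistent candidate")
        return candidates[index]


# Hypothetical ordering of the three rules that could apply at this goal.
candidates = ["tgen", "tins", "tapp"]
oracle = Oracle("1001")
first = oracle.choose(candidates)     # consumes "10", index 2
second = oracle.choose(candidates)    # consumes "01", index 1
```

Steps with a single applicable rule consume no bits at all, so the oracle string only pays for the genuinely nondeterministic choices.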

Page 70: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Higher-Order Unification

The unification goals that remain after solve are higher-order and thus only semi-decidable.

A nondeterministic unification procedure (also driven by the oracle string) is used.

Some standard LP optimizations are also used.

Page 71: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Certifying Theorem Proving

Page 72: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Certifying Theorem Proving

Time does not allow a description here. See:

Necula and Lee. Proof generation in the Touchstone theorem prover. CADE’00.

Of particular interest: Proof-generating congruence-closure and simplex algorithms.

Page 73: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Certifying Compilation


Page 74: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

The Basic Trick

Recall the bcopy program:

public class Bcopy {
    public static void bcopy(int[] src, int[] dst) {
        int l = src.length;
        int i = 0;

        for (i = 0; i < l; i++) {
            dst[i] = src[i];
        }
    }
}

Page 75: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Unoptimized Loop Body

L11:
    movl 4(%ebx), %eax
    cmpl %eax, %edx
    jae L24
L17:
    cmpl $0, 12(%ebp)
    movl 8(%ebx, %edx, 4), %esi
    je L21
L20:
    movl 12(%ebp), %edi
    movl 4(%edi), %eax
    cmpl %eax, %edx
    jae L24
L23:
    movl %esi, 8(%edi, %edx, 4)
    movl %edi, 12(%ebp)
    incl %edx
L9:
    ANN_INV(ANN_DOM_LOOP,
        %LF_(/\ (of rm mem ) (of loc1 (jarray jint) ))%_LF,
        RB(EBP,EBX,ECX,ESP,FTOP,LOC4,LOC3))
    cmpl %ecx, %edx
    jl L11

Bounds check on src (at L11).

Bounds check on dst (at L20).

Note: L24 raises the ArrayIndex exception.

Page 76: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Unoptimized Code is Easy

In the absence of optimizations, proving the safety of array accesses is relatively easy. Indeed, in this case it is reasonable for VCGen to verify the safety of the array accesses. As the optimizer becomes more successful, verification gets harder.

Page 77: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Role of Loop Invariants

It is for this reason that the optimizer’s knowledge must be conveyed to the theorem prover.

Essentially, any facts about program values that were used to perform optimizations (such as code motion) must be declared in an invariant.

Page 78: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Optimized Loop Body

L7:
    ANN_LOOP(INV = {
            (csubneq ebx 0),
            (csubneq eax 0),
            (csubb edx ecx),
            (of rm mem)},
        MODREG = (EDI,EDX,EFLAGS,FFLAGS,RM))
    cmpl %esi, %edx
    jae L13
    movl 8(%ebx, %edx, 4), %edi
    movl %edi, 8(%eax, %edx, 4)
    incl %edx
    cmpl %ecx, %edx

The invariant states essential facts about live variables, used by the compiler to eliminate bounds-checks in the loop body.

Page 79: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Certifying Compiling and Proving

Intuitively, we will arrange for the Prover to be at least as powerful as the Compiler’s optimizer. Hence, we will expect the Prover to be able to “reverse engineer” the reasoning process that led to the given machine code. An informal concept, needing a formal understanding!

Page 80: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

What is Safety, Anyway?

If the compiler fails to optimize away a bounds-check, it will insert code to perform the check.

This means that programs may still abort at run-time, albeit with a well-defined exception.

Is this safe behavior?

Page 81: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University
Page 82: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Resource Constraints

Bounds on certain resources can be enforced via counting.

In a Reference Interpreter:
Maintain a global counter.
Increment the count for each instruction executed.
Verify for each instruction that the limit is not exceeded.
Use the compiler to optimize away the counting operations.
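The first three steps can be sketched as a toy reference interpreter in Python. The three-instruction stack machine, the function `run`, and the exception `ResourceLimitExceeded` are hypothetical and exist only for this illustration; the final step (having the compiler optimize the counting away) is not modeled.

```python
class ResourceLimitExceeded(Exception):
    """Raised when a program executes more instructions than allowed."""


def run(program, limit):
    """Execute a list of (opcode, operand) pairs on a small stack machine.

    A global counter is incremented for every instruction executed and
    checked against `limit` before the instruction runs.
    """
    stack = []
    count = 0      # global instruction counter
    pc = 0         # program counter
    while pc < len(program):
        count += 1
        if count > limit:
            raise ResourceLimitExceeded(f"more than {limit} instructions")
        op, arg = program[pc]
        pc += 1
        if op == "push":
            stack.append(arg)
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "jump_if_nonzero":
            if stack.pop() != 0:
                pc = arg
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack, count


# A straight-line program stays within its budget.
stack, count = run([("push", 2), ("push", 3), ("add", None)], limit=10)
```

A program that loops forever, such as `[("push", 1), ("jump_if_nonzero", 0)]`, is stopped by the counter instead of running unboundedly.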

Page 83: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Compiler as a Theorem-Proving Front-End

The compiler is essentially a user-interface to a theorem prover.

The possibilities for interactive use of compilers have been described in the literature but not developed very far.

PCC may extend the possibilities.

Page 84: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Developing PCC Tools

Page 85: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Compiler Development

The PCC infrastructure catches many (probably most) compiler bugs early.

Our standard regression test does not execute the object code!

Principle: Most compiler bugs show up as safety violations.

Page 86: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Example Bug

…
L42: movl 4(%eax), %edx
     testl %edx, %edx
     jle L47
L46: … set up for loop …
L44: … enter main loop code …
     …
     jl L44
     jmp L32
L47: fldz
     fldz
L32: … return sequence …
     ret

Page 87: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Example Bug

…
L42: movl 4(%eax), %edx
     testl %edx, %edx
     jle L47
L46: … set up for loop …
L44: … enter main loop code …
     …
     jl L44
     jmp L32
L47: fldz
L32: … return sequence …
     ret

Error in rarely executed compensation code is caught by the Proof Generator.

Page 88: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Another Example Bug

Suppose bcopy’s inner loop is changed:

L7: ANN_LOOP( … )
    cmpl %esi, %edx
    jae L13
    movl 8(%ebx, %edx, 4), %edi
    movl %edi, 8(%eax, %edx, 4)
    incl %edx
    cmpl %ecx, %edx
    jl L7
    ret

Page 89: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Another Example Bug

Suppose bcopy’s inner loop is changed:

L7: ANN_LOOP( … )
    cmpl %esi, %edx
    jae L13
    movl 8(%ebx, %edx, 4), %edi
    movl %edi, 8(%eax, %edx, 4)
    addl 2, %edx
    cmpl %ecx, %edx
    jl L7
    ret

Again, PCC spots the danger.

Page 90: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Yet Another

class Floatexc extends Exception {

    public static int f(int x) throws Floatexc { return x; }
    public static int g(int x) { return x; }

    public static float handleit(int x, int y) {
        float fl = 0;
        try {
            x = f(x);
            fl = 1;
            y = f(y);
        } catch (Floatexc b) {
            fl += fl;
        }
        return fl;
    }
}

Page 91: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Yet Another

… Install handler …
    pushl $_6except8Floatexc_C
    call __Jv_InitClass
    addl $4, %esp
… Enter try block …
L17:
    movl $0, -4(%ebp)
    pushl 8(%ebp)
    call _6except8Floatexc_MfI
    addl $4, %esp
    movl %eax, %ecx
    …
… A handler …
L22:
    flds -4(%ebp)
    fadds -4(%ebp)
    jmp L18

… To the end

Page 92: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Why PCC May be a Reasonable Idea

Page 93: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Ten Good Things About PCC

1. Someone else does all the really hard work.

2. The host system changes very little.

...

Page 94: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Logic as a lingua franca

Certifying Prover

CPU

Code

Proof

Proof Engine

Page 95: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Logic as a lingua franca

Certifying Prover

CPU

Proof

Proof Checker

Policy

VC

Code

Language/compiler/machine dependences isolated from the proof checker.

Expressed as predicates and derivations in a formal logic.

Page 96: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Logic as a lingua franca

Certifying Prover

CPU

…
iadd
iaload
...

Proof

Proof Checker

Policy

VC

Code can be in any language once a Safety Policy is supplied.

Page 97: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Logic as a lingua franca

Certifying Prover

CPU

…
addl %eax,%ebx
testl %ecx,%ecx
jz NULLPTR
movl 4(%ecx),%edx
cmpl %edx,%ebx
jae ARRAYBNDS
movl 8(%ecx,%ebx,4),%edx
...

Proof

Proof Checker

Policy

VC

Adequacy of dynamic checks and “wrappers” can be verified.

Page 98: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Logic as a lingua franca

Certifying Prover

CPU

…
add %eax,%ebx
movl 8(%ecx,%ebx,4)
...

Proof

Proof Checker

Policy

VC

Safety of optimized code can be verified.

Page 99: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Ten Good Things About PCC

3. You choose the language.

4. Optimized (“unsafe”) code is OK.

5. Verifies that your optimizer and dynamic checks are OK.

Page 100: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

The Role of Programming Languages

Civilized programming languages can provide “safety for free”.

Well-formed/well-typed => safe.

Idea: Arrange for the compiler to “explain” why the target code it generates preserves the safety properties of the source program.

Page 101: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Certifying Compilers [Necula & Lee, PLDI’98]

Intuition:
Compiler “knows” why each translation step is semantics-preserving.
So, have it generate a proof that safety is preserved.
“Small theorems about big programs.”

Don’t try to verify the whole compiler, but only each output it generates.

Page 102: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Automation via Certifying Compilation

Certifying Compiler

CPU

Proof Checker

Policy

VC

Source code

Proof

Object code

Looks and smells like a compiler.

% spjc foo.java bar.class baz.c -ljdk1.2.2

Page 103: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Ten Good Things About PCC

6. Can sometimes be easy-to-use.

7. You can still be a “hero theorem hacker” if you want.

...

Page 104: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Ten Good Things About PCC

8. Proofs are a “semantic checksum”.

9. Possibility for richer safety policies.

10. Co-exists peacefully with crypto.

Page 105: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Acknowledgments

George Necula.

Robert Harper and Frank Pfenning.

Mark Plesko, Fred Blau, and John Gregorski.

Page 106: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Microsoft’s ActiveF Technology

WHAT DO YOU WANT TO PROVE TODAY?

