Page 1: Mobility, Security, and Proof-Carrying Code  Peter Lee Carnegie Mellon University

Mobility, Security, and Proof-Carrying Code

Peter Lee
Carnegie Mellon University

Lecture 3

July 12, 2001

VC Generation and Proof Representation

Lipari School on Foundations of Wide Area Network Programming

Whew!

Recap

When the host system receives certified code, it

inspects the code, generating verification conditions (VCs), and

finds a proof for each VC (if it can).

[Abstractly, one thinks of generating a single predicate, which is the conjunction of all the VCs.]

Generation of VCs is done relative to a safety policy.

High-Level Architecture

[Figure: architecture diagram with the components Agent, Host, Code, Explanation, Verification condition generator, Checker, and Safety policy.]

What Is a “Safety Policy”?

Yesterday, we gave the intuition of a reference interpreter that aborts the program just prior to any unsafe operation.

In this case, the reference interpreter essentially defines the safety policy.

Safety Policies

More formally, we begin by defining the small-step operational semantics of a machine, call it the s86.

$(\Pi, \rho, pc) \rightarrow_{instr} (\rho', pc')$

where $\Pi$ is the program, $\rho$ the register state, and $pc$ the program counter.

We define the machine so that only safe executions are defined.
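One way to picture "only safe executions are defined" is a step function that is partial: it has no result when the next instruction would be unsafe. The following is a minimal Python sketch under stated assumptions. The three-instruction machine (`mov`, `load`, `halt`), the `readable` address set, and the `fuel` bound are all hypothetical and are not the s86 of the lecture.

```python
from typing import Optional


def step(program, regs, pc, memory, readable) -> Optional[tuple]:
    """Return the next (regs, pc), or None when no transition is defined."""
    if pc < 0 or pc >= len(program):
        return None
    op, *args = program[pc]
    if op == "mov":                      # mov dst, constant
        dst, value = args
        return {**regs, dst: value}, pc + 1
    if op == "load":                     # load dst, [addr_reg]
        dst, addr_reg = args
        addr = regs[addr_reg]
        if addr not in readable:         # unsafe read: the machine is stuck
            return None
        return {**regs, dst: memory[addr]}, pc + 1
    return None                          # "halt" and unknown opcodes have no successor


def run(program, regs, memory, readable, fuel=100):
    """Run until halt ("returned"), a stuck state ("stuck"), or the fuel runs out."""
    pc = 0
    for _ in range(fuel):
        if pc < len(program) and program[pc][0] == "halt":
            return "returned", regs
        nxt = step(program, regs, pc, memory, readable)
        if nxt is None:
            return "stuck", regs
        regs, pc = nxt
    return "out of fuel", regs
```

In this toy setting, a progress proof for a program would show that `run` never reports "stuck" for any initial state satisfying the precondition.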

Safety Policies, cont’d

For convenience we choose the s86 to be a restriction of the x86.

Hence all s86 programs will execute faithfully on a real x86.

The goal then is to prove that any given program always makes progress (or returns) in the s86.

With such a proof, the x86 is then just as good as an s86.

Verification Conditions

The point of the verification conditions, then, is to provide such progress theorems for each instruction in the program.

In other words, a VC’s validity says that the corresponding instruction has a defined execution in the s86 operational semantics.

Symbolic Evaluator

We can define the verification condition generator (VCGen) via a symbolic evaluator

$SE_{\Pi,\Sigma,\rho_0,Post}(i, \rho, L)$

The result of symbolic evaluation is a conjunction of VCs, so the overall progress theorem is then

$Pre \Rightarrow SE_{\Pi,\Sigma,\rho_0,Post}(i, \rho, L)$

[Figure labels attached to the formula: LF signature, postcondition, entry point, annotations.]

Soundness

For particular operational semantics (a safe x86 and a safe Alpha), we have presented theorems that say, essentially:

Thm: If $Pre \Rightarrow SE_{\Pi,\Sigma,\rho_0,Post}(i, \rho, L)$, then execution of $\Pi$, given Pre and $\rho_0$, and starting from entry point i, will always make progress (or return).

Getting from Concept to Implementation

In an actual implementation, it is also handy to have a bit more than just a VC generator.

Precise syntax for VCs.

Pre/post-conditions for each entry point expected by the host in any downloaded code.

Precisely specified logical system for proving the VCs.

Safety Policy Implementations

Safety policies are thus given in three parts:

A verification-condition generator (VCGen).

A specification of the pre & post conditions for all required procedures.

A specification of the inference rules for constructing valid proofs.

LF is used for the rule and pre/post specifications, C for the VCGen.

C?!@$#@!

The use of C to define and implement the VCGen is, at best, expedient and at worst dubious.

However, since any code-inspection system must parse object files (not trivial!) and understand the instruction set, this seems to have practical benefits.

Clearly, a more formal approach would be desirable.

Example: Java Type-Safety Specification

Our largest example of a safety-policy specification is for the “SpecialJ” Java native-code compiler.

It contains about 140 inference rules.

Roughly speaking, these rules can be separated into 5 classes.

Safety Policy Rule Excerpts

1. Standard syntax and rules for first-order logic.

Syntax of predicates:

    /\ : pred -> pred -> pred.
    \/ : pred -> pred -> pred.
    => : pred -> pred -> pred.
    all : (exp -> pred) -> pred.

Type of valid proofs, indexed by predicate:

    pf : pred -> type.

Inference rules:

    truei : pf true.
    andi : {P:pred} {Q:pred} pf P -> pf Q -> pf (/\ P Q).
    andel : {P:pred} {Q:pred} pf (/\ P Q) -> pf P.
    ander : {P:pred} {Q:pred} pf (/\ P Q) -> pf Q.
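Because `pf` is indexed by the predicate it proves, checking a proof amounts to computing the predicate of a proof term and comparing it with the goal. The sketch below illustrates that idea in Python for the conjunction rules only. The tuple encoding, the `hyp` constructor for named assumptions, and the function names are hypothetical; unlike the LF rules, the sketch infers `P` and `Q` instead of taking them as explicit arguments.

```python
# Predicates are "true", an atom name such as "p", or ("and", P, Q).
# Proof terms are tuples whose first element is a rule name from the excerpt.

def infer(proof, hyps):
    """Return the predicate that `proof` proves, or raise ValueError."""
    tag = proof[0]
    if tag == "truei":
        return "true"
    if tag == "hyp":                       # use a named assumption
        return hyps[proof[1]]
    if tag == "andi":                      # from pf P and pf Q build pf (/\ P Q)
        return ("and", infer(proof[1], hyps), infer(proof[2], hyps))
    if tag in ("andel", "ander"):          # project one side of a conjunction
        pred = infer(proof[1], hyps)
        if not (isinstance(pred, tuple) and pred[0] == "and"):
            raise ValueError("expected a conjunction")
        return pred[1] if tag == "andel" else pred[2]
    raise ValueError(f"unknown rule {tag!r}")


def check(proof, goal, hyps=None):
    """A proof is accepted for `goal` only if its inferred predicate equals `goal`."""
    try:
        return infer(proof, hyps or {}) == goal
    except (ValueError, KeyError, IndexError):
        return False
```

For example, with the assumption `h : (/\ p q)`, the term `("andel", ("hyp", "h"))` is accepted as a proof of `p` and rejected as a proof of `q`.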

Safety Policy Rule Excerpts

2. Syntax and rules for arithmetic and equality.

    = : exp -> exp -> pred.
    <> : exp -> exp -> pred.

    eq_le : {E:exp} {E':exp} pf (csubeq E E') -> pf (csuble E E').

    moddist+: {E:exp} {E':exp} {D:exp} pf (= (mod (+ E E') D) (mod (+ (mod E D) E') D)).

    =sym : {E:exp} {E':exp} pf (= E E') -> pf (= E' E).
    <>sym : {E:exp} {E':exp} pf (<> E E') -> pf (<> E' E).

    =tr : {E:exp} {E':exp} {E'':exp} pf (= E E') -> pf (= E' E'') -> pf (= E E'').

“csuble” means ≤ in the x86 machine.

Safety Policy Rule Excerpts

3. Syntax and rules for the Java type system.

    jint : exp.
    jfloat : exp.
    jarray : exp -> exp.
    jinstof : exp -> exp.

    of : exp -> exp -> pred.

    faddf : {E:exp} {E':exp} pf (of E jfloat) -> pf (of E' jfloat) -> pf (of (fadd E E') jfloat).

    ext : {E:exp} {C:exp} {D:exp} pf (jextends C D) -> pf (of E (jinstof C)) -> pf (of E (jinstof D)).

Safety Policy Sample Rules

4. Rules describing the layout of data structures.

    aidxi : {I:exp} {LEN:exp} {SIZE:exp}
            pf (below I LEN) ->
            pf (arridx (add (imul I SIZE) 8) SIZE LEN).

    wrArray4: {M:exp} {A:exp} {T:exp} {OFF:exp} {E:exp}
              pf (of A (jarray T)) ->
              pf (of M mem) ->
              pf (nonnull A) ->
              pf (size T 4) ->
              pf (arridx OFF 4 (sel4 M (add A 4))) ->
              pf (of E T) ->
              pf (safewr4 (add A OFF) E).

This “sel4” means the result of reading 4 bytes from heap M at address A+4.

Safety Policy Sample Rules

5. Quick hacks.

    nlt0_0 : pf (csubnlt 0 0).
    nlt1_0 : pf (csubnlt 1 0).
    nlt2_0 : pf (csubnlt 2 0).
    nlt3_0 : pf (csubnlt 3 0).
    nlt4_0 : pf (csubnlt 4 0).

Sometimes “unclean” things are put into the specification...

How Do We Know That It’s Right?

Homework Exercise

4. Some of the proof rules are specific to the type system of the source language (Java), even though we are actually verifying x86 machine code.

Why has this been done?

A Note about Memory

We define a type for valid heap memory states:

mem : exp

and operators for reading and writing heap memory:

(sel M A)

(upd M A E)
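Treating memory as a value with `sel` and `upd` operators lets the VCGen reason about reads symbolically. The Python sketch below builds such terms and simplifies a read over a write using the standard read-over-write laws (`sel (upd M A E) A = E`, and a write to a provably different address can be skipped). The lecture does not list these laws on this slide, and the tuple encoding is hypothetical; addresses are only treated as different when both are integer constants.

```python
def upd(mem, addr, val):
    """Build the term (upd mem addr val); memory states are never mutated."""
    return ("upd", mem, addr, val)


def sel(mem, addr):
    """Build (sel mem addr), simplifying with the read-over-write laws when possible."""
    while isinstance(mem, tuple) and mem[0] == "upd":
        _, inner, waddr, wval = mem
        if waddr == addr:                       # sel (upd M A E) A  =  E
            return wval
        if isinstance(waddr, int) and isinstance(addr, int):
            mem = inner                         # distinct constants: skip this write
        else:
            break                               # equality cannot be decided syntactically
    return ("sel", mem, addr)
```

When the two addresses are distinct symbolic expressions, the sketch leaves the term unsimplified; deciding whether they alias is then a job for the proof rules.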

The VCGen, via Detailed Examples

High-Level Architecture

[Figure: architecture diagram with the components Agent, Host, Code, Explanation, Verification condition generator, Checker, and Safety policy.]

Example: Source Code

    public class Bcopy {
        public static void bcopy(int[] src, int[] dst) {
            int l = src.length;
            int i = 0;

            for (i = 0; i < l; i++) {
                dst[i] = src[i];
            }
        }
    }

Example: Target Code

    ANN_LOCALS(_bcopy__6arrays5BcopyAIAI, 3)
    .text
    .align 4
    .globl _bcopy__6arrays5BcopyAIAI
    _bcopy__6arrays5BcopyAIAI:
        cmpl $0, 4(%esp)
        je L6
        movl 4(%esp), %ebx
        movl 4(%ebx), %ecx
        testl %ecx, %ecx
        jg L22
        ret
    L22:
        xorl %edx, %edx
        cmpl $0, 8(%esp)
        je L6
        movl 8(%esp), %eax
        movl 4(%eax), %esi
    L7:
        ANN_LOOP(INV = {
            (csubneq ebx 0),
            (csubneq eax 0),
            (csubb edx ecx),
            (of rm mem)},
          MODREG = (EDI,EDX,EFLAGS,FFLAGS,RM))
        cmpl %esi, %edx
        jae L13
        movl 8(%ebx, %edx, 4), %edi
        movl %edi, 8(%eax, %edx, 4)
        incl %edx
        cmpl %ecx, %edx
        jl L7
        ret
    L13:
        call __Jv_ThrowBadArrayIndex
        ANN_UNREACHABLE
        nop
    L6:
        call __Jv_ThrowNullPointer
        ANN_UNREACHABLE
        nop

Cut Points

Each loop entry must be annotated as a cut point.

VCGen requires this so that checking can be performed in a single scan of the code.

As a convenience, the modified registers are also declared in the cut annotations.

VCGen requires annotations in order to simplify the process.

The VCGen Process (1)

    Code                            Symbolic state and assumptions
    _bcopy__6arrays5BcopyAIAI:      A0 = (type src_1 (jarray jint))
                                    A1 = (type dst_1 (jarray jint))
                                    A2 = (type rm_1 mem)
      cmpl $0, src
      je L6                         A3 = (csubneq src_1 0)
      movl src, %ebx                ebx := src_1
      movl 4(%ebx), %ecx            ecx := (sel4 rm_1 (add src_1 4))
      testl %ecx, %ecx
      jg L22                        A4 = (csubgt (sel4 rm_1 (add src_1 4)) 0)
      ret
    L22:
      xorl %edx, %edx               edx := 0
      cmpl $0, dst
      je L6                         A5 = (csubneq dst_1 0)
      movl dst, %eax                eax := dst_1
      movl 4(%eax), %esi            esi := (sel4 rm_1 (add dst_1 4))
    L7: ANN_LOOP(INV = …

The VCGen Process (2)

    Code                            Symbolic state and assumptions
    L7: ANN_LOOP(INV = {
          (csubneq ebx 0),          A3
          (csubneq eax 0),          A5
          (csubb edx ecx),          A6 = (csubb 0 (sel4 rm_1 (add src_1 4)))
          (of rm mem)},
        MODREG = (EDI, EDX,         edi := edi_1
          EFLAGS, FFLAGS, RM))      edx := edx_1
                                    rm := rm_2
      cmpl %esi, %edx
      jae L13                       A7 = (csubb edx_1 (sel4 rm_2 (add dst_1 4)))
      movl 8(%ebx,%edx,4), %edi     !!Verify!! (saferd4 (add src_1 (add (imul edx_1 4) 8)))
      movl %edi, 8(%eax,%edx,4)
      …

The Checker (1)

The checker is asked to verify that

    (saferd4 (add src_1 (add (imul edx_1 4) 8)))

under assumptions

    A0 = (type src_1 (jarray jint))
    A1 = (type dst_1 (jarray jint))
    A2 = (type rm_1 mem)
    A3 = (csubneq src_1 0)
    A4 = (csubgt (sel4 rm_1 (add src_1 4)) 0)
    A5 = (csubneq dst_1 0)
    A6 = (csubb 0 (sel4 rm_1 (add src_1 4)))
    A7 = (csubb edx_1 (sel4 rm_2 (add dst_1 4)))

The checker looks in the PCC for a proof of this VC.

The Checker (2)

In addition to the assumptions, the proof may use axioms and proof rules defined by the host, such as

    szint : pf (size jint 4)

    rdArray4: {M:exp} {A:exp} {T:exp} {OFF:exp}
              pf (type A (jarray T)) ->
              pf (type M mem) ->
              pf (nonnull A) ->
              pf (size T 4) ->
              pf (arridx OFF 4 (sel4 M (add A 4))) ->
              pf (saferd4 (add A OFF)).

Checker (3)

A proof for

(saferd4 (add src_1 (add (imul edx_1 4) 8)))

in the Java specification looks like this (excerpt):

(rdArray4 A0 A2 (sub0chk A3) szint (aidxi 4 (below1 A7)))

This proof can be easily validated via LF type checking.

VCGen Summary

VCGen is a symbolic evaluator for the object language.

It essentially implements a reference interpreter, except:

it uses symbolic values in order to model all possible executions, and

instead of performing run-time checks, it asks a Checker to verify the safety of “dangerous” instructions.
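The summary above can be sketched as a small symbolic evaluator. The Python code below is an illustration under stated assumptions, not the VCGen described in the lecture: it handles only straight-line code, the instruction names (`mov`, `add`, `load`, `assume`) are hypothetical, and `assume` stands in for the fall-through side of a conditional branch. Loops, cut points, and stores are left out.

```python
def vcgen(program, init_regs):
    """Symbolically evaluate straight-line code; return a list of (assumptions, goal) VCs."""
    regs = dict(init_regs)          # register name -> symbolic expression
    mem = "rm_1"                    # symbolic name for the initial memory state
    assumptions = []
    vcs = []

    def val(operand):
        # Integer literals denote themselves; anything else names a register.
        return operand if isinstance(operand, int) else regs[operand]

    for instr in program:
        op = instr[0]
        if op == "mov":
            _, dst, src = instr
            regs[dst] = val(src)
        elif op == "add":
            _, dst, a, b = instr
            regs[dst] = ("add", val(a), val(b))
        elif op == "load":
            # A "dangerous" instruction: emit a VC instead of a run-time check.
            _, dst, addr = instr
            address = val(addr)
            vcs.append((tuple(assumptions), ("saferd4", address)))
            regs[dst] = ("sel4", mem, address)
        elif op == "assume":
            _, rel, a, b = instr
            assumptions.append((rel, val(a), val(b)))
        else:
            raise ValueError(f"unsupported instruction {op!r}")
    return vcs
```

Each returned pair records the assumptions in force when the load was reached, which is what a checker would need in order to validate a proof of the `saferd4` goal.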

Homework Exercises

5. When a loop invariant is encountered for the second time, what actions must the VCGen perform?

6. In principle, how big can a VC get, relative to the size of the program?

7. What kind of program might make a VC get very large?

Another Example [by George Necula]

    void fir (int *data, int dlen, int *filter, int flen) {
        int i, j;

        for (i=0; i<=dlen-flen; i++) {
            int s = 0;

            for (j=0; j<flen; j++)
                s += filter[j] * data[i+j];

            data[i] = s;
        }
    }

Compiled Example

    /* rd=data, rdl=dlen, rf=filter, rfl=flen */

        ri = 0
        sub t1 = rdl, rfl
    L0: CUT(ri,rj,rs,t2,t3,t4,rm)
        le t2 = ri, t1
        jeq t2, L3
        rs = 0
        rj = 0
    L1: CUT(rj,rs,t2,t3,t4)
        lt t2 = rj, rfl
        jeq t2, L2
        ult t2 = rj, rfl
        jeq t2, Labort
        ld t3 = [rf + 4*rj]
        add t2 = ri, rj
        ult t4 = t2, rdl
        jeq t4, Labort
        ld t2 = [rd + 4*t2]
        mul t2 = t3, t2
        add rs = rs, t2
        add rj = rj, 1
        jmp L1
    L2: ult t2 = ri, rdl
        jeq t2, Labort
        st [rd + 4*ri] = rs
        add ri = ri, 1
        jmp L0
    L3: ret
    Labort: call abort

The Safety Policy

The safety policy defines verification conditions of the form:

    true, E = E
    saferd(M, E), safewr(M, E, E)
    array(EA, ES, EL), vector(EA, ES, EL)

    Pre_fir = array(rd,4,rdl), vector(rf,4,rfl)
    Post_fir = true

VCGen Example

    Code                             VCGen action
    (entry)                          Assume precondition: array(cd,4,cdl) vector(cf,4,cfl)
                                     Set rd=cd; rdl=cdl; rf=cf; rfl=cfl; rm=cm
    ri = 0                           Set ri = 0
    sub t1 = rdl, rfl                Set t1 = sub(cdl,cfl)
    L0: CUT(ri,rj,rs,t2,t3,t4,rm)    Set ri=ci; rj=cj; rs=cs; t2=c2; t3=c3; t4=c4; rm=cm’
    le t2 = ri, t1                   Set t2 = le(ci, sub(cdl,cfl))
    jeq t2, L3                       Assume not(le(ci, sub(cdl,cfl)))
    …
    L3: ret                          Check postcondition;
                                     Check rd,rdl,rf,rfl have initial values

VCGen Example

    Code                             VCGen action
    ri = 0                           Set ri = 0
    sub t1 = rdl, rfl                Set t1 = sub(cdl,cfl)
    L0: CUT(ri,rj,rs,t2,t3,t4,rm)    Set ri=ci; rj=cj; rs=cs; t2=c2; t3=c3; t4=c4; rm=cm’
    le t2 = ri, t1                   Set t2 = le(ci, sub(cdl,cfl))
    jeq t2, L3                       Assume le(ci, sub(cdl,cfl))
    rs = 0                           Set rs = 0
    rj = 0                           Set rj = 0
    L1: CUT(rj,rs,t2,t3,t4)          Set rj=cj’; rs=cs’; t2=c2’; t3=c3’; t4=c4’
    lt t2 = rj, rfl                  Set t2 = lt(cj’, cfl)
    jeq t2, L2                       Assume not(lt(cj’, cfl))
    …
    L2: ult t2 = ri, rdl             Set t2 = ult(ci, cdl)
    jeq t2, Labort                   Assume ult(ci, cdl)
    st [rd + 4*ri] = rs              Check safewr(cm’, add(cd,mul(4,ci)),cs’)

More on the Safety Policy

The safety policy is defined as an LF signature.

rdarray : saferd(M,add(A,mul(S,I))) <- array(A,S,L), ult(I,L).

rdvector : saferd(M,add(A,mul(S,I))) <- vector(A,S,L), ult(I,L).

wrarray : safewr(M,add(A,mul(S,I)),V) <- array(A,S,L), ult(I,L).

The Checker

When the Checker is invoked on

    safewr(cm’, add(cd,mul(4,ci)), cs’)

there are assumptions:

    assume0 : ult(ci,cdl).
    assume1 : not(lt(cj’,cfl)).
    assume2 : le(ci, sub(cdl,cfl)).
    assume3 : vector(cf,4,cfl).
    assume4 : array(cd,4,cdl).

The Checker, cont’d

The VC

    safewr(cm’, add(cd,mul(4,ci)), cs’)

can be verified by using the rule

    wrarray : safewr(M,add(A,mul(S,I)),V) <- array(A,S,L), ult(I,L).

and assumptions

    assume0 : ult(ci,cdl).
    assume4 : array(cd,4,cdl).
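Applying `wrarray` to this VC is a matter of matching the rule's conclusion against the goal and then finding an assumption for each premise under the resulting bindings. The Python sketch below illustrates that step. The tuple encoding of terms, the convention that upper-case strings are rule variables, and the function names are hypothetical; the search is greedy (no backtracking), which is enough for this example but not a general proof search.

```python
def match(pattern, term, subst):
    """One-way matching: extend `subst` so that `pattern` equals `term`, else None."""
    if isinstance(pattern, str) and pattern.isupper():       # rule variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(pattern, tuple) and isinstance(term, tuple) and len(pattern) == len(term):
        for p, t in zip(pattern, term):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None


def apply_rule(rule, goal, assumptions):
    """Return the names of the assumptions that discharge the premises, or None."""
    conclusion, premises = rule
    subst = match(conclusion, goal, {})
    if subst is None:
        return None
    used = []
    for premise in premises:
        for name, fact in assumptions.items():
            extended = match(premise, fact, subst)
            if extended is not None:
                subst = extended
                used.append(name)
                break
        else:
            return None                   # no assumption fits this premise
    return used


# The rule, goal, and assumptions from the slides, in the hypothetical encoding.
wrarray = (("safewr", "M", ("add", "A", ("mul", "S", "I")), "V"),
           [("array", "A", "S", "L"), ("ult", "I", "L")])
goal = ("safewr", "cm'", ("add", "cd", ("mul", 4, "ci")), "cs'")
assumptions = {
    "assume0": ("ult", "ci", "cdl"),
    "assume1": ("not", ("lt", "cj'", "cfl")),
    "assume2": ("le", "ci", ("sub", "cdl", "cfl")),
    "assume3": ("vector", "cf", 4, "cfl"),
    "assume4": ("array", "cd", 4, "cdl"),
}
used = apply_rule(wrarray, goal, assumptions)
```

Here `used` lists the assumptions in premise order, so the rule name followed by `used` gives the same flat sequence of proof steps that the next slide calls the naïve proof representation.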

Proof Representation

A simple (but somewhat naïve) representation of the proof is the sequence of proof rules used:

wrarray, assume4, assume0

We shall see that better representations are possible.

LF typechecking is sufficient for proofchecking.

Optimized Code

The previous example was somewhat simplified.

More realistic code is optimized, usually based on inferences about integer values.

Such optimizations require that arithmetic invariants be placed in the cut points.

Optimized Example

    /* rd=data, rdl=dlen, rf=filter, rfl=flen */

        ri = 0
        sub t1 = rdl, rfl
    L0: CUT(ri>0,{ri,rj,…})
        le t2 = ri, t1
        jeq t2, L3
        rs = 0
        rj = 0
    L1: CUT(rj>0,{rj,rs,…})
        lt t2 = rj, rfl
        jeq t2, L2
        ld t3 = [rf + 4*rj]
        add t2 = ri, rj
        ld t2 = [rd + 4*t2]
        mul t2 = t3, t2
        add rs = rs, t2
        add rj = rj, 1
        jmp L1
    L2: st [rd + 4*ri] = rs
        add ri = ri, 1
        jmp L0
    L3: ret

VCGen Example

    Code                                    VCGen action
    ri = 0                                  Set ri = 0
    sub t1 = rdl, rfl                       Set t1 = sub(cdl,cfl)
    L0: CUT(ri>0, {ri,rj,rs,t2,t3,t4,rm})   Set ri=ci; rj=cj; rs=cs; t2=c2; t3=c3; t4=c4; rm=cm’
                                            Assume >(ci,0)
    le t2 = ri, t1                          Set t2 = le(ci, sub(cdl,cfl))
    jeq t2, L3                              Assume le(ci, sub(cdl,cfl))
    rs = 0
    rj = 0