Typed Assembly Languages
COS 441, Fall 2004
Frances Spalding
Based on slides from Dave Walker and Greg Morrisett
12/10/2004 COS441 - Typed Assembly Languages
2
Type Safety
Progress: if a state is well-typed then it is not stuck
If ` e: then either e is a value or
9 e’ such that e !* e’
Preservation: each step in evaluation preserves typing
If ` e: and e !* e’ then ` e’:
12/10/2004 COS441 - Typed Assembly Languages
3
High Level vs. Low Level
Very large, complex compiler
that’s almost certainly full of all sorts of insidious
bugs
Type Safe Source Code
Assembly Code
If the source language has been proved to be type safe, what do we know about the safety of the code we actually run?
Nothing!
What if the compiler produced a typed assembly language?
12/10/2004 COS441 - Typed Assembly Languages
4
Proof Carrying Code
The code producer provides a proof of safety along with the executable
The code consumer checks that proof before executing the code
A typing derivation of the low-level code can be a safety proof
Source Program
Native SafetyCode Proof
Compilation &
Certification
Proof Validation
CPU
Code Producer
Code Consumer
12/10/2004 COS441 - Typed Assembly Languages
5
High-level code for factorial
12/10/2004 COS441 - Typed Assembly Languages
6
Assembly code for factorial
12/10/2004 COS441 - Typed Assembly Languages
7
TAL0 Syntax Registers: r1, r2, … Labels: L, … Integers: n Operands: v ::= r | n | L Arithmetic Ops: aop ::= add | sub | mul | … Branch Ops: bop ::= beq | bgt | … Instructions: ::= aop rd, rs, v | bop r, v | mov
r, v Blocks: B ::= jmp v | ; B
12/10/2004 COS441 - Typed Assembly Languages
8
TAL0 Dynamic Semantics
Machine State = (H,R,B) H maps labels to basic blocks R maps registers to values (integers and labels) B is a basic block (corresponding to the current
program counter) Model evaluation as a transition function
mapping machine states to machine states:
12/10/2004 COS441 - Typed Assembly Languages
9
TAL0 Dynamic Semantics
12/10/2004 COS441 - Typed Assembly Languages
10
TAL0 Static Semantics
::= int | ! {} ::= {r1:1, r2:2, …}
Code labels have type {r1:1, r2:2, …} ! {} To jump to code with this type, r1 must
contain a value of type , etc Notice that functions never return – the code
block always ends with a jump to another label
12/10/2004 COS441 - Typed Assembly Languages
11
Example Program with Types
12/10/2004 COS441 - Typed Assembly Languages
12
Type Checking Basics
We need to keep track of: The types of the registers at each point in the
code The types of the labels on the code
Heap Types: will map labels to label types Register Types: will map registers to types
12/10/2004 COS441 - Typed Assembly Languages
13
Basic Typing Judgments
; ` n: int ; ` r: (r) ; ` L : (L)
12/10/2004 COS441 - Typed Assembly Languages
14
Subtyping on Register Files
Our program will never crash if the register file contains more values than necessary to satisfy some typing precondition
Width subtyping:
{r1:1,…,ri-1:i-1,ri:i} <: {r1:1,…,ri-1:i-1}
12/10/2004 COS441 - Typed Assembly Languages
15
Subtyping on Code Types
Like ordinary function types, code types are contravariant
’ <: ! {} <: ’ ! {}
12/10/2004 COS441 - Typed Assembly Languages
16
Typing Instructions
The typing judgment for instructions describes the registers on input to the function and the registers on output
` : 1 2
is invariant. The types of heap objects cannot change as the program executes
12/10/2004 COS441 - Typed Assembly Languages
17
Typing Instructions
` rs : int ; ` v : int
` add rd, rs, v : ! [rd := int]
` r : int ; ` v : ! {}` ble r, v : !
` v : ` mov rd, v : ! [rd := ]
12/10/2004 COS441 - Typed Assembly Languages
18
Basic Block Typing
All basic blocks end with jump instructions
; ` v : ! {} ` jmp v : ! {}
Instruction Sequences
` : 1 ! 2 ` B : 2 ! {}
` ;B : 1 ! {}
12/10/2004 COS441 - Typed Assembly Languages
19
Heap and Register File Typing Heap Typing
Dom(H)=Dom() 8 L 2 Dom(H). ` H(L):(L)
` H :
Register File Typing
8 r 2 Dom(). ; ` R(r) : (r)
` R :
12/10/2004 COS441 - Typed Assembly Languages
20
Machine Typing
Main Typing Judgment
` H : ` R : ` B : ! {} ` (H,R,B)
12/10/2004 COS441 - Typed Assembly Languages
21
Type Safety for TAL0
Progress:
If ` 1 then 9 2 such that 1 2
Preservation
If ` 1 and 1 2 then ` 2
12/10/2004 COS441 - Typed Assembly Languages
22
Limitations of TAL0
What about situations where we don’t care which specific type is in a register? Eg. A function that swaps the contents of two
registers
We can use polymorphic types to express this.
12/10/2004 COS441 - Typed Assembly Languages
23
TAL1 : Polymorphism
::= … |
8 , . { r1:, r2:, r31: {r1:, r2:} ! {}} ! {} Describes a function that swaps the values in r1
and r2 for values of any two types
To jump to polymorphic functions, first explicitly instantiate type variables: v[]
12/10/2004 COS441 - Typed Assembly Languages
24
Polymorphism Example
12/10/2004 COS441 - Typed Assembly Languages
25
Callee-Saved Registers
We can use polymorphism to implement callee-saved registers We’d like to be able to enforce that a function
doesn’t change certain registers. In other words, when it returns, certain registers have the same values as when it started
8 . { r2: int, …, r5:, …., r31: {r1: int, …, r5:, …} ! {} } ! {}
Why does this enforce the invariant we want?
12/10/2004 COS441 - Typed Assembly Languages
26
TAL1 Semantics
Changes to the static semantics to support polymorphism Add the judgment ` ok Modify the other judgments to take this into
account (just like homework 8!) Changes to the dynamic semantics
Add rule to do instantiation
(H, R, jmp v[]) (H, R, B[/])
12/10/2004 COS441 - Typed Assembly Languages
27
Full TAL Syntax
12/10/2004 COS441 - Typed Assembly Languages
28
Limitations of TAL
Almost every compiler uses a stack As storage space for local variables and other
temporaries when we run out of space in registers To keep track of control flow
Control Links Activation Links
12/10/2004 COS441 - Typed Assembly Languages
29
STAL: Add a Stack
Machine States: (H, R, S, B) Stack S ::= nil | v :: S A designated register sp points to the top of the
stack
::= salloc n | sfree n | sld rd, n | sst rs, n push v ´ salloc 1; sst v, 1 pop v ´ sld r, 1; sfree 1
12/10/2004 COS441 - Typed Assembly Languages
30
Factorial Example
12/10/2004 COS441 - Typed Assembly Languages
31
Extensions for STAL
Stack types ::= nil | :: | nil represents the empty stack :: represents a stack v::S where v: and S: is a stack variable which we can use to
describe an unknown stack tail Use polymorphism
8 . {sp: int::, r1: int} ! {}
12/10/2004 COS441 - Typed Assembly Languages
32
Stack Instruction Typing
Stack Allocation
; ` salloc n : [sp := ]![sp := ?::…::?::]
Stack Free
; ` sfree : [sp := 1::…::n::]![sp := ]
12/10/2004 COS441 - Typed Assembly Languages
33
Stack Instruction Typing
Stack load
(sp) = 1 :: … :: n ::
; ` sld r, n : ! [r := n]
Stack store
; ; ` v : (sp) = 1 :: … :: n ::
; ` sst v, n : ! [sp := 1 :: … :: :: ]
12/10/2004 COS441 - Typed Assembly Languages
34
STAL Features
Polymorphism and polymorphic recursion are crucial for encoding standard procedure call/return
We didn’t have to bake in the notion of procedure call/return. Basic jumps were good enough
Can combine features to encode up a variety of calling conventions Arguments on stack or in registers? Results on stack or in registers? Return address? Caller pops? Callee pops? Caller saves? Callee saves?
12/10/2004 COS441 - Typed Assembly Languages
35
STAL Syntax
12/10/2004 COS441 - Typed Assembly Languages
36
General Limitations of TAL/STAL More functionality can be added to TAL There are still some limitations
Must know the order of data allocated on the stack sp: 1 @ @ 2 @ ’ @ 3
Must distinguish between heap and stack pointers Can’t have a function that takes in two generic int
pointers and adds their contents
12/10/2004 COS441 - Typed Assembly Languages
37
For more information on TAL/STAL From System F to Typed Assembly Languag
e. G. Morrisett, D. Walker, K. Crary, and N. Glew.
Stack-Based Typed Assembly Language. G. Morrisett, K. Crary, N. Glew, D. Walker.
12/10/2004 COS441 - Typed Assembly Languages
38
What Makes Stack Typing So Tricky? In the heap, we use the abstraction that a
location never changes type The garbage collector handles the reuse
When modeling instructions like push/pop, we have to break this abstraction and use strong updates to change types
Aliasing can make this unsound if we’re not careful
12/10/2004 COS441 - Typed Assembly Languages
39
My Research
A different approach to typed assembly language
Use logic to describe what memory looks like before and after each instruction
(r1 ) int) (r2 ) S(ℓ)) (ℓ ) bool)
load [r2], r1
(r1 ) int) (r2 ) S(ℓ)) (ℓ ) int)
12/10/2004 COS441 - Typed Assembly Languages
40
Modeling Stack “Evolution”
Keep track of the different “versions” of each stack location and how they are related
We can do this using a pair of a tree and a pointer to a node in the tree
12/10/2004 COS441 - Typed Assembly Languages
41
TOP
k3 k4 k6
Sta
ck D
ep
th
Time
k2 k5
k1
ℓ2
ℓ0
ℓ1
ℓ3
…
12/10/2004 COS441 - Typed Assembly Languages
42
Tagged Locations
All locations are “tagged” with their version number (r1 ) S(k3. ℓ2)) is different from (r1 ) S(k6. ℓ2))
The tags act as “guards” on the location Before accessing a tagged location, you must
prove that it is “live” In other words, the tag must be on the path
between the root and Top
12/10/2004 COS441 - Typed Assembly Languages
43
For more information …
Certifying Compilation for a Language with Stack Allocation [Draft]
. L. Jia, F. Spalding, D. Walker, N. Glew.