[ft-11][suhorng] “poor man's” undergraduate compilers


Upload: functional-thursday

Post on 11-May-2015


DESCRIPTION

“Poor Man's” Undergraduate Compilers --by suhorng --on Functional Thursday Meetup 11

TRANSCRIPT

Page 1: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

“Poor Man's” Undergraduate Compilers

suhorng

This slide: https://github.com/suhorng/ss/tree/master/ft11


Page 2: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

“Poor Man's” Undergraduate Compilers

What compiler?

ss: https://github.com/suhorng/ss/

A minimal functional language compiler

compiler13hw: https://github.com/suhorng/compiler13hw/

Compiler homework: compiling C-- to MIPS, written in Haskell

How poor?

It is slow

It generates slow code

Page 3: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

ss

ss originally stands for small Scheme

Written in Scheme, compiling a minimal functional language to x86-32 assembly

No data types, no optimizations, no need for parsers

Only 102 commits!

suhorng@SHHY-ASPIRE2920 /d/code/test/ss (master)
$ wc *.ss *.s
   78   363  3645 closure.ss
  192   762  9137 code-gen.ss
   86   465  3588 cps.ss
   11    24   172 issac.ss
  168   811  5584 match-case-simple.ss
   30   115   991 prelude.ss
  313  1495 14334 reg-alloc.ss
  130   477  4942 seq-ir-gen.ss
  104   304  3090 ss.ss
  166   863  8500 type-infer.ss
   85   222  2095 sscrt.s
 1363  5901 56078 total

Page 4: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

ss: A Language

Simply typed λ-calculus with constants and a fixed-point operator. Strict evaluation.

Terms

e ::= c | x | (+ e1 e2) | (× e1 e2)
    | (lambda (x1 x2 …) e)
    | (ifz con th el)
    | (fix e)
    | (e1 e2 …)

Types

t ::= N | ()
    | t1 → t2
    | (× t1 t2 …)
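As a rough illustration of the grammar above, the terms and types can be written as Haskell data types together with a small environment-based evaluator. This is only a sketch and not code from ss: every name in it (`Expr`, `Type`, `Value`, `eval`, `runInt`) is made up for this example, and because Haskell is lazy it only approximates the strict evaluation order of ss.

```haskell
import qualified Data.Map as M

-- Terms of the grammar above. All names here are illustrative, not from ss.
data Expr
  = Lit Int                -- c
  | Var String             -- x
  | Add Expr Expr          -- (+ e1 e2)
  | Mul Expr Expr          -- (× e1 e2)
  | Lam [String] Expr      -- (lambda (x1 x2 …) e)
  | Ifz Expr Expr Expr     -- (ifz con th el)
  | Fix Expr               -- (fix e)
  | App Expr [Expr]        -- (e1 e2 …)
  deriving (Show, Eq)

-- Types: N, (), t1 → t2 and (× t1 t2 …). Not used by the evaluator.
data Type = TNat | TUnit | TFun Type Type | TProd [Type]
  deriving (Show, Eq)

-- Run-time values: numbers and functions closed over their environment.
data Value = IntV Int | FunV Env [String] Expr

type Env = M.Map String Value

arith :: (Int -> Int -> Int) -> Value -> Value -> Value
arith op (IntV a) (IntV b) = IntV (op a b)
arith _ _ _ = error "arithmetic on a non-number"

eval :: Env -> Expr -> Value
eval _ (Lit n) = IntV n
eval env (Var x) = env M.! x
eval env (Add a b) = arith (+) (eval env a) (eval env b)
eval env (Mul a b) = arith (*) (eval env a) (eval env b)
eval env (Lam xs body) = FunV env xs body
eval env (Ifz c t e) = case eval env c of
  IntV 0 -> eval env t
  IntV _ -> eval env e
  _ -> error "ifz: condition is not a number"
eval env (Fix f) = case eval env f of
  -- Tie the knot: the function body sees its own value under `self`.
  FunV cenv [self] body ->
    let v = eval (M.insert self v cenv) body in v
  _ -> error "fix: expected a one-argument lambda"
eval env (App f args) = case eval env f of
  FunV cenv xs body ->
    -- Arguments are evaluated in the caller's environment;
    -- parameters shadow captured variables (M.union is left-biased).
    let vs = map (eval env) args
    in eval (M.union (M.fromList (zip xs vs)) cenv) body
  _ -> error "application of a non-function"

-- Evaluate a closed program that is expected to produce a number.
runInt :: Expr -> Int
runInt e = case eval M.empty e of
  IntV n -> n
  FunV {} -> error "result is a function"
```

In GHCi, `runInt (App (Lam ["a"] (Var "a")) [Add (Lit 1) (Lit 2)])` evaluates a small application of this kind.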


Page 5: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

Passes

type inference

CPS transformation

closure conversion

transform into low-level IR

register allocation

machine code generation

Page 6: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

Type Inference

Interpreter,

[() `(() ,(prim-type 'Unit))]

[,x (guard (var? x))
 `(,x ,(assq x mono-cxt))]

[(fix ,e)
 (let [(a (fresh-var))
       (built-e (build-type! e mono-cxt))]
   (unify! (expr->type built-e) (fun-type a a))
   `((fix ,built-e) ,a))]

[(lambda (,[xs ..]) ,e)
 (let* [(obj-as (map (lambda (_) (fresh-var)) xs))
        (built-e (build-type! e (append (map cons xs obj-as) mono-cxt)))
        (obj-b (expr->type built-e))
        (obj-xs (map list xs obj-as))]
   `((lambda (,@obj-xs) ,built-e)
     ,(fun-type (tuple-type obj-as) obj-b)))]

[(,e1 ,[es ..])
 (let* [(built-e1 (build-type! e1 mono-cxt))
        (built-es (map (lambda (e) (build-type! e mono-cxt)) es))
        (obj-a2b (expr->type built-e1))
        (obj-as (map expr->type built-es))
        (obj-b (fresh-var))]
   (unify! obj-a2b (fun-type (tuple-type obj-as) obj-b))
   `((,built-e1 ,@built-es) ,obj-b))]
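The clauses above lean on `unify!` to equate a type containing fresh variables with an expected shape. A minimal sketch of first-order unification in Haskell might look as follows. It is not the ss implementation: ss mutates type variables in place, whereas this version threads an explicit substitution, and the names `Ty`, `Subst`, `apply`, `occurs` and `unify` are assumptions made for the example.

```haskell
import Control.Monad (foldM)
import qualified Data.Map as M

-- Illustrative type representation; type variables are numbered.
data Ty = TInt | TUnit | TVar Int | TFun Ty Ty | TTuple [Ty]
  deriving (Show, Eq)

type Subst = M.Map Int Ty

-- Resolve a type under the substitution, following variable bindings.
apply :: Subst -> Ty -> Ty
apply s t@(TVar v) = maybe t (apply s) (M.lookup v s)
apply s (TFun a b) = TFun (apply s a) (apply s b)
apply s (TTuple ts) = TTuple (map (apply s) ts)
apply _ t = t

-- Does variable v occur in the (already resolved) type?
occurs :: Int -> Ty -> Bool
occurs v (TVar w) = v == w
occurs v (TFun a b) = occurs v a || occurs v b
occurs v (TTuple ts) = any (occurs v) ts
occurs _ _ = False

-- Extend the substitution so that both types become equal, or fail.
unify :: Ty -> Ty -> Subst -> Either String Subst
unify a b s = go (apply s a) (apply s b)
  where
    go (TVar v) t = bind v t
    go t (TVar v) = bind v t
    go TInt TInt = Right s
    go TUnit TUnit = Right s
    go (TFun a1 b1) (TFun a2 b2) = unify a1 a2 s >>= unify b1 b2
    go (TTuple xs) (TTuple ys)
      | length xs == length ys =
          foldM (\acc (x, y) -> unify x y acc) s (zip xs ys)
    go t u = Left ("cannot unify " ++ show t ++ " with " ++ show u)

    bind v t
      | t == TVar v = Right s
      | occurs v t = Left ("occurs check failed for variable " ++ show v)
      | otherwise = Right (M.insert v t s)
```

The application clause corresponds to unifying the function's type with `TFun (TTuple argTypes) (TVar fresh)` and reading the result type back through `apply`; the `fix` clause corresponds to unifying with `TFun a a`.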


Page 7: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

CPS Transformation

interpreter,

(define cpsk
  ;; cpsk :: eT -> {(eT -> eC) | kC} -> eC
  (lambda (expr k)
    (match expr
      [(,c ,t) (guard (prim-const? c))
       (apply-cont k `(,c ,t))]
      [(,x ,t) (guard (var? x))
       (apply-cont k `(,x ,t))]
      [((lambda (,[xs ..]) ,e) ,t)
       (let [(k0 (fresh-var "&"))]
         (apply-cont k `((lambda ,xs ,k0 ,(cpsk e k0)) ,t)))]
      [((fix ,e) ,t)
       (cpsk e (lambda (v)
                 `((fix ,(mark-type v e) ,(place-cont k)) ,t)))]
      ...

(define apply-cont
  (lambda (k x)
    (cond
      [(procedure? k) (k x)]
      [else `(cont-ap ,k ,x)])))

(define place-cont
  (lambda (k)
    (cond
      [(procedure? k)
       (let [(t (fresh-var "%"))]
         `(lambda (,t) ,(k t)))]
      [else k])))
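The key idea in `cpsk` is that a continuation is either a compile-time function or the name of a run-time continuation variable, with `apply-cont` and `place-cont` converting between the two views. The following Haskell sketch mirrors that idea for an untyped, single-argument fragment. It is an approximation under stated assumptions rather than a port of ss: the data types, the hand-rolled `Fresh` counter monad and the top-level continuation name `halt` are all invented for this example.

```haskell
-- Source terms (untyped, single-argument fragment; illustrative only).
data Expr = Lit Int | Var String | Lam String Expr | App Expr Expr | Add Expr Expr
  deriving (Show, Eq)

-- CPS terms: atoms are values; every call or primitive names its continuation.
data Atom = ALit Int | AVar String | ALam String String CExpr  -- param, cont param, body
  deriving (Show, Eq)
data Cont = KVar String | KLam String CExpr
  deriving (Show, Eq)
data CExpr = ContAp Cont Atom | CApp Atom Atom Cont | CAdd Atom Atom Cont
  deriving (Show, Eq)

-- A tiny fresh-name supply, threaded as a counter.
newtype Fresh a = Fresh { runFresh :: Int -> (a, Int) }
instance Functor Fresh where
  fmap f (Fresh g) = Fresh $ \n -> let (a, n') = g n in (f a, n')
instance Applicative Fresh where
  pure a = Fresh $ \n -> (a, n)
  Fresh f <*> Fresh g = Fresh $ \n ->
    let (h, n1) = f n
        (a, n2) = g n1
    in (h a, n2)
instance Monad Fresh where
  Fresh g >>= f = Fresh $ \n -> let (a, n') = g n in runFresh (f a) n'

fresh :: String -> Fresh String
fresh prefix = Fresh $ \n -> (prefix ++ show n, n + 1)

-- A continuation is a meta-level function or an object-level variable.
data K = Meta (Atom -> Fresh CExpr) | Obj String

-- Counterpart of apply-cont: use the continuation on a value.
applyCont :: K -> Atom -> Fresh CExpr
applyCont (Meta k) x = k x
applyCont (Obj k) x = pure (ContAp (KVar k) x)

-- Counterpart of place-cont: turn the continuation into a term.
placeCont :: K -> Fresh Cont
placeCont (Meta k) = do
  t <- fresh "%"
  body <- k (AVar t)
  pure (KLam t body)
placeCont (Obj k) = pure (KVar k)

cpsk :: Expr -> K -> Fresh CExpr
cpsk (Lit n) k = applyCont k (ALit n)
cpsk (Var x) k = applyCont k (AVar x)
cpsk (Lam x e) k = do
  k0 <- fresh "&"
  body <- cpsk e (Obj k0)
  applyCont k (ALam x k0 body)
cpsk (App f a) k =
  cpsk f $ Meta $ \fv ->
    cpsk a $ Meta $ \av ->
      CApp fv av <$> placeCont k
cpsk (Add a b) k =
  cpsk a $ Meta $ \av ->
    cpsk b $ Meta $ \bv ->
      CAdd av bv <$> placeCont k

-- Convert a whole program, returning to a hypothetical `halt` continuation.
runCps :: Expr -> CExpr
runCps e = fst (runFresh (cpsk e (Obj "halt")) 1)
```

Passing a `Meta` continuation avoids generating an administrative `lambda` for every intermediate value; one is only materialised when `placeCont` is forced to produce a term, as in the argument of `App (Var "f") (Add (Lit 1) (Lit 2))`.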


Page 8: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

Closure Conversion

interpreter...

Compute free variables

(match expr
  [(,c ,t) (guard (prim-const? c))
   '( () )]
  [(,x ,t) (guard (var? x))
   `( ((,x ,t)) )]
  [((lambda (,[xs ..]) ,k ,e) ,t) ; lambda abstraction
   (let [(var/e (uncover-free-vars e))]
     `(,(remove-assoc* (map car xs) (car var/e)) ,var/e))]

Closure conversion

[((,x ,t) __) (guard (var? x))
 (cond
   [(memq x bound-vars) `(,x ,t)]
   [else `((this-ref ,x) ,t)])]
[(((lambda (,[xs ..]) ,k ,e) ,t) (,free-vars ,var/e)) ; lambda abstraction
 (let [(fv-ref (map (lambda (x) (closure-convert x `((x)) bound-vars))
                    free-vars))]
   `((closure ,fv-ref
              ((lambda ,xs ,k ,(closure-convert e var/e (map car xs))) ,t))
     ,t))]
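The two steps above, computing free variables and then rewriting free references into `this-ref` lookups, can be sketched in Haskell on a small untyped term language. This is an assumed simplification, not the ss code: it omits types and continuation parameters, recomputes free variables instead of caching them alongside each node, and the names `Expr`, `CExpr`, `freeVars` and `convert` are invented for the example.

```haskell
import Data.List (nub)

-- Source terms (illustrative, untyped).
data Expr = Lit Int | Var String | Lam [String] Expr | App Expr [Expr]
  deriving (Show, Eq)

-- Closure-converted terms.
data CExpr
  = CLit Int
  | CVar String                               -- a parameter of the enclosing lambda
  | ThisRef String                            -- (this-ref x): read x from the closure
  | Closure [(String, CExpr)] [String] CExpr  -- captured variables, parameters, body
  | CApp CExpr [CExpr]
  deriving (Show, Eq)

-- Free variables, without duplicates.
freeVars :: Expr -> [String]
freeVars (Lit _) = []
freeVars (Var x) = [x]
freeVars (Lam xs e) = filter (`notElem` xs) (freeVars e)
freeVars (App f as) = nub (concatMap freeVars (f : as))

-- `bound` holds the parameters of the innermost enclosing lambda;
-- any other variable must come from the closure record.
convert :: [String] -> Expr -> CExpr
convert _ (Lit n) = CLit n
convert bound (Var x)
  | x `elem` bound = CVar x
  | otherwise = ThisRef x
convert bound e@(Lam xs body) =
  -- Each captured variable records how to fetch it in the enclosing scope.
  Closure [(x, convert bound (Var x)) | x <- freeVars e] xs (convert xs body)
convert bound (App f as) = CApp (convert bound f) (map (convert bound) as)
```

For `(lambda (a) ((lambda (b) a) 5))`, the inner lambda captures `a` from its parent's parameters and its body reads it back through `ThisRef "a"`.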

Page 9: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

Now the code looks like

((lambda (a) ((lambda (b) a) 5)) (+ 1 2))

((closure ()
   ((lambda ((argv Unit)) &1
      ((+ (1 Int) (2 Int)
          (lambda (%2)
            ((((closure ()
                 ((lambda ((a Int)) &3
                    ((((closure ((a Int))
                         ((lambda ((b Int)) &4
                            (cont-ap &4 ((this-ref a) Int)))
                          (Int -> Int)))
                       (Int -> Int))
                      (5 Int))
                     &3 Int))
                  (Int -> Int)))
               (Int -> Int))
              (%2 Int))
             &1 Int)))
       Int))
    (Unit -> Int)))
 (Unit -> Int))

Page 10: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

Now the code looks like

((lambda (a) ((lambda (b) a) 5)) (+ 1 2))

((closure ()
   ((lambda ((argv Unit)) &1
      ((+ (1 Int) (2 Int)
          (lambda (%2)
            ((((closure ()
                 ((lambda ((a Int)) &3
                    ((((closure ((a Int))
                         ((lambda ((b Int)) &4
                            (cont-ap &4 ((this-ref a) Int))))))
                      (5 Int))
                     &3)))))
              (%2 Int))
             &1))))))))

Page 11: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

Low-level IR

Flatten the continuations, lift functions to top level

(((closure ::fn1 ()) (Unit -> Int))

 ((lambda ::fn1 (Unit -> Int) () ((argv Unit))
    ((%2 : Int <- (1 + 2))
     (tail-call (function ::fn2) %2)))

  (lambda ::fn2 (Int -> Int) () ((a Int))
    ((%f1 : (Int -> Int) <- (closure ::fn3 (a)))
     (tail-call %f1 5)))

  (lambda ::fn3 (Int -> Int) ((a Int)) ((b Int))
    ((ret (this-ref a))))))

Here comes a machine!

Page 12: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

Register Allocation

A MESS :-D


Page 13: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

Machine Code Generation

From pseudo-assembly...

(lambda ::fn1 (Unit -> Int) () ((argv Unit))
  ((argv Unit)) 1 () ()
  ((make-call-stack 1)
   (eax <- (const 3))
   ((arg 0) <- eax)
   (edi <- (function ::fn2))
   (tail-call 1 (function ::fn2))))

(lambda ::fn2 (Int -> Int) () ((a Int))
  ((a Int) (%f1 Int -> Int)) 2 (ebx) ()
  ((make-call-stack 2)
   (eax <- (function ::fn3))
   ((arg 0) <- eax)
   (eax <- (const 1))
   ((arg 1) <- eax)
   (call-prim 2 make_closure)
   (edx <- (formal a))
   ((closure eax 0) <- edx)
   (make-call-stack 1)
   (ebx <- eax)
   (eax <- (const 5))
   ((arg 0) <- eax)
   (edi <- ebx)
   (tail-call 1 ebx)))

(lambda ::fn3 (Int -> Int) ((a Int)) ((b Int))
  ((a Int) (b Int)) 0 () ()
  ((eax <- (this-ref a))
   return)))

Page 14: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

Machine Code Generation

To concrete machine code

(lambda _ss_function_fn1 (Unit -> Int) () ((argv Unit))
  ((argv Unit)) ()
  ((sub esp 4)
   (mov eax 3)
   (mov (* (esp + 0)) eax)
   (mov edi _ss_function_fn2)
   (lea edx (* (esp + 4)))
   (mov ecx (* (esp + 4)))
   (mov eax (* (esp + 0)))
   (mov (* (edx + 4)) eax)
   (mov (* (edx)) ecx)
   (mov esp edx)
   (jmp _ss_function_fn2_code)))

(lambda _ss_function_fn2 (Int -> Int) () ((a Int))
  ((a Int) (%f1 Int -> Int)) ()
  ((push ebp)
   (mov ebp esp)
   (sub esp 4)
   (mov (* (ebp - 4)) ebx)
   (sub esp 8)
   (mov eax _ss_function_fn3)
   (mov (* (esp + 0)) eax)
   (mov eax 1)
   (mov (* (esp + 4)) eax)
   (call _ss_prim_make_closure)
   (mov edx (* (ebp + 8)))
   (mov (* (eax + 4)) edx)
   (sub esp 4)
   (mov ebx eax)
   (mov eax 5)
   (mov (* (esp + 0)) eax)
   (mov edi ebx)
   (mov ebx (* (ebp - 4)))
   (mov edx (* (ebp)))
   (mov ecx (* (ebp + 4)))
   (lea ebp (* (esp + 12)))
   (mov eax (* (esp + 0)))
   (mov (* (ebp + 4)) eax)
   (mov (* (ebp)) ecx)
   (mov esp ebp)
   (mov ebp edx)
   (jmp (* (edi)))))

(lambda _ss_function_fn3 (Int -> Int) ((a Int)) ((b Int))
  ((a Int) (b Int)) ()
  ((mov eax (* (edi + 4)))
   (ret 4))))

Page 15: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

Tail Calls

Loops

((fix (lambda (loop)
        (lambda (n sum)
          (ifz n
               sum
               (loop (+ n -1) (+ sum n))))))
 5 0)

Compare:

int sum = ???;
for (int i = n; i != 0; i = i-1)
    sum = sum + i;
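The correspondence between the `fix`-based loop and the `for` loop can also be checked in Haskell. The sketch below uses `fix` from `Data.Function`; the names `sumTo` and `sumLoop` are made up for this example, and both assume `n` is non-negative (for a negative `n` the recursive version, like the Scheme one, never reaches zero).

```haskell
import Data.Function (fix)

-- The loop written with a fixed-point operator, as in the Scheme program:
-- the recursive call is in tail position and carries the running sum.
sumTo :: Int -> Int -> Int
sumTo = fix (\loop n acc -> if n == 0 then acc else loop (n - 1) (acc + n))

-- The same computation as an explicit iteration over n, n-1, ..., 1,
-- mirroring the for loop.
sumLoop :: Int -> Int -> Int
sumLoop n acc0 = foldl (\acc i -> acc + i) acc0 [n, n - 1 .. 1]
```

Both `sumTo 5 0` and `sumLoop 5 0` evaluate to 15, matching the Scheme call with arguments `5 0`.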

Naively implementing tail calls:

Place function call arguments as usual

Move arguments

Adjust frame pointer; jump.
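The three steps can be modelled on a toy stack to see why the stack does not grow. This is a simplified model under stated assumptions, not how ss represents frames: the stack is a list with its top first, the callee's frame is assumed to be laid out as return address followed by its arguments, and `Slot` and `naiveTailCall` are names invented for the example.

```haskell
-- Stack slots, top of the stack first (illustrative model only).
data Slot = RetAddr String | Arg Int
  deriving (Show, Eq)

-- Model a tail call made from a function that was called with `oldArity`
-- arguments, passing `newArgs` to the callee.
naiveTailCall :: Int -> [Int] -> [Slot] -> [Slot]
naiveTailCall oldArity newArgs stack =
  let -- 1. Place the new arguments as for an ordinary call.
      pushed = map Arg newArgs ++ stack
  in case splitAt (length newArgs) pushed of
       -- 2. Move the new arguments up over the old ones, keeping the
       --    original return address on top.
       -- 3. The resulting list is the adjusted stack; real code would now jump.
       (placed, ret : old) -> ret : placed ++ drop oldArity old
       _ -> error "malformed frame: no return address"
```

With the stack `[RetAddr "caller", Arg 5, Arg 0, Arg 99]`, the call `naiveTailCall 2 [4, 5]` yields `[RetAddr "caller", Arg 4, Arg 5, Arg 99]`: the frame is reused and the caller's slot `Arg 99` is untouched.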

Page 16: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

Tail Calls

;#######################################
; _ss_function_fn3: ((* Int Int) -> Int)
; parameters:
;   ((n Int) (sum Int))
; free variables:
;   ((loop ((* Int Int) -> Int)))
;#######################################
_ss_function_fn3_code:
    ; Note: this function doesn't have a frame
    cmp dword [esp + 4], 0
    jne .L1
    mov eax, [esp + 8]
    ret 8                   ; terminating loop
.L1:
    mov eax, [esp + 8]      ; eax := sum
    add eax, [esp + 4]      ; eax (sum') += n
    mov edx, [esp + 4]
    add edx, -1             ; edx (n') := n - 1
    sub esp, 8              ; place arguments as usual
    mov [esp], edx          ; | sum' | esp+4
    mov [esp + 4], eax      ; | n'   | esp
    mov edi, [edi + 4]      ; load closure pointer
    lea edx, [esp + 8]
    mov ecx, [esp + 8]      ; move new arguments up
    mov eax, [esp + 4]
    mov [edx + 8], eax      ; sum' overrides sum,
    mov eax, [esp]
    mov [edx + 4], eax      ; so does n'!
    mov [edx], ecx
    mov esp, edx
    jmp [edi]

Page 17: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

compiler13hw

A compiler for a small subset of C, implemented in Haskell / suhorng & kevin4314

(nothing special)

Parsing is done using Happy

cf. Happy MonadFix, Easy n-pass compiler / CindyLinz

Page 18: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

compiler13hw

Yet the register allocation is still a mess :-D

do
  {- read input & parsing -}
  let Right parsedAST = Parser.parse input

  {- semantic check -}
  let compareCompileError ce1 ce2 = compare (errLine ce1) (errLine ce2)
  (ast, ces) <- runWriterT $ censor (sortBy compareCompileError) $ do
    foldedAST <- Const.constFolding parsedAST
    typeInlinedAST <- Desugar.tyDesugar foldedAST
    let decayedAST = Desugar.fnArrDesugar typeInlinedAST
    symbolAST <- SymTable.buildSymTable decayedAST
    typedAST <- TypeCheck.typeCheck symbolAST
    return $ NormalizeAST.normalize typedAST
  when (not $ null ces) $ mapM_ (putStrLn . show) ces >> exit1

  {- code generation -}
  let adjustedAST = SethiUllman.seull ast
      llir = LLIRTrans.llirTrans adjustedAST
      inlinedLLIR = EmptyBlockElim.elim llir
      llirFuncs = LLIR.progFuncs inlinedLLIR
      llirGlobl = LLIR.progVars inlinedLLIR
      llirRegs = LLIR.progRegs inlinedLLIR

  let mips = MIPSTrans.transProg $ inlinedLLIR
      simpMips = BlockOrder.jumpElim . BlockOrder.blockOrder $ mips
  print simpMips

Page 19: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

compiler13hw -- all are interpreters

alphaConvAST s@(S.Block _ _) = runLocal $ alphaConvBlock' s
alphaConvAST (S.Expr ty line rand rators) = S.Expr ty line rand <$> mapM alphaConvAST rators
alphaConvAST (S.ImplicitCast ty' ty e) = S.ImplicitCast ty' ty <$> alphaConvAST e

buildMStmt (P.While line whcond whcode) = do
  whcond' <- buildMStmts whcond
  whcode' <- runLocal (buildMStmt whcode)
  return $ S.While line whcond' whcode'
buildMStmt (P.Identifier line name) = do
  currScope <- get
  upperScope <- ask
  let ty = fmap S.varType $ lookup name currScope <|> lookup name upperScope
  case ty of
    Just S.TTypeSyn -> tell [errorAt line $ "Unexpected type synonym '" ++ name ++ "'"]
    Nothing -> tell [errorAt line $ "Undeclared identifier '" ++ name ++ "'"]
    otherwise -> return ()
  return $ S.Identifier (error "buildMStmt:Identifier") line name

tyCheckAST (S.Expr _ line rator [rand1, rand2])
  | rator `elem` logicOps = do
    rand1' <- tyCheckAST rand1
    rand2' <- tyCheckAST rand2
    let (t1, t2) = (S.getType rand1', S.getType rand2')
    when ((not $ tyIsScalarType t1) || (not $ tyIsScalarType t2)) $
      tell [errorAt line $ "'" ++ show rator ++ "' is applied to operands of non-scalar type"]
    return $ S.Expr S.TInt line rator [rand1', rand2']

Page 20: [FT-11][suhorng] “Poor Man's” Undergraduate Compilers

Writing Compilers Using Functional Languages

Interpreters love you!

Simple AST (Pointers? new, delete? Visitor pattern? Wat?)

I don't know how to deal with similar ASTs

Don't know how to express constraints on ASTs (e.g. only allow a subset of operators)

Haven't thought of good ways to implement register allocation