Toy to practical interpreter: Mosh internals, Shibuya.Lisp 2009/02/28

Toy to practical: Mosh internals. Taro Minowa (Higepon), Shibuya.Lisp Tech Talk #2, February 28, 2009

Upload: higepon-taro-minowa

Post on 29-Jun-2015

DESCRIPTION

Mosh, an R6RS Scheme interpreter: from a slow toy interpreter to a fast, practical interpreter.

TRANSCRIPT

Page 1: Toy to practical interpreter Mosh internals Shibuya.Lisp 2009/02/28

Toy to practical

Taro Minowa (Higepon)

Shibuya.Lisp Tech Talk #2

February 28, 2009

Mosh internals

Page 2:

Introducing myself

Page 4:

Mona: open source OS

Mosh: fast Scheme interpreter

Outputz: http://outputz.com/

Page 5:

Today’s presentation is about…

Page 6:

From toy to practical interpreter

Page 7:

Mosh

Page 8:

R6RS Scheme interpreter

As fast as Gauche and Ypsilon (I believe)

Many SRFIs, DBI (MySQL)

Regexp (Oniguruma)

Object system (Tiny CLOS)

Process Management

Foreign Function Interface

Page 9:

Development

Version 0.0.7

2 committers

Higepon

kokosabu

In the future

Use as a shell for Mona

Page 10:

Toy

Page 11:

The MITOH 2006

Scheme shell for Mona

OS integrated R5RS Scheme

Implementation based on SICP

My first interpreter!

Basic tree-based interpreter

Written in “Pure C++”

Page 12:

Good points

It works

Almost covers R5RS

Bad points

Too slow

fib(31) takes a few minutes!

Not for practical use

Page 13:

Why was it slow and bad?

Problems

Slow GC

Scheme recursion uses native stack

Slow environment lookup

Incomplete tail call optimization

Slow arithmetic

Few optimizations

Page 14:

Slow arithmetic

Too many heap allocations

(+ 1 1) => new Number(1 + 1)

So fib(31) causes ...

With slow GC

Page 15:

Learned from toy

A slow interpreter is useless

A slow interpreter is not practical

Need more speed

Need better design

Page 16:

Tree-based → VM

Page 17:

Read “3imp”

Three Implementation Models for Scheme

R. Kent Dybvig (Chez Scheme)

Bytecode VM

With sample code

http://mono.kmc.gr.jp/~yhara/w/?Reading3imp.pdf#l13

Page 18:

Choose stack-based VM

Fast environment lookup

Use display closure

Use virtual stack

tail call optimization
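
A display closure, as described in 3imp, keeps copies of its free variables inside the closure object, so a variable reference becomes an indexed load rather than a search through environment frames. The sketch below illustrates the idea in C++; the `Closure` layout and the names `refer_free` and `call_add3` are hypothetical and are not Mosh's actual types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object layout, for illustration only.
struct Closure {
    // Values of the free variables, copied in when the closure is created.
    std::vector<long> free_vars;
};

// A free-variable reference compiles to a slot index, so lookup is a
// single array access instead of a search by name.
long refer_free(const Closure& c, std::size_t n) {
    assert(n < c.free_vars.size());
    return c.free_vars[n];
}

// Body of (lambda (x) (+ x a b)) where a and b are free variables.
// The compiler is assumed to have resolved a -> slot 0 and b -> slot 1.
long call_add3(const Closure& c, long x) {
    return x + refer_free(c, 0) + refer_free(c, 1);
}
```

The cost of a lookup no longer depends on how deeply the closure is nested, which is the property the slide calls fast environment lookup.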

Page 19:

3imp doesn’t have

Multiple values: Values register

Global variables: Global hash-table

Subr: Hook on APPLY instruction

let: LET_FRAME instruction

Optimization on compilation: Borrowed from Gauche

Page 20:

Second implementation: Mini Mosh

Write in Scheme instead of C++

Easy to make prototype

no need to parse, we have “read”

never SEGV (very important!)

We can use backend’s procedures

OP_CAR => car

Page 21:

Prototyping

Rewrote about 50 times

VM: 1400 lines, compiler: 2400 lines

Page 22:

Hardest part

Designing stack layout

Wrong stack position

change stack layout => crash

some code works, other code doesn’t

Bugs in compiler, VM or design?

A pen and a notebook are more than friends.

Page 23:

Instruction example

(+ 1 1) => '(CONST 1 PUSH CONST 1 NUMBER_ADD)

VM

[(NUMBER_ADD) (apply-native-2arg +)]

[(CONSTANT) (val1) (VM codes (skip 1) (next 1) fp c stack sp)]

half
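
The instruction sequence above can be traced with a tiny accumulator-plus-stack machine. This is a minimal sketch in C++, not Mosh's VM: the `HALT` opcode, the encoding of operands inline after the opcode, and the `run` function are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Opcodes; operands are stored inline in the same array.
enum Op : long { CONST, PUSH, NUMBER_ADD, HALT };

// Execute the code and return the accumulator when HALT is reached.
long run(const std::vector<long>& code) {
    long acc = 0;             // accumulator register
    std::vector<long> stack;  // the VM's own (virtual) stack
    std::size_t pc = 0;
    for (;;) {
        assert(pc < code.size());
        switch (code[pc++]) {
        case CONST:  // load the inline operand into the accumulator
            assert(pc < code.size());
            acc = code[pc++];
            break;
        case PUSH:  // save the accumulator on the stack
            stack.push_back(acc);
            break;
        case NUMBER_ADD:  // acc = popped value + acc
            assert(!stack.empty());
            acc = stack.back() + acc;
            stack.pop_back();
            break;
        case HALT:
            return acc;
        default:
            assert(false && "unknown opcode");
            return 0;
        }
    }
}
```

Running it on `CONST 1 PUSH CONST 1 NUMBER_ADD HALT` leaves 2 in the accumulator: the first constant is pushed, the second stays in the accumulator, and `NUMBER_ADD` combines them.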

Page 24:

Improvement in Mini Mosh

Problems of toy | Solutions

Slow GC | -

Scheme recursion uses native stack | Stack VM uses virtual stack

Slow environment lookup | Fast lookup using display closure

Incomplete tail call optimization | Tail call optimization

Slow arithmetic | -

Few optimizations | Borrowed from Gauche

Page 25:

Port prototype to C++

Page 26:

VM

Easy to port

maps cond to switch/case

maps recursive call to loop

Compiler

Painful

Page 27:

Compile compiler with Mini Mosh

Embed the instructions to C++ Mosh

[Diagram: Mini Mosh (Scheme), run on Gauche, reads compiler (Scheme) and compiles it into a list of instructions (LREF 0, PUSH, GREF 'compile, APPLY, ...). From that list, compiler.cpp is generated (make(LREF, 0), make(PUSH), make(GREF, "compile"), ...) and embedded in Mosh (C++).]

Page 28:

Share the compiler written in Scheme

[Diagram: Mini Mosh (Scheme) and Mosh (C++) share compiler (Scheme).]

Easy to debug

Easy to process intermediate code

Page 29:

VM in C++

Use Boehm GC

Much faster than toy

fib(31) takes only a few seconds

(+ 1 1) doesn’t need heap allocation

Tag bit based Object system

Use immediate value for Number
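
One common way to avoid allocating for `(+ 1 1)` is to store small integers directly in the object word and mark them with a tag bit. The sketch below assumes a one-bit tag in the lowest bit (1 means fixnum, 0 would mean heap pointer); the slides do not state Mosh's actual encoding, and all names here are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// A Scheme object is one machine word.
using Object = std::intptr_t;

// Encode an integer as an immediate value: shift left, set the tag bit.
inline Object make_fixnum(std::intptr_t n) {
    return static_cast<Object>((static_cast<std::uintptr_t>(n) << 1) | 1u);
}

// The tag bit tells fixnums apart from (aligned) heap pointers.
inline bool is_fixnum(Object o) { return (o & 1) != 0; }

// Decode by shifting the tag away. This relies on arithmetic right shift
// for negative values, which mainstream compilers provide.
inline std::intptr_t fixnum_value(Object o) { return o >> 1; }

// Addition works on the words themselves; no heap allocation is involved.
inline Object fixnum_add(Object a, Object b) {
    assert(is_fixnum(a) && is_fixnum(b));
    return make_fixnum(fixnum_value(a) + fixnum_value(b));
}
```

Compared with `new Number(1 + 1)`, nothing here touches the allocator or the garbage collector. The sketch ignores overflow into bignums.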

Page 30:

Improvement in C++ Mosh

Problems of toy | Solutions

Slow GC | Boehm GC

Scheme recursion uses native stack | Stack VM uses virtual stack

Slow environment lookup | Fast lookup using display closure

Incomplete tail call optimization | Tail call optimization

Slow arithmetic | Tag bit object system

Few optimizations | Borrowed from Gauche

Page 31:

Has it become a practical, fast interpreter?

Page 32:

Not yet.

Page 33:

2 bearded gurus (the bearded big brothers)

Gauche & Ypsilon

Speed freak

http://osdevj.g.hatena.ne.jp/osdevj/20060807/1154962935

[Bar chart: C, Java, Gauche, CINT, Perl, Python, Ruby 1.8; horizontal axis 0 to 300 msec]

Page 34:

We need performance tuning

Chart

Profiler

Tuning

Fast startup

Many optimization techniques

Page 35:

Chart

Make the goal clear

Know whether what I’ve done is good or bad

Run benchmarks

make bench

Draw charts every time

Page 36:

Profiler

C++ profiler tells us little

It all happens inside the run-loop

We need a Scheme profiler

mosh -p

SIGPROF
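
A sampling profiler of this kind can be built on the POSIX profiling timer: `setitimer(ITIMER_PROF, ...)` delivers `SIGPROF` at a fixed interval of CPU time, and the handler records which Scheme procedure the run-loop is executing. The sketch below shows the general technique only and is not the code behind `mosh -p`; `g_current_proc`, `g_samples`, and the function names are hypothetical.

```cpp
#include <signal.h>
#include <sys/time.h>

constexpr int kMaxProcs = 256;

// Hypothetical hook: the VM run-loop stores the id of the Scheme
// procedure it is currently executing here.
volatile sig_atomic_t g_current_proc = 0;

// One sample counter per procedure id.
volatile sig_atomic_t g_samples[kMaxProcs];

// Runs on every SIGPROF tick; it only touches sig_atomic_t data.
extern "C" void on_sigprof(int) {
    const int id = g_current_proc;
    if (id >= 0 && id < kMaxProcs) {
        g_samples[id] = g_samples[id] + 1;
    }
}

// Start sampling every interval_usec microseconds of CPU time.
bool start_profiler(long interval_usec) {
    if (interval_usec <= 0) return false;

    struct sigaction sa = {};
    sa.sa_handler = on_sigprof;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGPROF, &sa, nullptr) != 0) return false;

    struct itimerval timer = {};
    timer.it_interval.tv_sec = interval_usec / 1000000;
    timer.it_interval.tv_usec = interval_usec % 1000000;
    timer.it_value = timer.it_interval;
    return setitimer(ITIMER_PROF, &timer, nullptr) == 0;
}

// Stop sampling; a zeroed itimerval disarms the timer.
bool stop_profiler() {
    struct itimerval off = {};
    return setitimer(ITIMER_PROF, &off, nullptr) == 0;
}
```

After a run, the procedures with the largest counts in `g_samples` are where the interpreter spent most of its CPU time, which is information a C++ profiler looking only at the run-loop cannot give.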

Page 38:

Fast startup is also important

Just running an empty script takes ...

[Bar chart: startup time of Perl, Ruby, Gauche, Python, Ypsilon, Mosh; horizontal axis 0 to 80 msec]

Page 39:

Mosh startup: 80 => 20 msec

Don’t read many files at startup

Don’t allocate large memory

Don’t use too many static initializers

Embed the compiler in binary format

FASL (Fast Loading)

Page 40:

Optimizations

Page 41:

Compiler

Beta reduction

Procedure inlining

Constant folding

(+ 1 2) => 3

Peephole optimization

e.g. the destination of a jump is another jump
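
The jump-to-jump case can be illustrated with a small peephole pass that retargets every unconditional jump to its final destination. This is a sketch under assumed data structures (`Insn`, `Op`, and `thread_jumps` are hypothetical names, and targets are plain indices into the code vector), not Mosh's compiler code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Jump, Other };

struct Insn {
    Op op;
    std::size_t target;  // destination index; only meaningful for Jump
};

// Follow chains of unconditional jumps and point each Jump at the final
// destination. The hop limit keeps a cycle of jumps from looping forever.
void thread_jumps(std::vector<Insn>& code) {
    for (Insn& insn : code) {
        if (insn.op != Op::Jump) continue;
        std::size_t dest = insn.target;
        for (std::size_t hops = 0; hops < code.size(); ++hops) {
            assert(dest < code.size());
            if (code[dest].op != Op::Jump) break;
            dest = code[dest].target;
        }
        insn.target = dest;
    }
}
```

After the pass, a jump that used to bounce through several intermediate jumps executes a single dispatch at run time.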

Page 42:

VM

Instruction unification

Shorter instruction sequences are faster

PUSH + APPLY => PUSH_APPLY

Direct threaded code

switch/case => goto

GCC only
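
Direct threading replaces the central `switch` with a computed `goto`: each cell of the instruction stream holds the address of its handler, and every handler jumps straight to the next one. The sketch below relies on the GNU labels-as-values extension (`&&label`, `goto *ptr`), which GCC provides and Clang also accepts. The opcode set, the `Cell` union, and `run_threaded` are assumptions for illustration rather than Mosh's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum Op { CONST, PUSH, NUMBER_ADD, HALT };

// One cell of threaded code: a handler address or an inline operand.
union Cell {
    void* handler;
    long operand;
};

// Runs opcode-encoded code; the program must end with HALT.
long run_threaded(const std::vector<long>& code) {
    // GNU extension: &&label yields the address of a label.
    static void* const handlers[] = {&&op_const, &&op_push, &&op_add, &&op_halt};

    // Translate opcodes into handler addresses once, before execution.
    assert(!code.empty());
    std::vector<Cell> prog(code.size());
    for (std::size_t i = 0; i < code.size(); ++i) {
        assert(code[i] >= CONST && code[i] <= HALT);
        prog[i].handler = handlers[code[i]];
        if (code[i] == CONST) {  // copy the operand that follows CONST
            assert(i + 1 < code.size());
            ++i;
            prog[i].operand = code[i];
        }
    }

    long acc = 0;
    std::vector<long> stack;
    const Cell* pc = prog.data();

// Dispatch: jump directly to the handler stored in the next cell.
#define NEXT() goto *(pc++)->handler
    NEXT();

op_const:
    acc = (pc++)->operand;
    NEXT();
op_push:
    stack.push_back(acc);
    NEXT();
op_add:
    assert(!stack.empty());
    acc += stack.back();
    stack.pop_back();
    NEXT();
op_halt:
    return acc;
#undef NEXT
}
```

Each handler ends with its own indirect jump, so there is no shared dispatch point and no range check per instruction, which is the usual argument for this layout over `switch`/`case`.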

Page 43:

Compare on instruction level

Gauche

(disasm ...)

Ypsilon

(debug-compile ...)

Page 44:

More improvement in C++ Mosh

Problems of toy | Solutions

Slow GC | Boehm GC

Scheme recursion uses native stack | Stack VM uses virtual stack

Slow environment lookup | Fast lookup using display closure

Incomplete tail call optimization | Tail call optimization

Slow arithmetic | Tag bit object system

Few optimizations | Many, many optimizations

Page 45:

Finally

Page 46:

Mosh becomes practical!

Practical speed

Page 47:

Conclusions

Page 48:

Toy to Practical is not easy.

Wear a beard!

Please try Mosh

http://code.google.com/p/mosh-scheme/

trunk is better