Design and Implementation of the Joeq Virtual Machine
Sun Microsystems Labs, Mountain View, CA
John Whaley, Stanford University
August 26, 2003
August 26, 2003 Design and Implementation of the Joeq Virtual Machine
About me
• Worked on Java VMs since JDK 1.0
  – 1996: Extended AWT to support pen input
  – 1997: Clean-room Java VM written in C++
  – 1998: Jalapeno: designed opt compiler, …
  – 1999: MIT Flex: dataflow framework, etc.
  – 2000: IBM Tokyo JIT: x86 performance
  – 2001: joeq virtual machine
Key Features
• Implemented in 100% Java
  – Includes native methods to manipulate addresses, memory, registers directly.
• Native vs. hosted execution
  – Native: run directly on hardware
  – Hosted: run on top of another VM
• Bootstrap to native via reflection
• Supports both GC and explicit deallocation
Key Features
• Compiler and program analysis framework
• Multiple languages: Java, C, C++, …
  – Single intermediate representation
• Static, quasi-static, and dynamic compilation
  – Single unified compiler infrastructure
• Online and offline profiling system
• M:N thread scheduler
Motivation/Purpose
• Started Ph.D. studies, needed a research infrastructure
• Purpose:
  – Try out new ideas
  – Do research
  – Publish papers
• Not out to:
  – Compete with other VMs
  – Make a shippable product
  – Change the world
Other Options
• SUIF
  – Written in C++
  – Limited support for Java
  – No dynamic compilation or runtime system
  – EDG frontend: not 100% gcc compatible
• Jalapeno
  – Written in Java
  – Very familiar with the system
  – Supports Java only
  – Not available outside of IBM
Other Options
• MIT Flex compiler
  – Written in Java
  – Familiar with system
  – Open-source GPL
  – Statically-compiled Java only
• Kaffe, etc.
  – Written in C
  – Poor design, poor performance
Why Another VM?
• General problem with established projects:
  – Established users and code base made it difficult to make major changes.
  – Wanted to fix the design "mistakes" of Jalapeno and MIT Flex compiler
  – More productive in Java than in C++
Design Goals
• Ease of trying out new research ideas
  – Implemented in Java
  – Modularity.
  – Lots of reusable code, use of software patterns.
• Support Java and C/C++
  – A single intermediate representation
  – Support GC and explicit deallocation
Design Goals
• Support static, quasi-static, dynamic compilation.
  – Unified compiler framework.
  – Compiler implemented in Java.
  – Allow "maybe" responses due to incomplete information.
  – General code patching mechanism.
  – Profile framework allows online/offline profiling.
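The "maybe" responses can be pictured as a three-valued answer type. The following is only a minimal sketch under stated assumptions: `Answer`, `TypeOracle`, and `isSubclass` are hypothetical names invented for illustration, not Joeq's API. The toy oracle answers a subclass query against a partially loaded class hierarchy and returns MAYBE when it reaches a class whose superclass is not yet known.

```java
import java.util.Map;

/** Hypothetical three-valued answer for compiler queries; not Joeq's actual API. */
enum Answer {
    YES, NO, MAYBE;

    /** Conjunction of two answers: NO dominates, then MAYBE. */
    Answer and(Answer other) {
        if (this == NO || other == NO) return NO;
        if (this == MAYBE || other == MAYBE) return MAYBE;
        return YES;
    }
}

/** Toy class hierarchy in which some classes may not have been loaded yet. */
final class TypeOracle {
    private static final String ROOT = "java.lang.Object";

    /** Maps a loaded class name to its superclass name (assumed acyclic). */
    private final Map<String, String> superclassOf;

    TypeOracle(Map<String, String> superclassOf) {
        this.superclassOf = superclassOf;
    }

    /** Is sub equal to, or a subclass of, sup? MAYBE if the chain hits an unloaded class. */
    Answer isSubclass(String sub, String sup) {
        String cur = sub;
        while (true) {
            if (cur.equals(sup)) return Answer.YES;
            if (cur.equals(ROOT)) return Answer.NO;          // reached the top without a match
            String parent = superclassOf.get(cur);
            if (parent == null) return Answer.MAYBE;         // class not loaded: cannot decide yet
            cur = parent;
        }
    }
}
```

A static compiler would have to treat MAYBE conservatively, while a dynamic compiler could speculate on it and rely on code patching if the guess later turns out wrong.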
Design Goals
• Get something up and running quickly.
  – Make compiler, runtime easy to debug
  – Hijack class libraries from running VM
  – LGPL: can borrow code from other open-source projects
  – Goal: Self-bootstrapping after one month
• Make it available for others to use.
  – Documentation, etc.
Not Design Goals
• Performance leader
  – An endless pit, takes a lot of effort
  – Performance just needs to be “reasonable”
  – Should be designed for good performance if someone wanted to put in the effort
• 100% conformance to specification
  – If programs work, that’s good enough.
  – No access to good test suites, anyway.
System Overview
[Figure: block diagram of the Joeq system. Major blocks: FRONT-END, COMPILER, BACK-END, DYNAMIC, INTERPRETER, MEMORY MANAGER, RUN-TIME SUPPORT. Labeled components: Java class file loader, bytecode decoder, bytecode to Quad, SUIF file loader, SUIF to Quad, ELF binary loader, disassemble to Quad, bytecode IR, Quad IR, optimizations and analyses, bytecode back-end, Quad back-end, controller, profiler, bytecode/Quad interpreters, garbage collector, memory heaps, thread scheduler, synchronization, stack walker, introspection, verification, type checking, class/member metadata, system interface, external libraries. Labeled inputs and outputs: Java class file, SUIF file, ELF object file, compiled code plus metadata, profile data file, object file data section, ELF file code section, COFF file code section, executable code in memory.]
Consequences of 100% Java
• Implementation purity
  – Self-applicable
  – VM code is great for program analysis, makes a great test suite
• Portability
  – >95% of the code is system-independent
  – Hosted execution
• Easier software engineering
  – Exceptions, GC, software patterns, existing tools
Consequences of 100% Java
• Java is not a panacea for portability
  – Hosted execution works OK on most VMs
  – Native bootstrapping is horribly VM-dependent
    • Internal class library changes cause Joeq to break
  – Supporting multiple JDK versions is difficult
Bootstrapping technique
• Use reflection and code analysis to determine root set of methods and objects
• Dump the objects and code into an object file (COFF or ELF format)
• Use a standard linker to generate an executable
• Easy support for static and quasi-static compilation, cross-language calls, dynamic linking, etc.
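The first step, finding the objects that belong in the image, can be sketched as a reflective walk of the object graph. This is a simplified illustration, not Joeq's bootstrapper: `ImageRoots`, `reachable`, and `Node` are hypothetical names, and the sketch covers only object discovery (no method discovery, no object file output). Fields that the host VM refuses to expose are skipped.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

/** Tiny linked structure used to exercise the traversal (hypothetical). */
final class Node {
    final String name;
    Node next;
    Node(String name) { this.name = name; }
}

final class ImageRoots {
    /** Returns every object reachable from the roots via instance fields and array elements. */
    static Set<Object> reachable(Object... roots) {
        // Identity-based set: two equal-but-distinct objects must both end up in the image.
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> work = new ArrayDeque<>();
        for (Object r : roots) {
            if (r != null && seen.add(r)) work.push(r);
        }
        while (!work.isEmpty()) {
            Object o = work.pop();
            Class<?> c = o.getClass();
            if (c.isArray()) {
                if (c.getComponentType().isPrimitive()) continue;   // no references inside
                for (Object e : (Object[]) o) {
                    if (e != null && seen.add(e)) work.push(e);
                }
                continue;
            }
            // Walk the class and all its superclasses to see private fields too.
            for (; c != null; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                    try {
                        f.setAccessible(true);
                        Object v = f.get(o);
                        if (v != null && seen.add(v)) work.push(v);
                    } catch (RuntimeException | IllegalAccessException e) {
                        // The host VM does not allow reflective access to this field; skip it.
                    }
                }
            }
        }
        return seen;
    }
}
```

Calling `ImageRoots.reachable(head)` on a cyclic list of `Node` objects terminates because each object is visited once by identity. How much of the library internals is included depends on what the host JDK allows reflection to see.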
Bootstrapping trickiness
• Custom class loaders
  – Have to hijack class loader and wrap it
• Files, etc. must be reinitialized
  – Some state stored in native code
• Objects created during image write
  – Finalizer threads, reflection caches, character encodings, …
• Reflection doesn’t work on all objects
  – Throwable backtrace, ThreadLocal, etc.
Consequences of bootstrapping technique
• Standard file formats very useful
  – Use existing tools and debuggers
• Big startup time improvement on applications (30x)
  – Skips all of the initialization code, JIT startup costs
• Large object files, number of relocations cause problems with some tools.
Consequences of bootstrapping technique
• Automatic discovery of necessary code: time-consuming, too conservative.
• Hardwired class list: smaller and faster, but breaks often.
• Problem: Instantiating an object means class is initialized, which brings in class initializer and many more objects
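The cascade in the last bullet follows from standard Java class initialization rules and can be reproduced in a few lines. The class names below (`InitLog`, `Config`, `Service`, `InitDemo`) are made up for illustration. Instantiating `Service` runs its class initializer, which touches `Config`, which in turn runs `Config`'s initializer and creates another object.

```java
import java.util.ArrayList;
import java.util.List;

/** Records the order in which class initializers run. */
final class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

class Config {
    // Created by Config's class initializer: one more object dragged into the image.
    static final Config DEFAULT = new Config("default");
    static { InitLog.EVENTS.add("Config.<clinit>"); }

    final String name;
    Config(String name) { this.name = name; }
}

class Service {
    // Reading Config.DEFAULT forces Config to be initialized first.
    static final Config CONFIG = Config.DEFAULT;
    static { InitLog.EVENTS.add("Service.<clinit>"); }
}

final class InitDemo {
    /** Instantiates one Service and returns the initializers that ran as a result. */
    static List<String> run() {
        new Service();
        return InitLog.EVENTS;
    }
}
```

A single `new Service()` therefore yields two initialized classes and one extra heap object. In a real class library the same chain can reach far more classes, which is why automatic discovery tends to be conservative.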
Consequences of bootstrapping technique
• Bootstrapping process is a major pain
  – Time-consuming: reflection is inefficient
  – Difficult to debug
  – Process breaks with different JDK versions, environment variables, command line options, locales, etc.
Class library implementation
• GNU Classpath: too incompatible, too buggy
• Hijack Sun class library by class merging
  – Make a “mirror” class with the same name.
  – Special class loader merges the classes.
• Easy implementation of native methods.
  – Native code is just normal Java code.
• Perfect compatibility, easy updates
Consequences of mirror classes
• Types don’t match, so javac complains
  – Cast to java.lang.Object, then back down.
• Doesn’t work on different class libraries.
• Many changes between sub-versions.
  – Use a hierarchy of mirror classes
• Incompatible changes lead to many hacks.
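The cast through java.lang.Object can be illustrated with a stand-in mirror class. `MirrorString` and `MirrorCastDemo` are hypothetical names, and this sketch runs on an ordinary JVM where no class merging takes place. It therefore shows only the compile-time half of the trick: javac accepts the detour through Object, and the cast then fails at run time because the two classes are still distinct.

```java
/** Hypothetical stand-in for a mirror of java.lang.String; not a real Joeq class. */
final class MirrorString {
    int length() {
        // String self = (String) this;        // javac rejects this: inconvertible types
        String self = (String) (Object) this;   // javac accepts the detour through Object
        return self.length();
    }
}

final class MirrorCastDemo {
    /** True when the cast fails at run time, as it does on a VM that does not merge classes. */
    static boolean castFailsWithoutMerging() {
        try {
            new MirrorString().length();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

Under the merging class loader described on the previous slide, the mirror and the real class would presumably be one type, so the same cast would succeed. That part is an assumption this sketch cannot demonstrate.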
Multiple language support
• Joeq has support for:
  – Java class files
  – SUIF files
    • C, C++, Fortran, …
  – x86 object code
• All are translated into a single intermediate representation, the Quad.
Quad intermediate representation
• Analyses and optimizations are instantly applicable to all languages
• Cross-language inlining and optimization
  – Elimination of JNI overhead
• Support for raw address manipulation in Java falls out naturally
• Type-accurate garbage collection for well-behaved C/C++ programs
Quad intermediate representation
• Generic interfaces for operators
  – Lots of shared code
• Types are optional
  – Type analysis will construct type information
• Doesn’t support all esoteric C/C++ features
  – Computed labels, C++ nastiness, etc.
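A quadruple-style instruction with a generic operator interface and an optional type might look roughly like the sketch below. All names here (`Operator`, `Binary`, `Unary`, `Quad`, `QuadPasses`) are invented for illustration and do not reflect Joeq's actual class layout. The point is that a pass written against the generic interface works regardless of which front-end produced the instructions.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Generic operator interface (hypothetical); passes treat all operators uniformly. */
interface Operator {
    String name();
}

enum Binary implements Operator { ADD, SUB, MUL }   // Enum.name() satisfies the interface
enum Unary implements Operator { MOVE, NEG }

/** One instruction: dest = op(srcs), with a type that may not be known yet. */
final class Quad {
    final Operator op;
    final String dest;
    final List<String> srcs;
    final Optional<String> type;   // empty until a type analysis fills it in

    Quad(Operator op, String dest, Optional<String> type, String... srcs) {
        this.op = op;
        this.dest = dest;
        this.type = type;
        this.srcs = Arrays.asList(srcs);
    }

    @Override public String toString() {
        return dest + " = " + op.name() + " " + String.join(", ", srcs)
                + type.map(t -> " : " + t).orElse("");
    }
}

final class QuadPasses {
    /** Registers read before being written in a straight-line instruction sequence. */
    static Set<String> liveIn(List<Quad> code) {
        Set<String> defined = new LinkedHashSet<>();
        Set<String> liveIn = new LinkedHashSet<>();
        for (Quad q : code) {
            for (String s : q.srcs) {
                if (!defined.contains(s)) liveIn.add(s);
            }
            defined.add(q.dest);
        }
        return liveIn;
    }
}
```

`liveIn` never inspects the concrete operator, so the same pass applies whether the quads came from Java bytecode or from a C front-end. It also ignores `type`, which fits the idea that types are optional.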
Memory management
• Memory management is abstracted into different heaps
  – Each heap has its own allocation/deallocation policy
• Interface for querying garbage collection policies
  – Type-accurate, semi-accurate, conservative
  – GC-safe points or at any instruction
  – Thread-local allocation pools
• Working out an interface with JMTk
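One way to picture heaps with their own allocation/deallocation policy behind a common, queryable interface is sketched below. The interface and class names (`Heap`, `Accuracy`, `BumpHeap`, `FreeListHeap`) are assumptions made for this example and are not taken from Joeq or JMTk. Addresses are plain integer offsets so the sketch stays runnable on any JVM.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** How precisely a collector for this heap can identify references (hypothetical). */
enum Accuracy { TYPE_ACCURATE, SEMI_ACCURATE, CONSERVATIVE }

/** Common interface; each heap decides how memory is handed out and reclaimed. */
interface Heap {
    int allocate(int size);               // returns an offset into the heap
    boolean supportsExplicitFree();       // policy query
    Accuracy accuracy();                  // policy query
    void free(int address, int size);
}

/** Bump-pointer heap: memory is reclaimed only by a collector, never by free(). */
final class BumpHeap implements Heap {
    private int top;
    public int allocate(int size) { int a = top; top += size; return a; }
    public boolean supportsExplicitFree() { return false; }
    public Accuracy accuracy() { return Accuracy.TYPE_ACCURATE; }
    public void free(int address, int size) {
        throw new UnsupportedOperationException("reclaimed by the collector");
    }
}

/** Explicitly managed heap: freed blocks are reused for later requests of the same size. */
final class FreeListHeap implements Heap {
    private final Map<Integer, Deque<Integer>> freeBySize = new HashMap<>();
    private int top;

    public int allocate(int size) {
        Deque<Integer> blocks = freeBySize.get(size);
        if (blocks != null && !blocks.isEmpty()) return blocks.pop();   // reuse a freed block
        int a = top;
        top += size;
        return a;
    }
    public boolean supportsExplicitFree() { return true; }
    public Accuracy accuracy() { return Accuracy.CONSERVATIVE; }
    public void free(int address, int size) {
        freeBySize.computeIfAbsent(size, k -> new ArrayDeque<>()).push(address);
    }
}
```

A compiler or runtime component can ask a heap about its policies before deciding, for example, whether it needs to emit type maps or may call `free` directly.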
Consequences of memory management framework
• Debugging
  – Run under hosted execution mode
  – Image snapshots
  – 100% type-accurate is hard
• Coordinating threads for GC
  – Making a general interface is tricky
Thread scheduler
• M:N thread scheduler
  – Lightweight Java threads
  – Thread switch at any instruction
  – Uses local thread queues and work-stealing
• Timer ticks by using setitimer interrupts (Linux) or a separate thread (Windows)
• Thread-local information stored off of fs register
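The combination of local thread queues and work-stealing can be sketched as follows. This is a generic illustration of the technique under simplifying assumptions, not Joeq's scheduler: `Worker` stands for one native thread, the strings stand for runnable Java threads, and the names `Worker`, `Scheduler`, and `next` are hypothetical. Each worker takes from the back of its own queue. When that queue is empty it steals the oldest entry from another worker.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** One native thread's local queue of runnable Java threads (represented by name). */
final class Worker {
    private final Deque<String> queue = new ArrayDeque<>();

    synchronized void push(String javaThread) { queue.addLast(javaThread); }

    /** The owner takes the most recently queued thread. */
    synchronized String takeLocal() { return queue.pollLast(); }

    /** A thief takes the oldest queued thread from the other end. */
    synchronized String steal() { return queue.pollFirst(); }
}

final class Scheduler {
    private final List<Worker> workers = new ArrayList<>();

    Scheduler(int nativeThreads) {
        for (int i = 0; i < nativeThreads; i++) workers.add(new Worker());
    }

    void enqueue(int workerId, String javaThread) {
        workers.get(workerId).push(javaThread);
    }

    /** Picks the next Java thread for the given worker, stealing if its queue is empty. */
    String next(int workerId) {
        String t = workers.get(workerId).takeLocal();
        if (t != null) return t;
        for (int i = 1; i < workers.size(); i++) {
            Worker victim = workers.get((workerId + i) % workers.size());
            t = victim.steal();
            if (t != null) return t;
        }
        return null;   // nothing runnable anywhere
    }
}
```

Owner and thief work on opposite ends of the queue, which keeps them out of each other's way most of the time. A production scheduler would typically use a lock-free deque instead of `synchronized` methods.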
Consequences of Java thread scheduler
• Accessing threads in a machine-independent way is not easy
• Linux pthread implementation is broken
  – Lots of bugs, race conditions, inefficiencies
  – Changing stack pointer is not always supported
  – Use of fs register is not always supported
• Windows support is much nicer (?)
Running an Open-Source Project
• Lots of interest, but very few people actually follow through
• Not many people have the skills
  – Of those, not many have the time
• Of those, even fewer have the perseverance
  – The result is that there have only been minor contributions by others
• Documentation, testing, file releases, updating the web site all take time.
Running an Open-Source Project
• What’s needed:
  – Nightly build scripts and regression testing
  – Implementation hackers
  – People interested in GC
Conclusion: What I’ve learned
• Software patterns are useful
  – Joeq: 100K lines of code
• Modular design is key
  – Trying out new type checker: ~2 hours
• For maximum efficiency, design the system to be easily debuggable.
• Preemptively eliminate obvious problems.
• It’s more fun to write code when you also write the compiler.