Chap. 10, Intermediate Representations J. H. Wang Dec. 14, 2015

Chap. 10, Intermediate Representations

J. H. Wang
Dec. 14, 2015

Outline

• Overview
• Java Virtual Machine
• Static Single Assignment Form

Overview

• Ch.7: AST
• Ch.8-9: Semantic analysis
• Ch.10: Intermediate representation
• Ch.11: Code synthesis for virtual machines
• Ch.12: Runtime support
• Ch.13: Target code generation

Overview

• Semantic gap between high-level source languages and target machine language
• Examples
  – Early C++ compilers
    • cpp: preprocessor
    • cfront: translate C++ into C
    • C compiler

Another Example

• LaTeX
  – TeX: designed by Donald Knuth
  – dvi: device-independent intermediate representation
  – ps: PostScript
  – pixels
• Portability enhanced

Challenges

• Challenges
  – An intermediate language (IL) must be precisely defined
  – Translators and processors must be crafted for an IL
  – Connections must be made between levels so that feedback from intermediate steps can be related to the source program
• Other concerns
  – Efficiency

The Middle-End

• Front-end: parser
• Back-end: code generator
• Middle-end: components between front- and back-ends
• Compiler suites that host multiple source languages and target multiple instruction sets obtain great leverage from a middle-end
  – Ex: s source languages, t target languages
    • s*t translators without a shared IL vs. s+t components with one
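The s*t vs. s+t comparison can be made concrete with a few lines of arithmetic. This is only an illustrative sketch; the class and method names are hypothetical and not from the slides.

```java
// Hypothetical helper: counts the translators needed with and without a shared IL.
final class TranslatorCount {
    // One dedicated translator for every (source, target) pair.
    static int withoutSharedIL(int sources, int targets) {
        return sources * targets;
    }

    // One front-end per source language plus one back-end per target.
    static int withSharedIL(int sources, int targets) {
        return sources + targets;
    }
}
```

For example, with 3 source languages and 4 targets, the direct approach needs 12 translators while the shared-IL approach needs 7 components.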

Additional Advantages

• An IL allows various system components to interoperate by facilitating access to information about the program
  – E.g. variable names and types, and source line numbers could be useful in the debugger
• An IL simplifies development and testing of system components
• The middle-end contains phases that would otherwise be duplicated among the front- and back-ends
• It allows components and tools to interface with other products

• It can simplify the pioneering and prototyping of new ideas
• An IL and its interpreter can serve as a reference definition of a language
• Interpreters written for a well-defined IL are helpful in testing and porting compilers
• An IL enables the crafting of a retargetable code generator, which greatly enhances a compiler’s portability
  – Pascal: P-code
  – Java: JVM
  – Ada: DIANA (Descriptive Intermediate Attributed Notation for Ada)

Java Virtual Machine

• Java class files: binary encodings of the data and instructions in a Java program
• Design principles (borrowed from JVM reference)
  – Compactness
    • Instructions in nearly zero-address form
      – A runtime stack is used
      – Operands are implicit
        » E.g.: the iadd instruction pops two items and pushes the sum onto TOS (top of stack)
      – A loss of runtime performance
    • Multiple instructions to accomplish the same effect
      – To push 0 on TOS
        » iconst_0: 1 byte
        » ldc_w 0: 3 bytes

  – Safety
    • An instruction can reference storage only if it’s of the type allowed by the instruction, and only if the storage is located in an area appropriate for access
    • From security’s point of view, purely zero-address form is problematic
      – The registers that could be accessed by a load instruction may not be known until runtime
      – JVM: not zero-address
        » E.g. iload 5 (the register number is encoded in the instruction itself)
    • When a class file is loaded, many other checks are performed by the bytecode verifier
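To make the "nearly zero-address" idea concrete, here is a minimal sketch of a stack interpreter. It assumes an invented two-instruction encoding (the ICONST and IADD numbers below are hypothetical, not real JVM opcodes): the add instruction names no operands, while the constant push carries its operand in the instruction stream.

```java
// Minimal stack-machine sketch; opcode numbering is invented for illustration.
final class StackMachineSketch {
    static final int ICONST = 0; // followed by one operand: the int constant to push
    static final int IADD = 1;   // no operands: pops two ints, pushes their sum

    static int run(int[] code) {
        int[] stack = new int[16];
        int sp = 0; // index of the next free stack slot
        for (int pc = 0; pc < code.length; pc++) {
            switch (code[pc]) {
                case ICONST:
                    stack[sp++] = code[++pc]; // operand is explicit in the code
                    break;
                case IADD: {
                    int right = stack[--sp];  // operands are implicit: top two cells
                    int left = stack[--sp];
                    stack[sp++] = left + right;
                    break;
                }
                default:
                    throw new IllegalArgumentException("unknown opcode " + code[pc]);
            }
        }
        return stack[sp - 1]; // result is whatever is left on TOS
    }
}
```

Running the program ICONST 2, ICONST 3, IADD leaves 5 on top of the stack.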

Contents of a Class File

• A class file is organized into sections called attributes that contain various information about the compiled class
  – Types: primitive and reference types (Fig. 10.4)
    • Primitive type: a single character
    • Reference type t: Lt;
      – E.g.: String type in java.lang package: Ljava/lang/String;
    • Array of type a: [a

  – Constant pools
    • Tagged union
      – int, float, java.lang.String
    • Referenced by its ordinal position, not byte-offset
      – 1 byte for some instructions, e.g. ldc
      – 2 bytes for some instructions, e.g. ldc_w
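The type encoding described above (single characters for primitives, Lt; for reference types, [a for arrays) can be sketched as a small recursive function. The class name DescriptorSketch is hypothetical; the function is a hand-written illustration, not the JVM's own implementation.

```java
// Hand-written sketch of class-file type descriptors.
final class DescriptorSketch {
    static String descriptor(Class<?> type) {
        if (type.isArray()) {
            // Array of type a is written [a, applied once per dimension.
            return "[" + descriptor(type.getComponentType());
        }
        if (type == boolean.class) return "Z";
        if (type == byte.class) return "B";
        if (type == char.class) return "C";
        if (type == short.class) return "S";
        if (type == int.class) return "I";
        if (type == long.class) return "J";
        if (type == float.class) return "F";
        if (type == double.class) return "D";
        if (type == void.class) return "V";
        // Reference type t is written Lt; with '/' separating package parts.
        return "L" + type.getName().replace('.', '/') + ";";
    }
}
```

With this sketch, String.class maps to Ljava/lang/String; and a two-dimensional int array maps to [[I.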

JVM Instructions

• Arithmetic
• Register traffic
• Registers and types
• Static fields
• Instance fields
• Branching
• Other method calls
• Stack operations

Arithmetic

• Popping operands from the runtime stack, computing result, and pushing the result on TOS
  – E.g. iadd
    • int: 32-bit, 2’s complement
  – For other primitive types
    • fadd (float)
    • ladd (long)
    • dadd (double)
  – Subtraction, multiplication, division, …
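A short Java sketch shows which add instruction each primitive type selects and what 32-bit two's-complement arithmetic implies for int. The class name is hypothetical, and the opcodes in the comments are what javac typically emits; `javap -c` on the compiled class would confirm them.

```java
// Each overload's addition compiles to a different typed add instruction.
final class ArithmeticSketch {
    static int add(int a, int b) { return a + b; }          // iadd
    static long add(long a, long b) { return a + b; }       // ladd
    static float add(float a, float b) { return a + b; }    // fadd
    static double add(double a, double b) { return a + b; } // dadd
}
```

Because int is 32-bit two's complement, adding 1 to Integer.MAX_VALUE wraps around to Integer.MIN_VALUE rather than raising an error.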

Register Traffic

• JVM has an unlimited number of virtual registers
  – Usually allocated in a method’s stack frame
• JVM registers typically host a method’s local variables
  – Registers starting from 0 are set aside for a method’s parameters
• JVM registers are untyped
  – iload 2: push the int in register 2 (2 bytes)
    • iload_2: abbreviated form (1 byte)
  – istore 10: pop TOS into register 10
  – fload n: for float values
  – aload and astore: for reference types (32 bits for object references)
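The register conventions above can be seen in a small static method. The names here are hypothetical, and the bytecode in the comments is what javac typically produces for this source, not something taken from the slides.

```java
// Static method: parameters occupy registers 0 and 1, the local c gets register 2.
final class RegisterTrafficSketch {
    static int sumPlusOne(int a, int b) {
        int c = a + b;  // iload_0; iload_1; iadd; istore_2
        return c + 1;   // iload_2; iconst_1; iadd; ireturn
    }
}
```

In an instance method, register 0 would hold this and the parameters would start at register 1.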

Registers and Types

• Static analysis (or bytecode verification)
  – To ensure that values flow in and out of registers without compromising Java’s type system
• JVM appears to be stricter than the Java language
  – E.g. Type conversion
    • i2f: from 2’s complement to IEEE floating point format
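The stricter typing shows up in a conversion that Java source allows implicitly. In this sketch (class name hypothetical), the source has no cast, yet the compiled method must contain an explicit i2f; the bytecode comment reflects typical javac output.

```java
// Java widens int to float implicitly; the bytecode makes the conversion explicit.
final class ConversionSketch {
    static float toFloat(int i) {
        return i; // iload_0; i2f; freturn
    }
}
```

The conversion changes representation, not just the label on the value: 16777217 (2^24 + 1) has no exact float encoding and becomes 16777216.0f.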

Static Fields

• A class’s static fields are shared by all instances of the class
• getstatic name type: push
  – E.g.: getstatic java/lang/System/out Ljava/io/PrintStream;
  – Only 3 bytes in representation
    • One: getstatic opcode
    • Two: 16-bit integer specifying a constant-pool entry
• putstatic: pop
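A minimal sketch of static-field traffic, using a hypothetical CounterSketch class. The comments use the same notation as the getstatic example above and show what javac typically emits.

```java
// One copy of count exists for the class, regardless of how many instances are made.
final class CounterSketch {
    static int count;

    static int next() {
        // getstatic CounterSketch/count I; iconst_1; iadd; putstatic CounterSketch/count I
        count = count + 1;
        // getstatic CounterSketch/count I; ireturn
        return count;
    }
}
```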

Instance Fields

• A class can declare instance fields, for which instance-specific storage is allocated
• getfield name type: push
  – E.g.: getfield Point/x I
• putfield: pop
  – E.g.: putfield Point/x I
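The Point/x example can be traced back to a source class along these lines. The class body is an assumption made for illustration; the bytecode comments show typical javac output. Note that both instructions also consume an object reference from the stack, which is why aload_0 appears first.

```java
// Assumed source for the Point/x field used in the getfield/putfield examples.
final class Point {
    int x;

    Point(int x) {
        // (after the implicit call to Object's constructor)
        // aload_0; iload_1; putfield Point/x I
        this.x = x;
    }

    int getX() {
        // aload_0; getfield Point/x I; ireturn
        return x;
    }
}
```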

Branching

• Instructions to alter the control flow of the executing program
  – Unconditional
    • goto: 3 bytes
    • goto_w: 5 bytes
  – Conditional branches
    • Comparison against 0: ifeq, ifne, iflt, ifle, ifgt, ifge
    • Comparison of two int values on the stack: if_icmpeq, if_icmpne, if_icmplt, if_icmple, if_icmpgt, if_icmpge
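Both families of conditional branch appear in very small methods. In this sketch (names hypothetical), the comments show the bytecode javac typically emits; note that the compiler branches on the negated condition so that the "then" code can fall through.

```java
// Two-operand and compare-with-zero branches, as typically compiled by javac.
final class BranchSketch {
    static int max(int a, int b) {
        if (a > b) {   // iload_0; iload_1; if_icmple ELSE
            return a;  // iload_0; ireturn
        }
        return b;      // ELSE: iload_1; ireturn
    }

    static int abs(int a) {
        if (a < 0) {   // iload_0; ifge NONNEG
            return -a; // iload_0; ineg; ireturn
        }
        return a;      // NONNEG: iload_0; ireturn
    }
}
```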

Static Method Calls

• Static methods are common to all instances of some type t
  – E.g.: Math.pow(double a, double b)
• invokestatic
  – invokestatic java/lang/Math/pow(DD)D
  – Parameters pushed on the stack in left-to-right order
  – 3 bytes
• Method signature: (DD)D (two double parameters, double result)
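The Math.pow call above can be sketched as follows. The class name is hypothetical; the bytecode comment shows typical javac output, and the second method uses the standard java.lang.invoke.MethodType API to print the same (DD)D signature from Java.

```java
// A static call and the method descriptor it is compiled against.
final class StaticCallSketch {
    static double kilo() {
        // ldc2_w 2.0; ldc2_w 10.0; invokestatic java/lang/Math/pow(DD)D; dreturn
        return Math.pow(2.0, 10.0);
    }

    static String powDescriptor() {
        // Return type first, then the parameter types in left-to-right order.
        return java.lang.invoke.MethodType
                .methodType(double.class, double.class, double.class)
                .toMethodDescriptorString();
    }
}
```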

Instance-Specific Method Calls

• invokevirtual
  – invokevirtual java/io/PrintStream/print(Z)V
  – An instance must be pushed on the stack before the method’s parameters: this
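A sketch of the print(Z)V call, with a hypothetical wrapper method so the push order is visible. The receiver is passed in as a parameter here rather than fetched with getstatic; the bytecode comment shows typical javac output.

```java
// The receiver is pushed first, then the boolean argument, then the call.
final class VirtualCallSketch {
    static void printFlag(java.io.PrintStream out, boolean flag) {
        // aload_0 (instance); iload_1 (parameter);
        // invokevirtual java/io/PrintStream/print(Z)V; return
        out.print(flag);
    }
}
```

Calling printFlag(System.out, true) writes the text true to standard output.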

Other Method Calls

• invokespecial
• A constructor call is special
  – An uninitialized reference to an object instance is pushed on TOS
    • <init> method
  – No return value
• Unlike invokevirtual, which dispatches on the actual (runtime) type of the instance, methods called by invokespecial are selected statically from the class named in the instruction
• invokespecial can also be used to invoke a private method, for efficiency reasons only
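The constructor sequence can be seen in the bytecode for a plain new expression. The wrapper class is hypothetical; the comments show what javac typically emits, including the dup that keeps a reference on the stack because <init> returns nothing.

```java
// Object creation: allocate, duplicate the reference, then run the constructor.
final class ConstructorCallSketch {
    static StringBuilder fresh() {
        // new java/lang/StringBuilder    (uninitialized reference on TOS)
        // dup                             (second copy survives the constructor call)
        // invokespecial java/lang/StringBuilder/<init>()V
        // areturn
        return new StringBuilder();
    }
}
```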

Stack Operations

• Instructions specifically for manipulating items near the TOS
  – To facilitate shorter instruction sequences for common program fragments
  – dup2: duplicate the top two cells to accommodate long and double types
  – dup: nicely accommodates multiple assignments
    • x=y=z=value
  – pop, swap
  – dup_x1: in embedded assignment
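The x=y=z=value case shows why dup shortens code: the value is loaded once and copied on the stack instead of being reloaded for each store. The class name is hypothetical and the bytecode comment reflects typical javac output for local variables.

```java
// Chained assignment: value is in register 0; x, y, z get registers 1, 2, 3.
final class DupSketch {
    static int chain(int value) {
        int x, y, z;
        // iload_0; dup; istore_3; dup; istore_2; istore_1
        x = y = z = value;
        return x + y + z;
    }
}
```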

Static Single Assignment Form

• (omitted)

Thanks for Your Attention!