Download - Spring 2014Jim Hogg - UW - CSE - P501U-1 CSE P501 – Compiler Construction Overview of SSA IR Constructing SSA graphs SSA-based optimizations Converting

Spring 2014 Jim Hogg - UW - CSE - P501 U-1

CSE P501 – Compiler Construction

Overview of SSA IR

Constructing SSA graphs SSA-based optimizations Converting back from SSA form

Appel ch. 19 Cooper-Torczon sec. 9.3

Def-Use (DU) Chains

Common dataflow analysis problem:

Find all sites where a variable is used ( =x ) Find all sites where that variable was last defined (x= )

Traditional solution:

Def-Use (DU) chains – additional data structure on top of flowgraph

Link each def (assigns-to) of a variable to all uses

Use-Def (UD) chains Link each use of a variable to its def


Def-Use (DU) Chains


z>1

x=1z>2

x=2

z=x-3x=4

z=x+7

y=x+1

exit

entry

Two DU chains intersect

DU-Chain Drawbacks

Expensive

if a typical variable has D defs and U uses, total cost is O(D*U)

ie: O(n2) where n is number of statements in function Better if cost varied as O(n)

Unrelated uses of same variable are mixed together. This complicates analysis. Eg:

Variable v is used over and over; but uses are disjoint Register allocated might think it is live across all uses


SSA: Static Single Assignment

IR form where each variable has only one def in the function

Equivalent, in a source program, to never using same variable for different purposes

static single definition. But that def can be in a loop that is executed dynamically many times

In a loop, it really is the same variable each time, so makes sense

Separates values from where they are stored in memory


int x = 0;while (x < a.length) a[x] = i+7;

for(x = 42; x >= 0; --x) b[x] = a[x] * 3;

int x = 0;while (x < a.length) a[x] = i+7;

for(int x1 = 42; x1 >= 0; --x1) b[x1] = a[x1] * 3;

After: Manual, source SSA format

Before

SSA in Basic Blocks

a = x + yb = a – 1a = y + bb = x * 4a = a + b

a1 = x0 + y0

b1 = a1 – 1

a2 = y0 + b1

b2 = x0 * 4

a3 = a2 + b2


We’ve seen this before when we looked at Value Numbering

Original SSA

• Bump the SSA number each time a variable is def'd• x0 is the value assigned before the current function - eg: input

parameter

Merge Points

Main issue How to handle merge points in the

flowgraph?

Solution introduce a Φ-function a3 = Φ(a1, a2)

Meaning a3 is assigned either a1 or a2 depending on

which control path was used to reach the Φ-function


Example


b = M[x]a = 0

if b < 4

a = b

c = a + b

Original

b1 = M[x0]a1 = 0

if b1 < 4

a2 = b1

a3 = Φ(a1, a2)c1 = a3 + b1

SSA

How Does Φ know what to Pick?

It doesn’t

When we translate the program to executable form, we add code to copy either value to a common location on each incoming edge

For analysis, all we may need to know is the connection of uses to defs – no need to “execute” anything

Spring 2014 U-9

a2 = a1 =

a3 = Φ(a1, a2)

a2 =a = a2

a1 =a = a1

a3 = Φ(a1, a2) = a

Before After

Example With Loop


a = 0

b = a + 1c = c + ba = b * 2if a < N

return c

Originala1 = 0

a3 = Φ(a1, a2)b1 = Φ(b0, b2)c2 = Φ(c0, c1)b2 = a3 + 1c1 = c2 + b2

a2 = b2 * 2if a2 < N

return c1

SSA

Notes:• a0, b0, c0 are initial values of a, b, c• b1 is dead – can delete later• c is live on entry – either input

parameter or uninitialized

What does SSA 'buy' us?

Spring 2014 U-11

z>1

x=1z>2

x=2

z=x-3x=4

z=x+7

y=x+1

exit

entry

z1>1

x1=1z1>2

x2=2

x3=φ(x1,x2

)z2=x3-3x4=4

z3=x4+7

y=x1+1

exit

entry

• No need for DU or UD• Implicit in SSA• Compact representation

• SSA form is "recent" (80s)• Prevalent in real compilers for {}

languages

Converting To SSA Form

Basic idea

First, add Φ-functions

Then rename all defs and uses of variables by adding subscripts


Inserting Φ-Functions

Add Φ-functions for every variable in the function, at every join point?

This is called "maximal SSA"

But Wastes lots of memory Slows compile time Not needed in many cases


Path-convergence criterion

Insert a Φ-function for variable a at block B3 when:

There are blocks B1 and B2, both containing definitions of a, and B1 B2

There are paths from B1 to B3 and from B2 to B3

These paths have no common nodes other than B3

B3 is not in both paths prior to the end (eg: loop), altho' it may appear in one of them

Spring 2014 U-14

a1 =

a3 = Φ(a1, a2)

a2 = B2B1

B3

Details

The start node of the flowgraph is considered to define every variable (a0, b0, etc) even if to “undefined”

Each Φ-function itself defines a variable, so keep adding Φ-functions until things converge


Dominators and SSA

One property of SSA is that defs dominate uses. More specifically:

If an = Φ(… ai …) is in block B, then the definition of ai dominates the ith predecessor of B

If a is used in a non-Φ statement in block B, then the definition of a dominates block B

Spring 2014 U-16a3 = Φ(a1, a2) B

a2

a1 =

Dominance Frontier (1)

To get a practical algorithm for placing Φ-functions, we need to avoid looking at all combinations of nodes leading to a join-node

Instead, use the dominator tree in the flow graph


Dominance Frontier (2)

Definitions

x strictly dominates y if x dominates y and x y x dom y && x y => x sdom y

The dominance frontier of x is the set of all nodes z such that x dominates a predecessor of z, but x does not dominate z

Dominance Frontier is the border between dominated and undominated nodes; it passes thru the undominated set

Leave x in all directions; stop when you hit undominated nodes


x

z

y

x dom {x, y}x sdom {y}z DF(x)

Example


1

2

3

4

13

5

6 7

8

9

10

11

12

Node 5 dominates ...


1

2

3

4

13

5

6 7

8

9

10

11

12

5 dom {5,6,7,8}5 sdom {6,7,8}

Node 5 Dominance Frontier ... DF(5)


1

2

3

4

13

5

6 7

8

9

10

11

12

5 dom {5,6,7,8}5 sdom {6,7,8}

DF(5) excludes 5 itself (backedge 85) because 5 !sdmon 5

x

z

y

DF(x) = {z | y pred(z), x dom y, x !dom z, x !sdom z }

Dominance Frontier Criterion

If node x def's variable a, then every node in DF(x) needs a Φ-function for a

Since the Φ-function itself is a def, this needs to be iterated until it reaches a fixed-point

Theorem: this algorithm places exactly the same set of Φ-functions as the path criterion given previously


Placing Φ-Functions: Details

1. Compute the DF for each node in the flowgraph

2. Insert just enough Φ-functions to satisfy the criterion. Use a worklist algorithm to avoid revisiting nodes

3. Walk the dominator tree and rename the different definitions of variable a to be a1, a2, a3, …


Efficient Dominator Tree Computation

Goal: SSA makes optimizing compilers faster since we can find defs/uses without expensive bit-vector algorithms

So, need to be able to compute SSA form quickly

Computation of SSA from dominator trees is efficient, but…


Lengauer-Tarjan Algorithm

Iterative, set-based algorithm for finding dominator trees is slow in worst case

Lengauer-Tarjan is near linear time (in number of nodes)

Uses depth-first spanning tree from start node of control flow graph

See books for details


SSA Optimizations

Given the SSA form, what can we do with it?

First, what do we know? - what info is kept in the SSA graph?


SSA Data Structures

Instruction links to containing block, next and previous instructions,

variables defined, variables used. Instruction kinds: ordinary, Φ-function, fetch, store, branch

Variable link to def (instruction) and use sites

Block list of contained instructions, ordered list of predecessors,

successor(s)


Dead-Code Elimination (DCE)

A variable is live <=> list of uses is not empty

To remove dead code:while (there is some variable v with no uses) { if (instruction that def's v has no side effects) { delete instruction }}

Need to remove this instruction from list of uses for its operands: may kill those variables too


Many optimizations become easy to recognize in SSA form. Eg:

Simple Constant Prop.

In v = c, if c is a constant, then: Then update every use of variable v to use constant c instant

In v = Φ(c1, c2 … cn), if all ci’s have same constant value c, replace instruction with v = c

Incorporate copy prop, constant folding, and others in the same worklist algorithm


Simple Constant Propagation

W = list of all instruction in SSA function

while W is not empty {remove some instructions I from Wif I is v = Φ(c, c, …, c), replace I with v = cif I is v = c {

delete I from the function foreach instruction T that uses v { substitute c for v in T add T to W

} }}


Converting Back from SSA

Unfortunately, real machines do not include a Φ instruction

So after analysis, optimization, and transformation, need to convert back to a “Φ-less” form for execution


Translating Φ-functions

The meaning of x = Φ(x1, x2 … xn) is: set x = x1 if arriving on edge 1 set x = x2 if arriving on edge 2 etc

So, foreach i, insert x = xi at the end of predecessor block i

Rely on copy prop. and coalescing in register allocation to eliminate redundant moves


SSA Wrapup

More details in recent compiler books (but not the new Dragon book!)

Allows efficient implementation of many optimizations

Used in many new compiler (eg: LLVM) & retrofitted into many older ones (GCC)

Not a silver bullet – some optimizations still need non-SSA forms


Download - Spring 2014Jim Hogg - UW - CSE - P501U-1 CSE P501 – Compiler Construction Overview of SSA IR Constructing SSA graphs SSA-based optimizations Converting

Top Related