
Page 1: Fast Points-to Analysis for Languages with Structured Types Michael Jung and Sorin A. Huss Integrated Circuits and Systems Lab. Department of Computer

Fast Points-to Analysis for Languages with Structured Types

Michael Jung and Sorin A. Huss

Integrated Circuits and Systems Lab.

Department of Computer Science

Technische Universität Darmstadt, Germany

Page 2

Outline

A points-to analysis is applied to answer questions like: "Which variables might be accessed by the expression *a->b?"

• Motivation: why is this of benefit to know?
  • Code Motion
  • Partial Evaluation
• Design decisions for a points-to analysis
• Concepts of Steensgaard's points-to analyses
• Differences between proposed and original PTA
• Results

Page 3

Motivation – Code Motion

Source:

    ...
    *a = b * c;
    d = *e + f;

Straightforward translation:

    MULT  r1, b, c
    STORE r1, (a)
    LOAD  r2, (e)
    ADD   d, r2, f

• RAW hazard on r2
• LOAD is multi-cycle
• Pipeline stalled

Optimize CPU utilization by code motion (LOAD hoisted above MULT):

    LOAD  r2, (e)
    MULT  r1, b, c
    STORE r1, (a)
    ADD   d, r2, f

Counterexample, where a and e alias:

    a = &g; e = &g;
    *a = b * c;
    d = *e + f;

Here the hoisted LOAD reads g before the STORE writes it: RAW violation.

• If the sets of locations that a and e may point to are disjoint → the optimization is correct
• Points-to analysis computes a conservative approximation of these sets
• C99 restrict type qualifier:

    void f(int n, int * restrict p, int * restrict q) {
        while (n--) *p++ = *q++;
    }
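The RAW violation can be made concrete with a minimal C sketch. The two functions below model the statement order before and after code motion; their names (`original_order`, `hoisted_load`) are hypothetical and only serve this illustration.

```c
#include <assert.h>

/* Statement order as written in the source: store first, then load. */
static int original_order(int *a, int *e, int b, int c, int f) {
    assert(a && e);
    *a = b * c;        /* STORE through a */
    return *e + f;     /* LOAD through e happens after the store */
}

/* Order after code motion: the load through e is hoisted above the store. */
static int hoisted_load(int *a, int *e, int b, int c, int f) {
    assert(a && e);
    int r2 = *e;       /* LOAD moved up to hide its latency */
    *a = b * c;        /* STORE through a */
    return r2 + f;     /* uses the value read before the store */
}
```

With distinct targets both functions return the same value. When `a` and `e` point to the same variable, `original_order` sees the freshly stored product while `hoisted_load` still uses the old value, so the transformation is only safe if the points-to sets of `a` and `e` are disjoint.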

Page 4

Motivation – Partial Evaluation

General program (x: dynamic, n: static):

    int power(int x, unsigned n) {
        int r = 1;
        while (n) {
            if (n & 1) r = r * x;
            x = x * x;
            n = n >> 1;
        }
        return r;
    }

Program generator:

    #include <stdio.h>

    void power_gen(unsigned n) {
        printf("int power(int x, unsigned n) {\n");
        printf("  int r=1;\n");
        printf("  assert(n==%u);\n", n);
        while (n) {
            if (n & 1) printf("  r=r*x;\n");
            printf("  x=x*x;\n");
            n = n >> 1;
        }
        printf("  return r;\n}");
    }

Generated program for n = 3:

    int power(int x, unsigned n) {
        int r = 1;
        assert(n == 3);
        r = r * x;
        x = x * x;
        r = r * x;
        x = x * x;
        return r;
    }

Relation to points-to analysis:

Can the expression *y be evaluated statically?
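One way a partial evaluator could use points-to information to answer this question is sketched below. It rests on an assumption that is not stated on the slide: `*y` is treated as static only if every variable `y` may point to is itself static. The bit-set encoding and the names `VarSet`, `var_bit` and `deref_is_static` are hypothetical.

```c
#include <assert.h>

/* Each variable is one bit in a set (hypothetical encoding). */
typedef unsigned VarSet;

/* Bit for the variable with the given index. */
static VarSet var_bit(unsigned index) {
    assert(index < 32);
    return 1u << index;
}

/* *y counts as static only if y has at least one target and
   every target is in the set of static variables. */
static int deref_is_static(VarSet points_to_y, VarSet static_vars) {
    return points_to_y != 0 && (points_to_y & ~static_vars) == 0;
}
```

For the power example, with `n` static and `x` dynamic, a pointer that may only point to `n` can be dereferenced at specialization time, while a pointer that may point to either `n` or `x` cannot. A conservative points-to set therefore errs toward "dynamic", which keeps the generated program correct.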

Page 5

Storage Shape Graphs

    int *p, a, b;
    if (a) { p = &a; } else { p = &b; }
    b = *p;

    Graph: p → a, p → b        PT(*p) = {a, b}

• PTA computes a storage shape graph
  • undecidable → conservative approximation
• Tradeoff between efficiency and accuracy
  • flow sensitivity

        int *p, a, b;
        if (a) { p = &a; f(p); } else { p = &b; f(p); }
        b = *p;

        At f(p) in the then-branch: p → a        PT(*p) = {a}
        At f(p) in the else-branch: p → b        PT(*p) = {b}

  • context sensitivity
  • complexity of storage shape graph

        Graph: p → {a, b} (single node)        PT(*p) = {a, b}

• Steensgaard's first algorithm [1]:
  • flow insensitive
  • context insensitive
  • graph complexity O(n)
  • almost linear time complexity
  • 80/20 rule: 80% benefit, 20% cost [2]

[1] B. Steensgaard. Points-to analysis in almost linear time. In Symposium on Principles of Programming Languages, 1996

[2] M. Hind and A. Pioli. Which pointer analysis should I use? In International Symposium on Software Testing and Analysis, 2000

Page 6

Concepts of Steensgaard's PTA

• Keeping the storage shape graph O(N)

    [Figure: nodes a, b, c, d, e; processing a = &d; joins b with d and then c with e, leaving the nodes a, {b,d}, {c,e}]

    Disjoint-set forests [3] ⇒ Join is O(α(N,N)) ⇒ Analysis is O(N α(N,N))

• Data flow direction (Pending Joins)

    a = &b; a = c; c = &d;

    [Figure: nodes a, b, c; while c still points to NIL, the Join for a = c; stays pending; after c = &d; the nodes b and d are joined]

[3] R. Tarjan. Efficiency of a good but not linear set union algorithm. Journal of the ACM, 1975
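The join-based scheme can be sketched in C with a disjoint-set forest. This is a simplified illustration, not the algorithm from the talk: instead of pending joins it eagerly allocates a placeholder pointee whenever one is missing, which is coarser when non-pointer values are copied. All names (`new_node`, `find`, `join`, `target_of`, `assign_addr`, `assign_copy`, `may_alias`) are hypothetical.

```c
#include <assert.h>

#define MAX_NODES 64
#define NIL (-1)

static int parent[MAX_NODES];   /* disjoint-set forest */
static int rank_[MAX_NODES];    /* union-by-rank bookkeeping */
static int pointee[MAX_NODES];  /* meaningful on class representatives */
static int node_count;

static int new_node(void) {
    assert(node_count < MAX_NODES);
    int n = node_count++;
    parent[n] = n; rank_[n] = 0; pointee[n] = NIL;
    return n;
}

static int find(int n) {
    while (parent[n] != n) {
        parent[n] = parent[parent[n]];   /* path halving */
        n = parent[n];
    }
    return n;
}

/* Merge two classes and, recursively, the classes they point to. */
static void join(int x, int y) {
    x = find(x); y = find(y);
    if (x == y) return;
    if (rank_[x] < rank_[y]) { int t = x; x = y; y = t; }
    if (rank_[x] == rank_[y]) rank_[x]++;
    parent[y] = x;
    int px = pointee[x], py = pointee[y];
    if (px == NIL) pointee[x] = py;
    else if (py != NIL) join(px, py);
}

/* Pointee class of n, created on demand (the simplification noted above). */
static int target_of(int n) {
    n = find(n);
    if (pointee[n] == NIL) pointee[n] = new_node();
    return find(pointee[n]);
}

static void assign_addr(int a, int d) { join(target_of(a), d); }            /* a = &d */
static void assign_copy(int a, int c) { join(target_of(a), target_of(c)); } /* a = c  */

static int may_alias(int x, int y) { return find(x) == find(y); }
```

Processing `a = &b; a = c; c = &d;` with these helpers places `b` and `d` in the same class, matching the joined node b,d on the slide. Each statement costs a constant number of `find` and `join` calls, which is where the near-linear bound comes from.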

Page 7

Points-to Analysis Comparison I

Example:

    s->a = &a;
    s->b = &b;
    *(int *)s = c;

Steensgaard's analysis [4]: 4 types of nodes: blank, simple, struct, object

    [Figure: node s with separate targets a and b (struct node) versus a single merged target a,b (object node)]

Proposed analysis: single kind of representation, a node with fields

Contra:
• Memory layout dependent

Pro:
• Conceptually simpler (algorithm less complex)
• More precise in case of inconsistent access

[4] B. Steensgaard. Points-to analysis by type inference of programs with structures and unions. In International Conference on Compiler Construction, 1996

Page 8

Points-to Analysis Comparison II

    struct { int *a, *b, *c; } s;
    int **d, f, g, h;

    s.a = &f;
    s.b = &g;
    s.c = &h;

    d = &s.a;
    d = &s.b;

1. Graph initialization: one node per variable

2. Iterate over statements
   - Join nodes as necessary
   - Establish links

Steensgaard's analysis [4]:

    [Figure: d → s → {f,g,h}]
    Points-to(**d) = { f, g, h }

Proposed analysis:

    [Figure: d → s, with {f,g} and h kept as separate nodes]
    Points-to(**d) = { f, g }
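The difference in the two results comes from whether the fields of s are kept apart. The C sketch below illustrates only that effect using plain bit sets; it is not an implementation of either analysis, and the names (`TGT_F`, `FIELD_A`, `deref2_with_fields`, `deref2_collapsed`) are hypothetical.

```c
#include <assert.h>

/* Targets f, g, h as bits; fields a, b, c of s as indices. */
enum { TGT_F = 1, TGT_G = 2, TGT_H = 4 };
enum { FIELD_A, FIELD_B, FIELD_C, NUM_FIELDS };

/* After s.a = &f; s.b = &g; s.c = &h; each field has one target. */
static const unsigned field_targets[NUM_FIELDS] = { TGT_F, TGT_G, TGT_H };

/* Field-aware view: d is a bit set of the fields of s it may address. */
static unsigned deref2_with_fields(unsigned d_fields) {
    assert(d_fields < (1u << NUM_FIELDS));
    unsigned result = 0;
    for (int i = 0; i < NUM_FIELDS; i++)
        if (d_fields & (1u << i))
            result |= field_targets[i];
    return result;
}

/* Collapsed view: s is a single node, so d reaches every field's target. */
static unsigned deref2_collapsed(void) {
    return deref2_with_fields((1u << NUM_FIELDS) - 1u);
}
```

Since d only ever receives the addresses of s.a and s.b, the field-aware view yields {f, g} for **d, whereas collapsing s into one node also drags in h.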

Page 9

Data Structures

Storage shape graphs are composed of:

• Abstract locations: $(f_1, \ldots, f_n)$
• Fields: $f = (q, p)$
• Field extents: $q = (o, s)$
• Pointer offset ranges: $p = (l, u)$

Page 10

Relations

• Intervals-overlap relation
• Sub-interval relation
• Field inclusion
• Assignment data flow

[Figure: field extents (o1, s1), (o2, s2) and pointer offset ranges (l1, u1), (l2, u2) illustrating the relations; example statement *a =s *b;]
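One plausible reading of the interval relations is sketched below in C. It assumes that a field extent (o, s) is a byte offset plus a size, covering the half-open interval [o, o+s), and that a pointer offset range (l, u) has inclusive lower and upper bounds. These interpretations, the type names `Extent` and `Range`, and the function names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { unsigned o, s; } Extent;  /* field extent: offset, size (assumed) */
typedef struct { unsigned l, u; } Range;   /* pointer offset range: lower, upper (assumed) */

/* Intervals-overlap: [o1, o1+s1) and [o2, o2+s2) share at least one byte. */
static bool extents_overlap(Extent a, Extent b) {
    return a.o < b.o + b.s && b.o < a.o + a.s;
}

/* Sub-interval: inner lies completely inside outer. */
static bool extent_within(Extent inner, Extent outer) {
    return outer.o <= inner.o && inner.o + inner.s <= outer.o + outer.s;
}

/* A pointer may address a field if some offset in [l, u] falls into its extent. */
static bool range_hits_extent(Range p, Extent f) {
    assert(p.l <= p.u);
    return p.l < f.o + f.s && f.o <= p.u;
}
```

With 4-byte fields at offsets 0 and 4, a pointer whose offset range is [0, 4] may address both fields but not a third field at offset 8, which mirrors how d reaches s.a and s.b but not s.c in the earlier comparison.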

Page 11

Constraints Deduction Example

    x = *y;

[Figure: storage shape graph for this statement with nodes labeled 1:x, 2:y and 3, and annotations s and u]

Page 12

Results

Twelve benchmarks from Todd Austin's benchmark suite as well as from the SPEC92 benchmarks. Excerpt:

Columns: benchmark, LOC, then counts per number of variables aliased (1, 2, 3, 4, 5, 6, 7, 20, 22, 270); empty cells are omitted.

    bc         6,098   5 1 1
    espresso  11,691   17 10 1 1 1 1
    li         7,355   3 1 1 1 2 1

• Benchmarks are the ones commonly used for points-to analyses

• Results are hard to compare

• Steensgaard's second paper does not report this kind of result

...

Page 13

Conclusions

We propose an improvement to Steensgaard's PTA, which we are confident is
• more precise than,
• as fast as,
• conceptually simpler than, and thus
• easier to implement than
the original.

Thanks for your attention!