Page 1: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Level by Level: Making Flow- and Context-Sensitive Pointer Analysis Scalable for Millions of Lines of Code

Hongtao Yu, Zhaoqing Zhang, Xiaobing Feng, Wei Huo
Institute of Computing Technology, Chinese Academy of Sciences
{ htyu, zqzhang, fxb, huowei }@ict.ac.cn

Jingling Xue
University of New South Wales
[email protected]

Page 2: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

INSTITUTE OF COMPUTING TECHNOLOGY

Outline
• Introduction
• Framework
• Analyzing a Level
• Experiments
• Conclusion

Page 3: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Introduction
• Motivation
    – Who needs flow- and context-sensitive (FSCS) pointer analysis?
        • Software checking tools
        • Program understanding
        • Parallelization tools
        • Hardware synthesis
    – Existing methods cannot scale to large real programs
    – Aiming at millions of lines of C code

Page 4: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Improve scalability
• For flow-sensitivity
    – Decreasing iterations in dataflow analysis
    – Saving space of points-to graph
• For context-sensitivity
    – Summary-based
    – Low storage penalty
    – Low apply penalty

Page 5: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Idea
• Level-by-level analysis
    – Analyze the pointers in decreasing order of their points-to levels
        • Suppose int **q, *p, x; then q has level 2, p has level 1, and x has level 0.
    – Fast flow-sensitive analysis on full-sparse SSA
    – Fast and accurate context-sensitive analysis using a full transfer function

Page 6: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Contribution
• Performs a full-sparse flow-sensitive pointer analysis using a flow-insensitive algorithm
• Performs a context-sensitive pointer analysis efficiently with precise full transfer functions
• Yields flow- and context-sensitive interprocedural may/must mod/ref information on a compact SSA form
• Analyzes millions of lines of code in minutes, faster than state-of-the-art FSCS pointer analysis algorithms

Page 7: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Framework

Figure 1. Level-by-level pointer analysis (LevPA).
• Compute points-to levels
• For each points-to level, from the highest to the lowest:
    – Bottom-up: evaluate transfer functions
    – Top-down: propagate points-to sets
• Incrementally build the call graph
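The loop structure in Figure 1 can be sketched as a tiny scheduler. This is only an illustration of the phase ordering, not the LevPA implementation: the names `levpa_schedule`, `struct step`, and `enum phase` are hypothetical, and the real phases (transfer-function evaluation, points-to propagation, incremental call-graph construction) are reduced to recorded steps.

```c
#include <stddef.h>

/* Hypothetical phase identifiers; the names are illustrative only. */
enum phase { BOTTOM_UP, TOP_DOWN };

struct step {
    int level;          /* points-to level being analyzed */
    enum phase phase;   /* which pass runs at this step */
};

/* Records the order in which the two passes would run for levels
 * max_level, max_level - 1, ..., 0. Returns the number of steps written,
 * or 0 if max_level is negative or the buffer is too small. */
size_t levpa_schedule(int max_level, struct step *out, size_t cap)
{
    size_t n = 0;

    if (max_level < 0 || cap < 2 * ((size_t)max_level + 1))
        return 0;

    for (int level = max_level; level >= 0; level--) {
        out[n++] = (struct step){ level, BOTTOM_UP }; /* evaluate transfer functions */
        out[n++] = (struct step){ level, TOP_DOWN };  /* propagate points-to sets */
    }
    return n;
}
```

For a program whose highest points-to level is 2, the schedule has six steps: bottom-up then top-down for level 2, then for level 1, then for level 0.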

Page 8: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Points-to level
• Property 1. If a variable x is possibly pointed to by a pointer y, then ptl(x) ≤ ptl(y).
• Property 2. If a variable y is possibly assigned to x, then ptl(x) = ptl(y).
• Compute points-to levels with a unification-based pointer analysis
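As a rough sketch of how a level could be read off the result of a unification-based analysis: assume the analysis has already grouped variables into classes, each pointing to at most one other class. The array encoding and the function name `ptl` below are assumptions made for illustration, and the sketch assumes the pointee chain is acyclic (the step bound only guarantees termination if it is not).

```c
#include <stddef.h>

/* pointee[i] is the index of the class that class i points to, or -1 if
 * class i points to nothing. Assumption: the chain of pointees is acyclic;
 * the step bound n only prevents an endless loop on malformed input. */
int ptl(const int *pointee, size_t n, int cls)
{
    int level = 0;
    size_t steps = 0;

    while (pointee[cls] >= 0 && steps < n) {
        cls = pointee[cls];   /* follow one dereference */
        level++;
        steps++;
    }
    return level;
}
```

With three classes chained as 0 → 1 → 2 (for example, double pointers, single pointers, and plain integers), the function yields levels 2, 1, and 0 respectively.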

Page 9: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Example

int obj, t;
main() {
L1:    int **x, **y;
L2:    int *a, *b, *c, *d, *e;
L3:    x = &a; y = &b;
L4:    foo(x, y);
L5:    *b = 5;
L6:    if ( … ) { x = &c; y = &e; }
L7:    else { x = &d; y = &d; }
L8:    c = &t;
L9:    foo(x, y);
L10:   *e = 10;
}

void foo(int **p, int **q) {
L11:   *p = *q;
L12:   *q = &obj;
}

ptl(x, y, p, q) = 2
ptl(a, b, c, d, e) = 1
ptl(t, obj) = 0

Analyze { x, y, p, q } first, then { a, b, c, d, e }, and { t, obj } last.

Page 10: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Bottom-up analyze level 2

void foo(int **p, int **q) {
L11:   *p = *q;
L12:   *q = &obj;
}

main() {
L1:    int **x, **y;
L2:    int *a, *b, *c, *d, *e;
L3:    x = &a; y = &b;
L4:    foo(x, y);
L5:    *b = 5;
L6:    if ( … ) { x = &c; y = &e; }
L7:    else { x = &d; y = &d; }
L8:    c = &t;
L9:    foo(x, y);
L10:   *e = 10;
}

Page 11: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Bottom-up analyze level 2

void foo(int **p, int **q) {
L11:   *p1 = *q1;
L12:   *q1 = &obj;
}

main() {
L1:    int **x, **y;
L2:    int *a, *b, *c, *d, *e;
L3:    x = &a; y = &b;
L4:    foo(x, y);
L5:    *b = 5;
L6:    if ( … ) { x = &c; y = &e; }
L7:    else { x = &d; y = &d; }
L8:    c = &t;
L9:    foo(x, y);
L10:   *e = 10;
}

• p1's points-to set depends on formal-in p
• q1's points-to set depends on formal-in q

Page 12: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Bottom-up analyze level 2

void foo(int **p, int **q) {
L11:   *p1 = *q1;
L12:   *q1 = &obj;
}

main() {
L1:    int **x, **y;
L2:    int *a, *b, *c, *d, *e;
L3:    x1 = &a; y1 = &b;
L4:    foo(x1, y1);
L5:    *b = 5;
L6:    if ( … ) { x2 = &c; y2 = &e; }
L7:    else { x3 = &d; y3 = &d; }
       x4 = ϕ(x2, x3); y4 = ϕ(y2, y3)
L8:    c = &t;
L9:    foo(x4, y4);
L10:   *e = 10;
}

• p1's points-to set depends on formal-in p
• q1's points-to set depends on formal-in q

• x1 → { a }
• y1 → { b }
• x2 → { c }
• y2 → { e }
• x3 → { d }
• y3 → { d }
• x4 → { c, d }
• y4 → { e, d }

Page 13: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Full-sparse Analysis
• Achieve flow-sensitivity flow-insensitively
    – Regard each SSA name as a unique variable
    – Use a set-constraint-based pointer analysis
• Full sparse
    – Saving time
    – Saving space
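A minimal sketch of the idea, using the level-2 SSA names of main from the running example: each SSA name gets its own points-to set, address-of assignments seed the sets, and ϕ-functions become inclusion constraints that are solved in no particular order. The bitmask encoding and the names `solve` and `analyze_main` are assumptions for illustration, not the solver used by LevPA.

```c
#include <stddef.h>

/* Objects that level-2 pointers may point to, one bit each. */
enum { OBJ_A = 1 << 0, OBJ_B = 1 << 1, OBJ_C = 1 << 2,
       OBJ_D = 1 << 3, OBJ_E = 1 << 4 };

/* Each SSA name is treated as a distinct variable. */
enum { X1, X2, X3, X4, Y1, Y2, Y3, Y4, NUM_NAMES };

/* Inclusion constraint: pts(dst) includes pts(src). */
struct copy { int dst, src; };

/* Flow-insensitive fixpoint: apply the constraints in any order
 * until no points-to set changes. */
void solve(unsigned pts[NUM_NAMES], const struct copy *copies, size_t n)
{
    int changed = 1;

    while (changed) {
        changed = 0;
        for (size_t i = 0; i < n; i++) {
            unsigned merged = pts[copies[i].dst] | pts[copies[i].src];
            if (merged != pts[copies[i].dst]) {
                pts[copies[i].dst] = merged;
                changed = 1;
            }
        }
    }
}

/* Constraints for main() of the running example. */
void analyze_main(unsigned pts[NUM_NAMES])
{
    /* x4 = phi(x2, x3); y4 = phi(y2, y3) */
    static const struct copy phis[] = {
        { X4, X2 }, { X4, X3 }, { Y4, Y2 }, { Y4, Y3 }
    };

    for (int i = 0; i < NUM_NAMES; i++)
        pts[i] = 0;

    pts[X1] = OBJ_A; pts[Y1] = OBJ_B;   /* L3 */
    pts[X2] = OBJ_C; pts[Y2] = OBJ_E;   /* L6 */
    pts[X3] = OBJ_D; pts[Y3] = OBJ_D;   /* L7 */

    solve(pts, phis, sizeof phis / sizeof phis[0]);
}
```

Because x1 and x4 are separate variables, the call at L4 sees only { a } while the call at L9 sees { c, d }, which is the flow-sensitive result obtained without iterating over the control-flow graph.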

Page 14: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Top-down analyze level 2

main: Propagate to callsite
L4: foo.p → { a }, foo.q → { b }
L9: foo.p → { c, d }, foo.q → { d, e }

Merged over both call sites:
• foo.p → { a, c, d }
• foo.q → { b, d, e }


Page 15: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Top-down analyze level 2

foo: Expand pointer dereferences (calling contexts are merged here)

void foo(int **p, int **q) {
       μ(b, d, e)
L11:   *p1 = *q1;
       χ(a, c, d)
L12:   *q1 = &obj;
       χ(b, d, e)
}


Page 16: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Context Condition
• Goal: make the analysis context-sensitive
• Points-to relation ci
    – p ⟹ v (p → v): p must (may) point to v, where p is a formal parameter.
• Context condition ℂ(c1,…,ck)
    – a Boolean function over higher-level points-to relations
• Context-sensitive μ and χ
    – μ(v_i, ℂ(c1,…,ck))
    – v_{i+1} = χ(v_i, M, ℂ(c1,…,ck))
        • M ∈ {may, must} indicates a weak/strong update
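The effect of a context-sensitive χ on a points-to set can be sketched as follows. This is one plausible reading of the weak/strong update rule, with points-to sets encoded as bitmasks; the function name `chi` and its parameters are hypothetical.

```c
enum mod_kind { MAY, MUST };

/* Points-to set of v_{i+1} = chi(v_i, M, C).
 *   old_pts: points-to set of v_i
 *   new_pts: points-to set written by the indirect store
 *   m:       MUST for a strong update, MAY for a weak update
 *   cond:    nonzero if the context condition C holds */
unsigned chi(unsigned old_pts, unsigned new_pts, enum mod_kind m, int cond)
{
    if (!cond)
        return old_pts;            /* the store cannot touch v in this context */
    if (m == MUST)
        return new_pts;            /* strong update: the old value is killed */
    return old_pts | new_pts;      /* weak update: the old value may survive */
}
```

When the condition does not hold, v keeps its previous set, so one annotated function body can serve every calling context.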

Page 17: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Context-sensitive μ and χ

void foo(int **p, int **q) {
       μ(b, q⟹b)
       μ(d, q→d)
       μ(e, q→e)
L11:   *p1 = *q1;
       a = χ(a, must, p⟹a)
       c = χ(c, may, p→c)
       d = χ(d, may, p→d)
L12:   *q1 = &obj;
       b = χ(b, must, q⟹b)
       d = χ(d, may, q→d)
       e = χ(e, may, q→e)
}

Page 18: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Bottom-up analyze level 1

void foo(int **p, int **q) {
       μ(b1, q⟹b)
       μ(d1, q→d)
       μ(e1, q→e)
L11:   *p1 = *q1;
       a2 = χ(a1, must, p⟹a)
       c2 = χ(c1, may, p→c)
       d2 = χ(d1, may, p→d)
L12:   *q1 = &obj;
       b2 = χ(b1, must, q⟹b)
       d3 = χ(d2, may, q→d)
       e2 = χ(e1, may, q→e)
}

Page 19: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Points-to Set
• Local Points-to Set
    – Loc(p) = { <v, ℂ(c1,…,ck)> | ℂ(c1,…,ck) is a context condition }.
    – p can point to v if and only if ℂ(c1,…,ck) holds.
    – It is computed explicitly during the bottom-up analysis.
• Dependence Set
    – Dep(p) = { <q, ℂ(c1,…,ck)> | q is a formal-in parameter of level lev and ℂ(c1,…,ck) is a context condition }.
    – Ptr(p) includes Ptr(q) if and only if ℂ(c1,…,ck) holds.

Page 20: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Transfer function
• Trans(proc, v)
    – < Loc(v), Dep(v), ℂ(c1,…,ck), M >
        • v is a formal-out parameter
        • ℂ(c1,…,ck) is a context condition.
            – v can be modified at a callsite invoking proc only if ℂ(c1,…,ck) holds at the callsite
        • M ∈ {may, must}
            – indicates the may/must mod effect
• Trans(proc)
    – the set of all individual transfer functions Trans(proc, v).

Page 21: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Bottom-up analyze level 1

void foo(int **p, int **q) {
       μ(b1, q⟹b)
       μ(d1, q→d)
       μ(e1, q→e)
L11:   *p1 = *q1;
       a2 = χ(a1, must, p⟹a)
       c2 = χ(c1, may, p→c)
       d2 = χ(d1, may, p→d)
L12:   *q1 = &obj;
       b2 = χ(b1, must, q⟹b)
       d3 = χ(d2, may, q→d)
       e2 = χ(e1, may, q→e)
}

• Trans(foo, a) = < { }, { <b, q⟹b>, <d, q→d>, <e, q→e> }, p⟹a, must >
• Trans(foo, c) = < { }, { <b, q⟹b>, <d, q→d>, <e, q→e> }, p→c, may >
• Trans(foo, b) = < { <obj, q⟹b> }, { }, q⟹b, must >
• Trans(foo, e) = < { <obj, q→e> }, { }, q→e, may >
• Trans(foo, d) = < { <obj, q→d> }, { <b, p→d ∧ q⟹b>, <d, p→d>, <e, p→d ∧ q→e> }, p→d ∨ q→d, may >

Page 22: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Bottom-up analyze level 1

int obj, t;
main() {
L1:    int **x, **y;
L2:    int *a, *b, *c, *d, *e;
L3:    x1 = &a; y1 = &b;
       μ(b1, true)
L4:    foo(x1, y1);
       a2 = χ(a1, must, true)
       b2 = χ(b1, must, true)
L5:    *b1 = 5;
L6:    if ( … ) { x2 = &c; y2 = &e; }
L7:    else { x3 = &d; y3 = &d; }
       x4 = ϕ(x2, x3)
       y4 = ϕ(y2, y3)
L8:    c1 = &t;
       μ(d1, true)
       μ(e1, true)
L9:    foo(x4, y4);
       c2 = χ(c1, may, true)
       d2 = χ(d1, may, true)
       e2 = χ(e1, may, true)
L10:   *e1 = 10;
}

• At L4, p ⟹ a holds and q ⟹ b holds.
• At L9, p → c and p → d hold, and q → e and q → d hold.
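To make the use of a transfer function at a call site concrete, the sketch below applies Trans(foo, d) under a given set of call-site facts. It is an illustration under simplifying assumptions, not the authors' implementation: every Loc/Dep condition is encoded as a conjunction of fact bits, the overall condition p→d ∨ q→d is checked as "any of two bits", and the names `foo_d_after_call`, `holds`, and the enum constants are hypothetical.

```c
#include <stddef.h>

enum { T = 1 << 0, OBJ = 1 << 1 };          /* level-0 targets */
enum { A, B, C, D, E, NUM_PTRS };           /* level-1 pointers */
enum {                                      /* facts that may hold at a call site */
    P_MAY_C  = 1 << 0, P_MAY_D = 1 << 1,
    Q_MUST_B = 1 << 2, Q_MAY_D = 1 << 3, Q_MAY_E = 1 << 4
};

struct loc_entry { unsigned targets; unsigned cond; };  /* <v, C> in Loc */
struct dep_entry { int formal_in;    unsigned cond; };  /* <q, C> in Dep */

/* A conjunction holds when all of its fact bits are set. */
static int holds(unsigned facts, unsigned cond)
{
    return (facts & cond) == cond;
}

/* Points-to set of d after a call to foo, given the facts at the call site
 * and the points-to sets of the level-1 pointers just before the call. */
unsigned foo_d_after_call(unsigned facts, const unsigned in_pts[NUM_PTRS])
{
    static const struct loc_entry loc[] = { { OBJ, Q_MAY_D } };
    static const struct dep_entry dep[] = {
        { B, P_MAY_D | Q_MUST_B },
        { D, P_MAY_D },
        { E, P_MAY_D | Q_MAY_E },
    };
    unsigned result = in_pts[D];    /* "may" effect: the old set is kept */

    /* d can be modified only if p->d or q->d holds at this call site. */
    if (!(facts & (P_MAY_D | Q_MAY_D)))
        return result;

    for (size_t i = 0; i < sizeof loc / sizeof loc[0]; i++)
        if (holds(facts, loc[i].cond))
            result |= loc[i].targets;

    for (size_t i = 0; i < sizeof dep / sizeof dep[0]; i++)
        if (holds(facts, dep[i].cond))
            result |= in_pts[dep[i].formal_in];

    return result;
}
```

With the facts listed for L9 (p→c, p→d, q→e, q→d) the Loc entry contributes obj, and the Dep entries for d and e contribute whatever those pointers held before the call; the entry for b is skipped because q⟹b does not hold there.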

Page 23: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

BDD and context condition
• Context conditions are implemented using BDDs
    – Compact representation
    – Efficient Boolean operations

[Figure: BDD for ℂ = (p → a ∧ q → a) ∨ p → b, with decision nodes x1, x2, x3 and terminal nodes 0 and 1]
• Variable x1 represents p → a
• Variable x2 represents q → a
• Variable x3 represents p → b

If only p → b holds at a call site, we can evaluate ℂ|x1=0; x2=0; x3=1 to see whether ℂ holds at the call site.
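Evaluating such a restriction amounts to walking the BDD from the root and following one edge per variable. The sketch below hand-builds a BDD for ℂ = (x1 ∧ x2) ∨ x3 with the variable order x1, x2, x3; the node layout is an assumption (the original figure did not survive extraction), and a real implementation would use a BDD library rather than a fixed table.

```c
/* Decision variables: x1 = p->a, x2 = q->a, x3 = p->b. */
enum { X1, X2, X3 };

/* Indices 0 and 1 are the terminal nodes false and true. */
struct bdd_node { int var; int lo; int hi; };

static const struct bdd_node nodes[] = {
    { -1, 0, 0 },   /* 0: terminal false */
    { -1, 1, 1 },   /* 1: terminal true */
    { X3, 0, 1 },   /* 2: x3 ? true : false */
    { X2, 2, 1 },   /* 3: x2 ? true : node 2 */
    { X1, 2, 3 },   /* 4: root, x1 ? node 3 : node 2 */
};

/* Returns 1 if the condition holds under the assignment, else 0.
 * assignment[v] is nonzero when variable v is true at the call site. */
int bdd_eval(const int assignment[3])
{
    int n = 4;      /* start at the root */

    while (n > 1)   /* stop at a terminal node */
        n = assignment[nodes[n].var] ? nodes[n].hi : nodes[n].lo;
    return n;
}
```

For the call site described above (x1=0, x2=0, x3=1) the walk goes root → node 2 → true, so ℂ holds.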

Page 24: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Experiment
• Analyzes millions of lines of code in minutes
• Faster than state-of-the-art FSCS pointer analysis algorithms

Table 2. Performance (secs).

Benchmark          KLOC   LevPA (64bit)   LevPA (32bit)   Bootstrapping, PLDI'08 (32bit)
Icecast-2.3.1        22            2.18            5.73                               29
sendmail            115           72.63          143.68                              939
httpd               128           16.32           35.42                              161
445.gobmk           197           21.37           40.78                                /
wine-0.9.24        1905          502.29          891.16                                /
wireshark-1.2.2    2383          366.63          845.23                                /

Page 25: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Conclusion
• We present a scalable method for flow- and context-sensitive pointer analysis
• It analyzes the pointers in a program level by level, in terms of their points-to levels.
    – Fast flow-sensitive analysis on full-sparse SSA form
    – Fast and accurate context-sensitive analysis using full transfer functions represented by BDDs
• It can analyze millions of lines of C code in minutes, faster than state-of-the-art methods.

Page 26: Hongtao Yu    Zhaoqing Zhang    Xiaobing Feng    Wei Huo

Thanks