A Tour of Pointer Analysis Ondřej Lhoták


  • A Tour of Pointer Analysis

    Ondřej Lhoták

  • Pointer analysis

    A Tour of Pointer Analysis| Ondrej Lhotak| University of Waterloo

    [PASTE 2001]

  • What does pointer analysis do?

    For each pointer (reference) in the program, what

    memory locations (objects) does it point to?

    [Figure: pointers p and q and the memory locations they point to]

  • Why pointer analysis?

    a = 1
    b = 2
    c = a + b      (c = 3)

  • Why pointer analysis?

    a = 1
    b = 2
    *x = 4
    c = a + b      (c = ?)

    If x == &a, then c = 6.
    If x == &b, then c = 5.
    If x != &a && x != &b, then c = 3.
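The three cases can be checked with a small Java sketch. Java has no address-of operator, so this sketch assumes each variable is modelled as a one-element array and the pointer x as a reference to one of those arrays; the class and method names are hypothetical.

```java
class AliasDemo {
    // target selects what x points to: 0 means &a, 1 means &b, anything else
    // means some unrelated location.
    static int run(int target) {
        int[] a = {0};
        int[] b = {0};
        int[] other = {0};
        int[] x = (target == 0) ? a : (target == 1) ? b : other;
        a[0] = 1;            // a = 1
        b[0] = 2;            // b = 2
        x[0] = 4;            // *x = 4
        return a[0] + b[0];  // c = a + b
    }

    public static void main(String[] args) {
        System.out.println(run(0)); // x == &a
        System.out.println(run(1)); // x == &b
        System.out.println(run(2)); // x points elsewhere
    }
}
```

Without knowing what x may point to, a compiler cannot replace c with a constant; a points-to set for x that excludes both a and b would allow it.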

  • Why pointer analysis?

    a = 1
    b = 2
    foo()
    c = a + b

    void foo() { ... }

  • Why pointer analysis?

    a = 1
    b = 2
    x.foo()
    c = a + b

    class X {
      void foo() { ... }
    }
    class Y extends X {
      void foo() { ... }
    }
    class Z extends X {
      void foo() { ... }
    }

    [Figure: the call x.foo() linked to the candidate foo() methods, each link marked "?"]

  • Applications of pointer analysis

    • Call graph construction

    • Dependence analysis and optimization

    • Cast check elimination

    • Side effect analysis

    • Escape analysis

    • Slicing

    • Parallelization

    • ...

  • Pointer analysis as an abstraction

    For each pointer (reference) in the program, what
    memory locations (objects) does it point to?

    [Figure: two panels, "Concrete program execution" and "Abstract analysis". An abstraction function α maps the concrete pointers and memory locations (p, q and their targets) to their abstract counterparts.]

  • Precision of points-to sets

    {L1} ⊂ {L1, L2} ⊂ {L1, L2, L3} ⊂ {L1, L2, L3, L4}

    ← more precise                         less precise →

    [Figure: the range of possible analysis results along this axis, with regions labelled "unsound", "uncomputable", and "conservative", and markers for "actual behaviour on some executions" and "actual behaviour on all executions"]

  • Precision vs. efficiency

    [Figure: plot with axes "← Efficient" and "Precise →", annotated over successive builds with "CFA trend", "Theory", and "Implementation tuning"]

  • Precision for a specific application

    {L1, L2} ⊂ {L1, L2, L3, L4}

    The first points-to set is more precise because it is smaller.

    But suppose a particular application only cares whether
    L1 is in the set. Then for that application, both sets are
    equally precise.

    Thus, the precision/efficiency tradeoff must consider the
    application.

  • Design decisions for precision/efficiency

    • The abstraction (affects precision and efficiency):
      – Type filtering
      – Field sensitivity
      – Directionality
      – Call graph construction
      – Context sensitivity
      – Flow sensitivity

    • Algorithm and implementation (affects efficiency):
      – Propagation algorithm
      – Set implementation

  • An example abstraction and analysis

    The abstraction:               First example:
    – Type filtering               – without type filtering
    – Field sensitivity            – field-sensitive
    – Directionality               – subset-based
    – Call graph construction      – ahead-of-time call graph
    – Context sensitivity          – context-insensitive
    – Flow sensitivity             – flow-insensitive

  • Pointer Assignment Graph abstraction

    Abstract object node: L1

    Java:  L1: x = new Object()
    C:     L1: x = malloc(42)

    Represents some set of run-time objects (targets of pointers).
    e.g. all objects allocated at a given allocation site
    e.g. all objects of a given dynamic type

  • Pointer Assignment Graph abstraction

    Address-of abstract object node: &a

    C:     x = &a

    Represents some set of run-time objects (targets of pointers).
    e.g. the object whose address is &a.

  • Pointer Assignment Graph abstraction

    Pointer variable node: p

    Points-to set: pt(p) = {L1, &a}

    Represents some pointer-typed variable(s).
    e.g. all instances of the local variable p in method m.

  • Pointer Assignment Graph abstraction

    Pointer dereference node: p.f (Java), *p (C)

    Java:  y = p.f
    C:     y = *p

    Represents a dereference of some pointer
    (where the pointer is a pointer variable node).

  • Pointer Assignment Graph abstraction

    Heap pointer node: L1.f (Java), *L1 (C)

    pt(L1.f) = {L2, L3}

    Represents a pointer stored in some object in the heap.

  • State space (the analysis result)

    pt(p)    = {L1, L2, &q}
    pt(L1.f) = {L1, L2, &q}

    As a function:  pt : (Var ∪ (Obj × Field)) → ℘(Obj)
    As a relation:  pt ⊆ (Var × Obj) ∪ (Obj × Field × Obj)

    where Obj = Alloc ∪ AddrOf

  • Pointer Assignment Graph edges

    statement                              edge
    allocation   L1: x = new Object()      L1 → x
    assignment   x = y                     y → x
    store        y.f = x                   x → y.f
    load         x = y.f                   y.f → x

  • Example

    static void foo() {
      L1: p = new O();
          q = p;
      L2: r = new O();
          p.f = r;
          t = bar( q );
    }

    static O bar( O s ) {
      return s.f;
    }

    1. Generate pointer assignment graph.
       Nodes: L1, L2, p, q, r, s, t, p.f, s.f
       Edges: L1 → p, p → q, L2 → r, r → p.f, q → s, s.f → t

    2. Propagate points-to sets.
       pt(p) = {L1}, pt(q) = {L1}, pt(s) = {L1}, pt(r) = {L2}

    3. Add load/store edges.
       Both p and s point to L1, so the dereference nodes p.f and s.f are
       connected through the heap pointer node L1.f: r → L1.f and L1.f → t.

    4. Re-propagate points-to sets.
       pt(L1.f) = {L2}, pt(t) = {L2}

  • Overall algorithm

    Simple:

      repeat until no change {
        propagate abstract objects along edges
        for each load/store, add indirect edges to heap ptr nodes
      }

    Detailed:

      add all allocation nodes to worklist
      while worklist not empty {
        remove node v1 from worklist
        for each edge v1 -> v2, propagate pt(v1) into pt(v2)
          if v2 changed, add v2 to worklist
        for each load v1.f -> v3 {
          for each a in pt(v1) {
            add edge a.f -> v3 to assignment graph
            add node a.f to worklist
          }
        }
        for each store v3 -> v1.f {
          ... (as above)
        }
      }
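As an illustration, the "simple" fixed-point version of the algorithm can be sketched in Java and run on the foo/bar example from the preceding slides. This is a minimal sketch rather than the implementation used in any particular tool: the class PagSketch, its method names, and the string encoding of nodes (for example "L1.f" for a heap pointer node) are assumptions made here, and an allocation edge is modelled by seeding the points-to set of the target variable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class PagSketch {
    // Simple edges src -> dst of the pointer assignment graph.
    final Map<String, Set<String>> edges = new LinkedHashMap<>();
    // Points-to sets, keyed by variable name or heap pointer node such as "L1.f".
    final Map<String, Set<String>> sets = new LinkedHashMap<>();
    // Loads {base, field, dst} for dst = base.field.
    final List<String[]> loads = new ArrayList<>();
    // Stores {base, field, src} for base.field = src.
    final List<String[]> stores = new ArrayList<>();

    Set<String> pt(String node) {
        return sets.computeIfAbsent(node, k -> new TreeSet<>());
    }

    boolean addEdge(String src, String dst) {
        return edges.computeIfAbsent(src, k -> new TreeSet<>()).add(dst);
    }

    void solve() {
        boolean changed = true;
        while (changed) {
            changed = false;
            // Propagate abstract objects along edges.
            for (Map.Entry<String, Set<String>> e : edges.entrySet()) {
                for (String dst : e.getValue()) {
                    changed |= pt(dst).addAll(pt(e.getKey()));
                }
            }
            // For each load, add an edge from the heap pointer node a.f to dst.
            for (String[] l : loads) {
                for (String a : pt(l[0])) {
                    changed |= addEdge(a + "." + l[1], l[2]);
                }
            }
            // For each store, add an edge from src to the heap pointer node a.f.
            for (String[] s : stores) {
                for (String a : pt(s[0])) {
                    changed |= addEdge(s[2], a + "." + s[1]);
                }
            }
        }
    }

    // Builds and solves the graph for the foo/bar example.
    static PagSketch example() {
        PagSketch g = new PagSketch();
        g.pt("p").add("L1");                         // L1: p = new O()
        g.addEdge("p", "q");                         // q = p
        g.pt("r").add("L2");                         // L2: r = new O()
        g.stores.add(new String[] {"p", "f", "r"});  // p.f = r
        g.addEdge("q", "s");                         // argument q flows to parameter s
        g.loads.add(new String[] {"s", "f", "t"});   // t receives the returned s.f
        g.solve();
        return g;
    }

    public static void main(String[] args) {
        System.out.println(example().sets);
    }
}
```

The sketch re-scans every edge on each round, which is what the worklist in the detailed version avoids.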

  • Comparison with 0CFA

    Field-sensitive subset-based points-to analysis:

    pt : (Var ∪ (Obj × Field)) →℘(Obj)

    0CFA:

    ⇒ Σ = Call × (Var→℘(Lam))

  • Comparison with 0CFA

    [Figure: two side-by-side graphs. In the points-to analysis, the allocation "L1: new" flows to variable p, so p holds L1. In 0CFA, the lambda "L1: λ" flows to variable p, so p holds L1.]

  • Set implementation

    Example set {a, b, d, g} over the universe a..j:

    • hash: using java.util.HashSet

    • array: sorted array, binary search
        a b d g

    • bit vector:
        a b c d e f g h i j
        1 1 0 1 0 0 1 0 0 0

    • hybrid:
      – array for small sets
      – bit vector for large sets

    • sparse bit vector:
        0 (ab) 1 1
        1 (cd) 0 1
        3 (gh) 1 0

    • binary decision diagram
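The sorted-array and bit-vector representations of the example set {a, b, d, g} can be sketched with the standard library classes java.util.Arrays and java.util.BitSet. The class SetRepresentations and its helper names are hypothetical, and the numbering of elements a..j as 0..9 is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.BitSet;

class SetRepresentations {
    static final int UNIVERSE = 10; // elements a..j, numbered 0..9

    static int index(char element) {
        return element - 'a';
    }

    // Sorted-array representation: membership test by binary search.
    static boolean containsSorted(int[] sorted, char element) {
        return Arrays.binarySearch(sorted, index(element)) >= 0;
    }

    // Bit-vector representation: one bit per element of the universe.
    static BitSet toBitVector(int[] elements) {
        BitSet bits = new BitSet(UNIVERSE);
        for (int e : elements) {
            bits.set(e);
        }
        return bits;
    }

    // Renders the bit vector as a string of 0s and 1s, one per element.
    static String render(BitSet bits) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < UNIVERSE; i++) {
            sb.append(bits.get(i) ? '1' : '0');
        }
        return sb.toString();
    }

    // Propagates src into dst (set union) and reports whether dst grew.
    static boolean propagate(BitSet src, BitSet dst) {
        int before = dst.cardinality();
        dst.or(src);
        return dst.cardinality() != before;
    }

    public static void main(String[] args) {
        int[] sorted = {index('a'), index('b'), index('d'), index('g')};
        BitSet bits = toBitVector(sorted);
        System.out.println(render(bits));
        System.out.println(containsSorted(sorted, 'd'));
        System.out.println(propagate(bits, new BitSet(UNIVERSE)));
    }
}
```

With a bit vector, propagation is a word-wise OR, but every set occupies space proportional to the universe; the sorted array stays small but makes union slower.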

  • Set implementation

    implementation             speed     size
    hash                       slow      large
    array                      slow      small
    bit vector                 fast      large
    hybrid                     fast      small
    sparse bit vector          fast      small
    binary decision diagram    depends   depends

    Slow vs. fast: up to 100x difference

    Large vs. small: up to 3x difference

    Set implementation is very important.

  • Incremental propagation

    Propagating from p to q without splitting the sets:

    • 1st iteration: propagate {L1, L2, L3, L4}
        pt(p) = {L1, L2, L3, L4}        pt(q) = {L1, L2, L3, L4}

    • add L5 to pt(p)
        pt(p) = {L1, L2, L3, L4, L5}    pt(q) = {L1, L2, L3, L4}

    • 2nd iteration: propagate {L1, L2, L3, L4, L5}
        pt(p) = {L1, L2, L3, L4, L5}    pt(q) = {L1, L2, L3, L4, L5}

  • Incremental propagation

    Idea: Split sets into old part and new part.

    • 1st iteration: propagate {L1, L2, L3, L4}
        p: new = {L1, L2, L3, L4}   old = {}
        q: new = {L1, L2, L3, L4}   old = {}

    • flush new to old
        p: new = {}                 old = {L1, L2, L3, L4}
        q: new = {}                 old = {L1, L2, L3, L4}

    • add L5 to new part of pt(p)
        p: new = {L5}               old = {L1, L2, L3, L4}

    • 2nd iteration: propagate {L5}
        q: new = {L5}               old = {L1, L2, L3, L4}

    • flush new to old
        p: new = {}                 old = {L1, L2, L3, L4, L5}
        q: new = {}                 old = {L1, L2, L3, L4, L5}
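A minimal Java sketch of the old/new split follows. The class IncrementalSet and its method names are hypothetical, and the sketch uses TreeSet for brevity where a real implementation would presumably use one of the faster set representations discussed earlier.

```java
import java.util.Set;
import java.util.TreeSet;

class IncrementalSet {
    final Set<String> oldPart = new TreeSet<>();
    final Set<String> newPart = new TreeSet<>();

    // Adds an object to the new part unless it is already known.
    boolean add(String obj) {
        return !oldPart.contains(obj) && newPart.add(obj);
    }

    // Propagates only the new part into target; returns how many objects were sent.
    int propagateInto(IncrementalSet target) {
        for (String obj : newPart) {
            target.add(obj);
        }
        return newPart.size();
    }

    // Moves everything from the new part to the old part.
    void flush() {
        oldPart.addAll(newPart);
        newPart.clear();
    }

    // Replays the scenario from the slides; returns the number of objects
    // propagated in each iteration and the final size of pt(q).
    static int[] demo() {
        IncrementalSet p = new IncrementalSet();
        IncrementalSet q = new IncrementalSet();
        for (String obj : new String[] {"L1", "L2", "L3", "L4"}) {
            p.add(obj);
        }
        int first = p.propagateInto(q);   // 1st iteration
        p.flush();
        q.flush();
        p.add("L5");                      // add L5 to new part of pt(p)
        int second = p.propagateInto(q);  // 2nd iteration
        p.flush();
        q.flush();
        return new int[] {first, second, q.oldPart.size()};
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println(counts[0] + " then " + counts[1] + ", |pt(q)| = " + counts[2]);
    }
}
```

The second iteration sends one object instead of five, which is the saving the split is designed to achieve.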

  • Type filtering

    A x, z;
    B y;
    A: x = new A();
    B: y = new B();
    y = (B) x;
    z = y;

    Inheritance hierarchy: A is the superclass of B and C.

    Pointer assignment graph: A → x, B → y, x → y, y → z

  • Type filtering: none

    pt(x) = {A}, pt(y) = {A, B}, pt(z) = {A, B}

  • Type filtering: after analysis

    Propagate as above, then remove from each set the objects that are
    incompatible with the declared type of the variable:
    pt(x) = {A}, pt(y) = {B}, pt(z) = {A, B}

  • Type filtering: during analysis

    An object is never added to the set of a variable whose declared type
    it is incompatible with, so A never reaches y and therefore never reaches z:
    pt(x) = {A}, pt(y) = {B}, pt(z) = {B}

  • Type filtering: effect on heap nodes

    class A {
      Object fA;
    }
    class B {
      Object fB;
    }
    class C {
      Object fC;
    }
    A: a = new A();
    B: b = new B();
    C: c = new C();

    Heap pointer nodes:

    A.fA   B.fA   C.fA
    A.fB   B.fB   C.fB
    A.fC   B.fC   C.fC

    Only A.fA, B.fB and C.fC correspond to fields that the object's class actually declares.

  • Type filtering

    • Ignoring types yields many large points-to sets.

    • Filtering after propagation is almost as precise as during propagation.

    • Filtering during propagation is both most precise and most efficient.

    filtering             speed   precision
    ignore                slow    imprecise
    after propagation     slow    precise
    during propagation    fast    precise
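The three filtering strategies can be compared on the x, y, z example with a short Java sketch. Everything here is an illustrative assumption: the class TypeFilterSketch, the Mode enum, and the convention that each abstract object is named after the class it instantiates (so allocation site A creates an object of class A).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TypeFilterSketch {
    enum Mode { NONE, AFTER, DURING }

    static final Map<String, String> SUPER = new HashMap<>();     // class -> superclass
    static final Map<String, String> DECLARED = new HashMap<>();  // variable -> declared type
    static {
        SUPER.put("B", "A");
        SUPER.put("C", "A");
        DECLARED.put("x", "A");
        DECLARED.put("y", "B");
        DECLARED.put("z", "A");
    }

    // True if an object of class objectType may be stored in a variable of declaredType.
    static boolean compatible(String objectType, String declaredType) {
        for (String t = objectType; t != null; t = SUPER.get(t)) {
            if (t.equals(declaredType)) {
                return true;
            }
        }
        return false;
    }

    static Map<String, Set<String>> solve(Mode mode) {
        Map<String, Set<String>> pt = new TreeMap<>();
        for (String v : DECLARED.keySet()) {
            pt.put(v, new TreeSet<>());
        }
        pt.get("x").add("A");                          // A: x = new A()
        pt.get("y").add("B");                          // B: y = new B()
        String[][] edges = {{"x", "y"}, {"y", "z"}};   // y = (B) x; z = y
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] e : edges) {
                for (String obj : pt.get(e[0])) {
                    // DURING: drop incompatible objects as they are propagated.
                    if (mode != Mode.DURING || compatible(obj, DECLARED.get(e[1]))) {
                        changed |= pt.get(e[1]).add(obj);
                    }
                }
            }
        }
        if (mode == Mode.AFTER) {
            // AFTER: remove incompatible objects once propagation has finished.
            for (Map.Entry<String, Set<String>> e : pt.entrySet()) {
                String declared = DECLARED.get(e.getKey());
                e.getValue().removeIf(obj -> !compatible(obj, declared));
            }
        }
        return pt;
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": " + solve(mode));
        }
    }
}
```

Filtering afterwards cannot undo the fact that A already flowed through y into z, which is why it is only "almost" as precise as filtering during propagation.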

  • Field reference representation

    Three ways to represent the field reference p.f, where pt(p) = {L1}
    and r is connected to the dereference node p.f:

    representation       heap pointer node for p.f    other nodes of the same kind
    Field-sensitive      L1.f                         L2.f, L1.g
    Field-based          *.f                          *.g
    Field-insensitive    L1.*                         L2.*

    Idea: merge the heap pointer nodes (yellow in the figure) with same abstract object (resp. same field).

  • Example (field-based)

    static void foo() {
      L1: p = new O();
          q = p;
      L2: r = new O();
          p.f = r;
          t = bar( q );
    }

    static O bar( O s ) {
      return s.f;
    }

    All heap pointer nodes for field f are merged into the single node *.f:

    pt(p) = {L1}, pt(q) = {L1}, pt(s) = {L1}
    pt(r) = {L2}, pt(*.f) = {L2}, pt(t) = {L2}

  • Overall (field-based) algorithm

      merge each SCC in assignment graph into a single node
      topologically sort resulting DAG
      for each node v1 in topological order {
        for each edge v1 -> v2 {
          propagate pt(v1) into pt(v2)
        }
      }

    Each edge is processed only once.

    Worst-case O(n^2).

    Also, worst-case is linear in size of output.

    In contrast, field-sensitive algorithm is O(n^3).
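The single-pass propagation can be sketched in Java for the field-based version of the foo/bar example. This sketch assumes the assignment graph is already acyclic, so the SCC-merging step is omitted, and it obtains the topological order with Kahn's algorithm, which is one possible choice and not necessarily the one the slides have in mind. The class name FieldBasedSketch and the node encoding are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class FieldBasedSketch {
    // edges: pairs {src, dst}; seeds: initial points-to sets from allocations.
    // Assumes the graph is a DAG (SCCs would have to be merged beforehand).
    static Map<String, Set<String>> solve(String[][] edges, Map<String, Set<String>> seeds) {
        Map<String, List<String>> succ = new TreeMap<>();
        Map<String, Integer> indegree = new TreeMap<>();
        for (String[] e : edges) {
            succ.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            indegree.putIfAbsent(e[0], 0);
            indegree.merge(e[1], 1, Integer::sum);
        }
        Map<String, Set<String>> pt = new TreeMap<>();
        for (String node : indegree.keySet()) {
            pt.put(node, new TreeSet<>());
        }
        seeds.forEach((node, objs) -> pt.computeIfAbsent(node, k -> new TreeSet<>()).addAll(objs));

        // Kahn's algorithm: a node is ready once all its predecessors are done.
        Deque<String> ready = new ArrayDeque<>();
        indegree.forEach((node, d) -> {
            if (d == 0) {
                ready.add(node);
            }
        });
        while (!ready.isEmpty()) {
            String v1 = ready.remove();
            for (String v2 : succ.getOrDefault(v1, Collections.emptyList())) {
                pt.get(v2).addAll(pt.get(v1)); // each edge is processed exactly once
                if (indegree.merge(v2, -1, Integer::sum) == 0) {
                    ready.add(v2);
                }
            }
        }
        return pt;
    }

    // Field-based graph for the foo/bar example: every access to field f uses node "*.f".
    static Map<String, Set<String>> example() {
        String[][] edges = {{"p", "q"}, {"q", "s"}, {"r", "*.f"}, {"*.f", "t"}};
        Map<String, Set<String>> seeds = new HashMap<>();
        seeds.put("p", new TreeSet<>(Collections.singleton("L1")));
        seeds.put("r", new TreeSet<>(Collections.singleton("L2")));
        return solve(edges, seeds);
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

No edges are added during propagation in the field-based setting, which is what makes a single pass in topological order sufficient.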

  • Example of precision loss

    A x, y, z;
    B u, v, w;
    A1: x = new A();
        y = x;
    A2: z = new A();
    B:  u = new B();
        x.f = u;
        v = y.f;
        w = z.f;

    Field-sensitive:
        pt(x) = {A1}, pt(y) = {A1}, pt(z) = {A2}, pt(u) = {B}
        pt(A1.f) = {B}, pt(A2.f) = {}
        pt(v) = {B}, pt(w) = {}

    Field-based:
        pt(u) = {B}, pt(*.f) = {B}
        pt(v) = {B}, pt(w) = {B}

  • Field sensitivity summary

                         Java                        C
    field-insensitive    sound, slow, imprecise      sound, slow, imprecise
    field-based          sound, fast                 unsound, fast, imprecise
    field-sensitive      sound, slowest, precise     sound, slowest, precise

  • Comparison: field-based PTA vs. 0CFA

    Field-based subset-based points-to analysis:

    pt : Var→℘(Obj)

    0CFA:

    ⇒ Σ = Call × (Var→℘(Lam))

  • Directionality

    Object x, y, z;
    L1: x = new Object();
    L2: y = new Object();
    if(*) {
      z = x;
    } else {
      z = y;
    }

    Subset-based analysis
    aka Inclusion-based analysis
    aka Andersen's analysis

        pt(x) = {L1}, pt(y) = {L2}, pt(z) = {L1, L2}

    Equality-based analysis
    aka Unification-based analysis
    aka Steensgaard's analysis

        pt(x) = {L1, L2}, pt(y) = {L1, L2}, pt(z) = {L1, L2}

  • Implementation of unification

    Object x, y, z;
    L1: x = new Object();
    L2: y = new Object();
    if(*) {
      z = x;
    } else {
      z = y;
    }

    Step 1: Process allocation edges.
        x points to L1, y points to L2.

    Step 2: Repeatedly unify nodes connected by assignments.
        z = x and z = y merge x, y and z into a single node.
        Also unify nodes pointed-to by same node.
        The merged node points to both L1 and L2, so L1 and L2 are merged.

    Running time: almost linear

    Equality-based analysis
    aka Unification-based analysis
    aka Steensgaard's analysis

  • A C example

    a = &b;
    b = &c;
    a = &d;
    d = &e;

    Subset-based analysis:
        pt(a) = {&b, &d}, pt(b) = {&c}, pt(d) = {&e}

    Equality-based analysis:
        a points to &b and to &d, so b and d are unified.
        The merged node {b, d} points to &c and to &e, so c and e are unified.
        Result: a → {b, d} → {c, e}

    Invariant: each node points to at most one other node.
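The unification steps of the C example can be sketched with a union-find structure in Java. This is a simplified illustration that handles only address-of statements of the form x = &y; the class UnificationSketch and its method names are hypothetical, and a variable name such as b is used to stand for both the variable and the abstract object &b.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class UnificationSketch {
    final Map<String, String> parent = new HashMap<>();   // union-find forest
    final Map<String, String> pointee = new HashMap<>();  // representative -> a node it points to

    // Returns the representative of n's class, with path compression.
    String find(String n) {
        parent.putIfAbsent(n, n);
        String p = parent.get(n);
        if (!p.equals(n)) {
            p = find(p);
            parent.put(n, p);
        }
        return p;
    }

    // Merges the classes of a and b. If both classes point somewhere, their
    // targets are unified as well, so each class keeps at most one pointee.
    void unify(String a, String b) {
        String ra = find(a);
        String rb = find(b);
        if (ra.equals(rb)) {
            return;
        }
        String pa = pointee.remove(ra);
        String pb = pointee.remove(rb);
        parent.put(ra, rb);
        if (pa != null && pb != null) {
            pointee.put(rb, pa);
            unify(pa, pb);
        } else if (pa != null || pb != null) {
            pointee.put(rb, pa != null ? pa : pb);
        }
    }

    // Processes the statement x = &y.
    void addressOf(String x, String y) {
        find(y); // make sure y is registered
        String rx = find(x);
        String old = pointee.get(rx);
        if (old == null) {
            pointee.put(rx, y);
        } else {
            unify(old, y);
        }
    }

    // All nodes in the class that x points to (empty if x points to nothing).
    Set<String> pointsTo(String x) {
        Set<String> result = new TreeSet<>();
        String target = pointee.get(find(x));
        if (target == null) {
            return result;
        }
        String rt = find(target);
        for (String n : new ArrayList<>(parent.keySet())) {
            if (find(n).equals(rt)) {
                result.add(n);
            }
        }
        return result;
    }

    static UnificationSketch example() {
        UnificationSketch u = new UnificationSketch();
        u.addressOf("a", "b"); // a = &b
        u.addressOf("b", "c"); // b = &c
        u.addressOf("a", "d"); // a = &d
        u.addressOf("d", "e"); // d = &e
        return u;
    }

    public static void main(String[] args) {
        UnificationSketch u = example();
        for (String v : new String[] {"a", "b", "d"}) {
            System.out.println("pt(" + v + ") = " + u.pointsTo(v));
        }
    }
}
```

Compared with the subset-based result, b and d now share one points-to set, which is the precision given up in exchange for the almost-linear running time.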

  • A problem with unification and OOP

    class A extends Object {
      public A() {
        super();
      }
    }
    class B extends Object {
      ...
    }
    L1: x = new A();
    L2: y = new B();

    Subset-based analysis:
        pt(x) = {L1}, pt(A().this) = {L1}
        pt(y) = {L2}, pt(B().this) = {L2}
        pt(Object().this) = {L1, L2}

    Equality-based analysis:
        Both constructors pass this to the Object() constructor, so
        A().this and B().this are unified with Object().this, and
        through them x and y are unified as well:
        pt(x) = pt(y) = pt(A().this) = pt(B().this) = pt(Object().this) = {L1, L2}

    Every pointer points to every object!

    ... but context sensitivity will fix this.

  • How big is the call graph?

    public class Hello {
      public static final void main(String[] args) {
        System.out.println("Hello");
      }
    }

    • Number of methods actually executed: 498

    • Number of methods in static call graph: 3204

  • Call graph construction

    To determine ...               ... we need to know
    points-to sets                 pointer assignment edges
    pointer assignment edges       reachable methods, call graph edges
    reachable methods              call graph edges
    call graph edges               points-to sets

    [Figure: points-to sets, pointer assignment edges, reachable methods, and call graph edges drawn as a cycle of mutual dependencies]

  • Ahead of time call graph construction

    1. Assume every pointer can point to any object compatible with its declared type.
    2. Explore call graph using this assumption, listing reachable methods (Class Hierarchy Analysis).
    3. Generate pointer assignment graph using resulting call edges and reachable methods.
    4. Propagate points-to sets along pointer assignment graph.

    [Diagram: fully imprecise points-to sets give call graph edges and reachable methods, which give pointer assignment edges, which give points-to sets.]

    • no iteration
    • very imprecise due to many reachable methods
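Step 2 above can be illustrated with a minimal sketch of Class Hierarchy Analysis. The hierarchy (X, Y, Z), the string encoding of methods, and the class name `ChaSketch` are assumptions made for illustration only; they are not taken from any particular analysis framework.

```java
import java.util.*;

/** Hypothetical sketch of call resolution by Class Hierarchy Analysis (CHA). */
final class ChaSketch {
    // Assumed hierarchy: Y and Z both extend X.
    static final Map<String, String> SUPER = Map.of("Y", "X", "Z", "X");
    // Assumed declarations: X and Y declare foo(), Z inherits it from X.
    static final Map<String, Set<String>> DECLARES =
        Map.of("X", Set.of("foo"), "Y", Set.of("foo"), "Z", Set.of());

    /** True if class c is `ancestor` or one of its subclasses. */
    static boolean isSubtype(String c, String ancestor) {
        for (String k = c; k != null; k = SUPER.get(k)) {
            if (k.equals(ancestor)) return true;
        }
        return false;
    }

    /** Method that runs when `name` is invoked on an object of class c. */
    static String dispatch(String c, String name) {
        for (String k = c; k != null; k = SUPER.get(k)) {
            if (DECLARES.get(k).contains(name)) return k + "." + name;
        }
        throw new NoSuchElementException(name);
    }

    /** CHA: the receiver may be any class compatible with the declared type. */
    static SortedSet<String> targets(String declaredType, String name) {
        SortedSet<String> result = new TreeSet<>();
        for (String c : DECLARES.keySet()) {
            if (isSubtype(c, declaredType)) result.add(dispatch(c, name));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(targets("X", "foo"));
    }
}
```

For a receiver declared as X, the sketch reports both X.foo and Y.foo as possible targets regardless of which objects are ever created, which is the source of the imprecision noted above.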

  • On-the-fly call graph construction

    1. Start with only initial reachable methods, no call edges, no pointer assignment edges, and no points-to sets.
    2. Iteratively generate pointer assignment edges, points-to sets, call edges, and reachable methods implied by current information.
    3. Stop when overall fixed point is reached.

    [Diagram: cycle through points-to sets, pointer assignment edges, reachable methods, and call graph edges.]

    • requires iteration
    • slower
    • more complicated
    • much more precise due to fewer reachable methods
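The three steps above can be sketched as a single fixed-point loop. This is a simplified illustration, not the algorithm of any specific tool: the statement encoding, the class name `OnTheFlySketch`, and the example program are assumptions, abstract objects are identified by their class instead of their allocation site, and parameter passing is omitted.

```java
import java.util.*;

/** Hypothetical sketch: call graph and points-to sets computed together. */
final class OnTheFlySketch {
    // Method bodies as statements: {"new", var, cls}, {"copy", dst, src}, {"call", receiver, name}.
    final Map<String, List<String[]>> program;
    final Set<String> reachable = new TreeSet<>();
    final Set<String> callEdges = new TreeSet<>();
    final Map<String, Set<String>> pt = new TreeMap<>();

    OnTheFlySketch(Map<String, List<String[]>> program) { this.program = program; }

    Set<String> pts(String var) { return pt.computeIfAbsent(var, k -> new TreeSet<>()); }

    void solve(String entry) {
        reachable.add(entry);                        // step 1: only the entry point is reachable
        boolean changed = true;
        while (changed) {                            // step 2: iterate on current information
            changed = false;
            for (String m : new ArrayList<>(reachable)) {
                for (String[] s : program.get(m)) {
                    if (s[0].equals("new")) {
                        changed |= pts(s[1]).add(s[2]);
                    } else if (s[0].equals("copy")) {
                        changed |= pts(s[1]).addAll(pts(s[2]));
                    } else {                         // virtual call resolved from pt(receiver)
                        for (String cls : new ArrayList<>(pts(s[1]))) {
                            String target = cls + "." + s[2];
                            if (!program.containsKey(target)) continue;
                            changed |= callEdges.add(m + " -> " + target);
                            changed |= reachable.add(target);
                        }
                    }
                }
            }
        }                                            // step 3: nothing changed, fixed point reached
    }

    /** Assumed example: main creates a Y and calls foo() on it. */
    static OnTheFlySketch example() {
        Map<String, List<String[]>> p = new TreeMap<>();
        p.put("Main.main", List.of(new String[] {"new", "x", "Y"}, new String[] {"call", "x", "foo"}));
        p.put("X.foo", List.of());
        p.put("Y.foo", List.of());
        p.put("Z.foo", List.of());
        OnTheFlySketch s = new OnTheFlySketch(p);
        s.solve("Main.main");
        return s;
    }
}
```

In the assumed example only Y.foo becomes reachable, because the receiver x is only ever observed to point to a Y; X.foo and Z.foo are never analyzed.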

  • Partly on-the-fly call graph construction

    1. Assume all methods are reachable.
    2. Generate pointer assignment edges for all methods.
    3. Iteratively propagate points-to sets, add call edges, and generate pointer assignment edges for new call edges.

    [Diagram: the same cycle as before, with "reachable methods" fixed to "all methods reachable".]

    • requires iteration
    • speed is in between ahead-of-time and on-the-fly
    • complexity is in between ahead-of-time and on-the-fly
    • still imprecise due to many reachable methods

  • Design decisions for precision/efficiency

    • The abstraction (affects precision and efficiency):
      – Type filtering
      – Field sensitivity
      – Directionality
      – Call graph construction
      – Context sensitivity
      – Flow sensitivity
    • Algorithm and implementation (affects efficiency):
      – Propagation algorithm
      – Set implementation

  • Context sensitivity motivation

    Object id(Object o) {
        return o;
    }

    void f() {
        L1: Object a = new Object();
        L2: Object b = new Object();
            Object c = id(a);
            Object d = id(b);
    }

    [Points-to graph, context-insensitive]
    a -> {L1}
    b -> {L2}
    id.o -> {L1, L2}
    id.ret -> {L1, L2}
    c -> {L1, L2}
    d -> {L1, L2}

  • Call strings approach (aka cloning)

    Object id(Object o) {
        return o;
    }

    void f() {
        L1: Object a = new Object();
        L2: Object b = new Object();
        L3: Object c = id(a);
        L4: Object d = id(b);
    }

    [Points-to graph, with id cloned per call site]
    a -> {L1}
    b -> {L2}
    L3:id.o -> {L1}
    L4:id.o -> {L2}
    L3:id.ret -> {L1}
    L4:id.ret -> {L2}
    c -> {L1}
    d -> {L2}
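The difference between the context-insensitive result and the cloned result for the id() example can be reproduced with a small subset-based propagation. This is a sketch under assumptions: the class name `CallSiteCloningSketch`, the edge encoding, and the `L3:` style of context prefix are chosen for illustration, and only copy edges are modeled.

```java
import java.util.*;

/** Hypothetical sketch: the id() example with and without call-site cloning. */
final class CallSiteCloningSketch {
    /** Propagates points-to sets along copy edges {src, dst} until nothing changes. */
    static Map<String, Set<String>> solve(Map<String, String> allocs, List<String[]> edges) {
        Map<String, Set<String>> pt = new TreeMap<>();
        allocs.forEach((var, site) -> pt.computeIfAbsent(var, k -> new TreeSet<>()).add(site));
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] e : edges) {
                Set<String> src = pt.getOrDefault(e[0], Set.of());
                changed |= pt.computeIfAbsent(e[1], k -> new TreeSet<>()).addAll(src);
            }
        }
        return pt;
    }

    /** Builds edges for c = id(a) at L3 and d = id(b) at L4; sensitive=true clones id per call site. */
    static Map<String, Set<String>> analyze(boolean sensitive) {
        List<String[]> edges = new ArrayList<>();
        String[][] calls = {{"L3", "a", "c"}, {"L4", "b", "d"}};
        for (String[] call : calls) {
            String ctx = sensitive ? call[0] + ":" : "";
            edges.add(new String[] {call[1], ctx + "id.o"});         // argument to parameter
            edges.add(new String[] {ctx + "id.o", ctx + "id.ret"});  // return o;
            edges.add(new String[] {ctx + "id.ret", call[2]});       // return value to result
        }
        return solve(Map.of("a", "L1", "b", "L2"), edges);
    }

    public static void main(String[] args) {
        System.out.println(analyze(false));
        System.out.println(analyze(true));
    }
}
```

Without cloning, both calls share the nodes id.o and id.ret, so c and d each receive both L1 and L2. With one clone per call site the two flows stay separate.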

  • Summary-based approach

    Object id(Object o) {
        return o;
    }

    void f() {
        L1: Object a = new Object();
        L2: Object b = new Object();
            Object c = id(a);
            Object d = id(b);
    }

    [Points-to graph: each call to id() is handled by instantiating the id() summary]
    a -> {L1}
    b -> {L2}
    c -> {L1}
    d -> {L2}

    Challenge: how to design a summary that
    • precisely models all effects of id() (and its transitive callees)
    • is cheap to compute and represent
    • is cheap to instantiate

  • Comparison with 1CFA

    Field-sensitive subset-based 1-call-site-sensitive points-to analysis:

    pt : ((Call × Var) ∪ (Obj × Field)) → ℘(Obj)

    1CFA:

    Σ = Call × (Var → Addr) × (Addr → ℘(Lam × (Var → Addr))) × Call

  • Limitation of single call site context

    Object id(Object o) {
        L5: return id2(o);
    }

    Object id2(Object o) {
        return o;
    }

    void f() {
        L1: Object a = new Object();
        L2: Object b = new Object();
        L3: Object c = id(a);
        L4: Object d = id(b);
    }

    [Points-to graph, 1-call-site-sensitive]
    a -> {L1}
    b -> {L2}
    L3:id.o -> {L1}
    L4:id.o -> {L2}
    L5:id2.o -> {L1, L2}
    L5:id2.ret -> {L1, L2}
    L3:id.ret -> {L1, L2}
    L4:id.ret -> {L1, L2}
    c -> {L1, L2}
    d -> {L1, L2}

  • 2-call-site context sensitivity

    Object id(Object o) {
        L5: return id2(o);
    }

    Object id2(Object o) {
        return o;
    }

    void f() {
        L1: Object a = new Object();
        L2: Object b = new Object();
        L3: Object c = id(a);
        L4: Object d = id(b);
    }

    [Points-to graph, 2-call-site-sensitive]
    a -> {L1}
    b -> {L2}
    L3:id.o -> {L1}
    L4:id.o -> {L2}
    L3.L5:id2.o -> {L1}
    L4.L5:id2.o -> {L2}
    L3.L5:id2.ret -> {L1}
    L4.L5:id2.ret -> {L2}
    L3:id.ret -> {L1}
    L4:id.ret -> {L2}
    c -> {L1}
    d -> {L2}

  • k-call-site context sensitivity

    • k-Call-Site: keep last k call sites
      – space/time complexity exponential in k
    • Full Call String: keep full string of call sites
      – exponential number of call strings
      – recursion: infinite number of call strings
        • exclude any call site that is in a recursive cycle from string
        • still exponential
        • what if many call sites are in a recursive cycle?
          • C: call graph is almost a DAG => works well
          • Java: half of call graph is one big SCC => no precision
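Keeping only the last k call sites amounts to truncating the call string each time a call is entered. The helper below is a minimal sketch of that truncation; the name `KLimitSketch.push` and the list-of-labels representation are assumptions for illustration.

```java
import java.util.*;

/** Hypothetical sketch of k-limited call strings. */
final class KLimitSketch {
    /** Context of the callee when `callSite` is invoked from context `ctx`, keeping the last k sites. */
    static List<String> push(List<String> ctx, String callSite, int k) {
        List<String> extended = new ArrayList<>(ctx);
        extended.add(callSite);
        int from = Math.max(0, extended.size() - k);
        return new ArrayList<>(extended.subList(from, extended.size()));
    }

    public static void main(String[] args) {
        // id2 is called at L5 from id, which was itself called at L3 or L4.
        System.out.println(push(List.of("L3"), "L5", 1));
        System.out.println(push(List.of("L4"), "L5", 1));
        System.out.println(push(List.of("L3"), "L5", 2));
        System.out.println(push(List.of("L4"), "L5", 2));
    }
}
```

With k = 1 both chains collapse to the single context [L5], which is why id2 merges L1 and L2 in the earlier example. With k = 2 the contexts [L3, L5] and [L4, L5] remain distinct.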

  • Context-sensitive heap abstraction

    Object alloc() {
        L1: return new Object();
    }

    void f() {
        Object a = alloc();
        Object b = alloc();
    }

    [Points-to graph, context-insensitive]
    alloc.ret -> {L1}
    a -> {L1}
    b -> {L1}

  • Context-sensitive heap abstraction

    Object alloc() {
        L1: return new Object();
    }

    void f() {
        L2: Object a = alloc();
        L3: Object b = alloc();
    }

    [Points-to graph, call-site-sensitive variables, context-insensitive heap]
    L2:alloc.ret -> {L1}
    L3:alloc.ret -> {L1}
    a -> {L1}
    b -> {L1}

  • Context-sensitive heap abstraction

    Object alloc() {
        L1: return new Object();
    }

    void f() {
        L2: Object a = alloc();
        L3: Object b = alloc();
    }

    [Points-to graph, call-site-sensitive variables and heap]
    L2:alloc.ret -> {L2:L1}
    L3:alloc.ret -> {L3:L1}
    a -> {L2:L1}
    b -> {L3:L1}
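The effect of qualifying abstract objects with a context can be sketched for the alloc() example. The class name `HeapContextSketch`, the `mayAlias` helper, and the `L2:L1` naming of objects follow the slide notation but are otherwise assumptions; the propagation is hard-wired to this one example.

```java
import java.util.*;

/** Hypothetical sketch: naming abstract objects with or without a calling context. */
final class HeapContextSketch {
    /** Analyzes a = alloc() at L2 and b = alloc() at L3, where alloc() allocates at L1. */
    static Map<String, Set<String>> analyze(boolean heapContext) {
        Map<String, Set<String>> pt = new TreeMap<>();
        String[][] calls = {{"L2", "a"}, {"L3", "b"}};   // call site, result variable
        for (String[] call : calls) {
            String ctx = call[0];
            // The abstract object is either the bare site or the site qualified by the context.
            String object = heapContext ? ctx + ":L1" : "L1";
            String ret = ctx + ":alloc.ret";
            pt.computeIfAbsent(ret, k -> new TreeSet<>()).add(object);
            pt.computeIfAbsent(call[1], k -> new TreeSet<>()).addAll(pt.get(ret));
        }
        return pt;
    }

    /** Two variables may alias if their points-to sets share an abstract object. */
    static boolean mayAlias(Map<String, Set<String>> pt, String v1, String v2) {
        return !Collections.disjoint(pt.get(v1), pt.get(v2));
    }

    public static void main(String[] args) {
        System.out.println(mayAlias(analyze(false), "a", "b"));
        System.out.println(mayAlias(analyze(true), "a", "b"));
    }
}
```

With a context-insensitive heap, a and b share the single abstract object L1 and are reported as possible aliases. With the context-sensitive heap they point to L2:L1 and L3:L1 and are kept apart.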

  • Comparison with 1CFA

    Field-sensitive subset-based 1-call-site-sensitive points-to analysis with context-sensitive heap abstraction:

    pt : ((Call × Var) ∪ (Call × Obj × Field)) → ℘(Call × Obj)

    1CFA:

    Σ = Call × (Var → Addr) × (Addr → ℘(Lam × (Var → Addr))) × Call

  • Object-sensitive analysis

    class Container {
        private Item item;
        public void set(Item i) {
            this.item = i;
        }
    }

    L1: Container c1 = new Container();
    L2: Item i1 = new Item();
    L3: c1.set(i1);
    L4: Container c2 = new Container();
    L5: Item i2 = new Item();
    L6: c2.set(i2);

    [Points-to graph, context-insensitive]
    c1 -> {L1}
    i1 -> {L2}
    c2 -> {L4}
    i2 -> {L5}
    set.this -> {L1, L4}
    set.i -> {L2, L5}
    L1.item -> {L2, L5}
    L4.item -> {L2, L5}

    [Milanova&Ryder]

  • Object-sensitive analysis

    class Container {
        private Item item;
        public void set(Item i) {
            this.item = i;
        }
    }

    L1: Container c1 = new Container();
    L2: Item i1 = new Item();
    L3: c1.set(i1);
    L4: Container c2 = new Container();
    L5: Item i2 = new Item();
    L6: c2.set(i2);

    [Points-to graph, object-sensitive: set() is analyzed once per receiver object]
    c1 -> {L1}
    i1 -> {L2}
    c2 -> {L4}
    i2 -> {L5}
    L1:set.this -> {L1}
    L1:set.i -> {L2}
    L1.item -> {L2}
    L4:set.this -> {L4}
    L4:set.i -> {L5}
    L4.item -> {L5}

    [Milanova&Ryder]
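The Container example can be replayed in a few lines to show what changes when the receiver object is used as the context. This is an illustrative sketch only: the class name `ObjectSensitiveSketch` and the hard-wired two-pass structure are assumptions and do not reflect the structure of the cited analysis.

```java
import java.util.*;

/** Hypothetical sketch: Container.set() analyzed with or without receiver-object contexts. */
final class ObjectSensitiveSketch {
    /** Returns the points-to sets of the item fields after c1.set(i1) and c2.set(i2). */
    static Map<String, Set<String>> analyze(boolean objectSensitive) {
        String[][] calls = {{"L1", "L2"}, {"L4", "L5"}};   // receiver object, argument object
        // Bind `this` and `i` of set(), keyed by context ("" means one shared context).
        Map<String, Set<String>> thisPt = new TreeMap<>();
        Map<String, Set<String>> argPt = new TreeMap<>();
        for (String[] call : calls) {
            String ctx = objectSensitive ? call[0] : "";
            thisPt.computeIfAbsent(ctx, k -> new TreeSet<>()).add(call[0]);
            argPt.computeIfAbsent(ctx, k -> new TreeSet<>()).add(call[1]);
        }
        // this.item = i, evaluated separately in each context.
        Map<String, Set<String>> fieldPt = new TreeMap<>();
        for (String ctx : thisPt.keySet()) {
            for (String receiver : thisPt.get(ctx)) {
                fieldPt.computeIfAbsent(receiver + ".item", k -> new TreeSet<>())
                       .addAll(argPt.get(ctx));
            }
        }
        return fieldPt;
    }

    public static void main(String[] args) {
        System.out.println(analyze(false));
        System.out.println(analyze(true));
    }
}
```

In the shared context, `this` may be either container and `i` either item, so both fields receive both items. With one context per receiver object, L1.item holds only L2 and L4.item holds only L5.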

  • Object-sensitive analysis

    • Like call-site context-sensitive analysis:
      – can use strings of abstract objects
      – can make heap abstraction (object-)context-sensitive
    • Call-site CS and object-sensitive CS have incomparable precision (neither is theoretically more precise)
    • In practice, for OO programs, object sensitivity is more precise than call-site sensitivity for the same context string length

    [Milanova&Ryder]

  • Effect of context-sensitivity in Java

    • For call graph construction:
      – context sensitivity has some effect
    • For cast safety analysis:
      – context sensitivity substantially improves precision
      – object sensitivity is more precise than call-site sensitivity
      – context-sensitive heap abstraction further improves precision
      – context strings longer than 1 add little precision
      – ∞-call-site ignoring cycles is less precise than 1-call-site

  • Context-sensitive equality-based analysis

    f(a, b, c) {
        d = a;
        d = b;
        e = c;
        e = a.f;
        return e;
    }

    L1: x = new A();
    L2: y = new A();
    L3: z = new A();
    L4: w = new A();
        x.f = w;
        v = f(x, y, z);

    [Diagram, built up over successive slides:
     1. Graph for f over the nodes a, b, c, d, e, a.f, ret.
     2. Caller graph with objects L1, L2, L3, L4: x -> L1, y -> L2, z -> L3, w -> L4, and x.f shown as L1.f.
     3. For the call v = f(x, y, z), a copy of the nodes a, b, c, a.f, ret is added to the caller graph.
     4. After the copy is merged with the caller's nodes, x and y are grouped together (with L1, L2 and the field nodes L1.f, L2.f), and z and w are grouped together (with L3, L4).]

    [Lattner&Adve]
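The merging of nodes inside f can be sketched with a union-find structure. This is a deliberate simplification and an assumption of this sketch, not the cited algorithm: each name (including a.f and ret) is treated as one node, and every assignment simply merges its two sides, whereas a full unification-based analysis merges the targets that the two sides point to.

```java
import java.util.*;

/** Hypothetical sketch: equality-based merging of names with union-find. */
final class UnifySketch {
    private final Map<String, String> parent = new HashMap<>();

    /** Representative of the class containing x, with path compression. */
    String find(String x) {
        parent.putIfAbsent(x, x);
        String p = parent.get(x);
        if (p.equals(x)) return x;
        String root = find(p);
        parent.put(x, root);
        return root;
    }

    /** An assignment between a and b puts both names into one class. */
    void union(String a, String b) { parent.put(find(a), find(b)); }

    boolean same(String a, String b) { return find(a).equals(find(b)); }

    /** Merges implied by the body of f: d = a; d = b; e = c; e = a.f; return e; */
    static UnifySketch summaryOfF() {
        UnifySketch u = new UnifySketch();
        u.union("d", "a");
        u.union("d", "b");
        u.union("e", "c");
        u.union("e", "a.f");
        u.union("ret", "e");
        return u;
    }

    public static void main(String[] args) {
        UnifySketch u = summaryOfF();
        System.out.println(u.same("a", "b"));
        System.out.println(u.same("a", "c"));
        System.out.println(u.same("ret", "a.f"));
    }
}
```

Under this simplification f yields two classes, {a, b, d} and {c, e, a.f, ret}. Matching them against the actual arguments of v = f(x, y, z) is what groups x with y, and z with the target of x.f, in the caller.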

  • Refinement demand-driven analysis

    Object id(Object o) {
        return o;
    }

    void f() {
        L1: Object a = new Object();
        L2: Object b = new Object();
        L3: Object c = id(a);
        L4: Object d = id(b);
    }

    [Graph: L1 -> a, L2 -> b; a -> id.o labeled (3, b -> id.o labeled (4; id.o -> id.ret; id.ret -> c labeled )3, id.ret -> d labeled )4]

    L1 ∈ pt(d) ⇔ ∃ balanced-parens path from L1 to d

    But there are lots of paths to search.

    Add shortcut edges to graph.

    No path even with shortcuts ⇒ no path at all.

    On balanced paths found, gradually remove shortcuts.

    [Sridharan&Bodík]
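The balanced-parentheses condition can be checked with a depth-first search that carries a stack of open call sites. The sketch below covers only that path condition for the id() example; the shortcut-edge refinement is not modeled. The class name `BalancedPathSketch` and the edge encoding are assumptions, and the search is written for this small acyclic graph (recursive programs would need a bound on the stack).

```java
import java.util.*;

/** Hypothetical sketch: points-to queries as balanced-parentheses reachability. */
final class BalancedPathSketch {
    // Each edge is {from, to, label}; label is "" (plain), "(n" (enter call n) or ")n" (leave call n).
    static final String[][] EDGES = {
        {"L1", "a", ""}, {"L2", "b", ""},
        {"a", "id.o", "(3"}, {"b", "id.o", "(4"},
        {"id.o", "id.ret", ""},
        {"id.ret", "c", ")3"}, {"id.ret", "d", ")4"},
    };

    static boolean reaches(String node, String target, Deque<String> stack, Set<String> seen) {
        if (node.equals(target) && stack.isEmpty()) return true;
        if (!seen.add(node + stack)) return false;       // this (node, stack) state was already explored
        for (String[] e : EDGES) {
            if (!e[0].equals(node)) continue;
            String label = e[2];
            if (label.startsWith("(")) {
                stack.push(label.substring(1));
                boolean found = reaches(e[1], target, stack, seen);
                stack.pop();
                if (found) return true;
            } else if (label.startsWith(")")) {
                // A closing label must match the most recently opened call site.
                if (!label.substring(1).equals(stack.peek())) continue;
                String top = stack.pop();
                boolean found = reaches(e[1], target, stack, seen);
                stack.push(top);
                if (found) return true;
            } else if (reaches(e[1], target, stack, seen)) {
                return true;
            }
        }
        return false;
    }

    /** True if there is a balanced path from the allocation site to the variable. */
    static boolean pointsTo(String var, String site) {
        return reaches(site, var, new ArrayDeque<>(), new HashSet<>());
    }

    public static void main(String[] args) {
        System.out.println(pointsTo("c", "L1"));
        System.out.println(pointsTo("d", "L1"));
    }
}
```

The path from L1 to d enters id() through (3 and would have to leave through )4, so it is rejected; the path from L1 to c uses the matching pair (3 and )3 and is accepted.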

  • Design decisions for precision/efficiency

    • The abstraction (affects precision and efficiency):
      – Type filtering
      – Field sensitivity
      – Directionality
      – Call graph construction
      – Context sensitivity
      – Flow sensitivity
    • Algorithm and implementation (affects efficiency):
      – Propagation algorithm
      – Set implementation

  • Flow sensitivity

    L1: a = new Object();
    L2: b = new Object();
    L3: c = new Object();
    L4: a = b;
    L5: b = a;
    L6: c = b;

    [Points-to graph, flow-insensitive]
    a -> {L1, L2}
    b -> {L1, L2}
    c -> {L1, L2, L3}

  • Flow sensitivity

    Statement                  Points-to sets after the statement
    L1: a = new Object();      a -> {L1}
    L2: b = new Object();      a -> {L1}, b -> {L2}
    L3: c = new Object();      a -> {L1}, b -> {L2}, c -> {L3}
    L4: a = b;                 a -> {L2}, b -> {L2}, c -> {L3}
    L5: b = a;                 a -> {L2}, b -> {L2}, c -> {L3}
    L6: c = b;                 a -> {L2}, b -> {L2}, c -> {L2}

    Strong updates: overwrite existing pt-set contents
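The two results for the six-statement program can be reproduced side by side. This is a minimal sketch for straight-line code only; the class name `FlowSketch` and the `new@L1` encoding of allocations are assumptions made for illustration.

```java
import java.util.*;

/** Hypothetical sketch: flow-sensitive strong updates versus flow-insensitive accumulation. */
final class FlowSketch {
    // Straight-line program as {dst, src}; a src of the form "new@Ln" allocates at label Ln.
    static final String[][] PROGRAM = {
        {"a", "new@L1"}, {"b", "new@L2"}, {"c", "new@L3"},
        {"a", "b"}, {"b", "a"}, {"c", "b"},
    };

    /** Objects the right-hand side may evaluate to under the current points-to sets. */
    static Set<String> value(Map<String, Set<String>> pt, String src) {
        if (src.startsWith("new@")) return Set.of(src.substring(4));
        return pt.getOrDefault(src, Set.of());
    }

    /** Processes statements in order; each assignment overwrites the old set (strong update). */
    static Map<String, Set<String>> flowSensitive() {
        Map<String, Set<String>> pt = new TreeMap<>();
        for (String[] s : PROGRAM) {
            pt.put(s[0], new TreeSet<>(value(pt, s[1])));
        }
        return pt;
    }

    /** Ignores statement order; sets only grow until a fixed point is reached. */
    static Map<String, Set<String>> flowInsensitive() {
        Map<String, Set<String>> pt = new TreeMap<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] s : PROGRAM) {
                changed |= pt.computeIfAbsent(s[0], k -> new TreeSet<>()).addAll(value(pt, s[1]));
            }
        }
        return pt;
    }

    public static void main(String[] args) {
        System.out.println(flowInsensitive());
        System.out.println(flowSensitive());
    }
}
```

The flow-sensitive result is the state after L6, where all three variables point only to L2. The flow-insensitive result keeps every object that was ever assigned.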

  • Flow sensitivity using SSA form

    Statement                  Points-to sets after the statement       a b c
    L1: a = new Object();      a -> {L1}                                1
    L2: b = new Object();      a -> {L1}, b -> {L2}                     1 2
    L3: c = new Object();      a -> {L1}, b -> {L2}, c -> {L3}          1 2 3
    L4: a = b;                 a -> {L2}, b -> {L2}, c -> {L3}          4 2 3
    L5: b = a;                 a -> {L2}, b -> {L2}, c -> {L3}          4 5 3
    L6: c = b;                 a -> {L2}, b -> {L2}, c -> {L2}          4 5 6

    The columns a b c give the currently live reaching definition of each variable.

  • Flow sensitivity using SSA form

    Statement                  Points-to sets after the statement       a b c
    L1: a1 = new Object();     a1 -> {L1}                               1
    L2: b2 = new Object();     a1 -> {L1}, b2 -> {L2}                   1 2
    L3: c3 = new Object();     a1 -> {L1}, b2 -> {L2}, c3 -> {L3}       1 2 3
    L4: a4 = b2;               a4 -> {L2}, b2 -> {L2}, c3 -> {L3}       4 2 3
    L5: b5 = a4;               a4 -> {L2}, b5 -> {L2}, c3 -> {L3}       4 5 3
    L6: c6 = b5;               a4 -> {L2}, b5 -> {L2}, c6 -> {L2}       4 5 6

    [Points-to graph, flow-insensitive on SSA form]
    a1 -> {L1}, b2 -> {L2}, c3 -> {L3}
    a4 -> {L2}, b5 -> {L2}, c6 -> {L2}

    [Points-to graph, flow-insensitive on original program]
    a -> {L1, L2}, b -> {L1, L2}, c -> {L1, L2, L3}

    For local variables, flow-insensitive (FI) analysis on SSA form gives the same result as flow-sensitive (FS) analysis on the original program.

  • Flow sensitivity using SSA form

    Statement                  Current definition of a
    L1: if(*) {
    L2:     a = b;             2
    L3: } else {
    L4:     a = c;             4
    L5: }                      ?
    L6: a = φ(a2,a4)           6
    L7: d = a;                 6

    Which definition of a is current at L7?

    Use φ to create new definition of a.

    pt(a6) = pt(a2) ∪ pt(a4)

  • When does flow sensitivity matter?

                                     Java                 C/C++
    local variables                  no, use SSA form (both)
    address-taken local variables    no, don't exist      possibly
    global variables                 unlikely, values usually long-lived (both)
    fields of heap objects           no strong updates on heap objects unless analysis extended with single-concrete-object abstraction (both)

  • Design decisions for precision/efficiency

    • The abstraction (affects precision and efficiency):
      – Type filtering
      – Field sensitivity
      – Directionality
      – Call graph construction
      – Context sensitivity
      – Flow sensitivity
    • Algorithm and implementation (affects efficiency):
      – Propagation algorithm
      – Set implementation