data representation synthesis pldi’2011, esop’12, pldi’12

Post on 24-Feb-2016

28 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

Data Representation Synthesis PLDI’2011, ESOP’12, PLDI’12. Peter Hawkins, Stanford University Alex Aiken, Stanford University Kathleen Fisher, Tufts Martin Rinard, MIT Mooly Sagiv, TAU. http://theory.stanford.edu/~hawkinsp/. Research Interests. - PowerPoint PPT Presentation

TRANSCRIPT

Data Representation SynthesisPLDI’2011, ESOP’12, PLDI’12

Peter Hawkins, Stanford UniversityAlex Aiken, Stanford University

Kathleen Fisher, TuftsMartin Rinard, MITMooly Sagiv, TAU

http://theory.stanford.edu/~hawkinsp/

Research Interests

• Verifying properties of low level data structure manipulations

• Synthesizing low level data structure manipulations– Composing ADT operations

• Concurrent programming via smart libraries

Motivation

• Sophisticated data structures exist• Increase abstraction level• Simplifies programming– Concurrency

• Integrated into modern PL• But hard to use– Composing several operations

next next next next next

table[0][1][2][3][4][5]

null

index= 2 index = 0 index = 3 index = 1 index= 52 0 3 1 1

thttpd: Web ServerConn

Map

Representation Invariants:1. n: Map. v: Z. table[v] =n n->index=v

2. n: Map. n->rc =|{n' : Conn . n’->file_data = n}|

static void add_map(Map *m){ int i = hash(m); … table[i] = m ; … m->index= i ; … m->rc++;

}

thttptd:mmc.c

Representation Invariants:1. n: Map. v:Z. table[v] =n index[n]=v

2. n:Map. rc[n] =|{n' : Conn . file_data[n’] = n}|

broken

restored

attr = new HashMap();…

Attribute removeAttribute(String name){ Attribute val = null; synchronized(attr) { found = attr.containsKey(name) ; if (found) { val = attr.get(name); attr.remove(name); } } return val;}

TOMCAT Motivating ExampleTOMCAT 5.*

Invariant: removeAttribute(name) returns the removed value or null if the name does not exist

attr = new ConcurrentHashMap();…

Attribute removeAttribute(String name){ Attribute val = null; /* synchronized(attr) { */ found = attr.containsKey(name) ; if (found) { val = attr.get(name); attr.remove(name); } /* } */ return val;}

TOMCAT Motivating ExampleTOMCAT 6.*

attr.put(“A”, o);

attr.remove(“A”);

o

Invariant: removeAttribute(name) returns the removed value or null if the name does not exist

Multiple ADTs

Invariant: Every element that added to eden is either in eden or in longterm

public void put(K k, V v) { if (this.eden.size() >= size) { this.longterm.putAll(this.eden); this.eden.clear(); } this.eden.put(k, v);}

filesystem=1s_list

s_files

filesystem=2s_list

s_files

filesystems

file=14f_list

f_fs_list

file=6f_list

f_fs_list

file=2f_list

f_fs_list

file=7f_list

f_fs_list

file=5f_list

f_fs_list

file_in_use

file_unused

Filesystem Example

Filesystem Example

Access Patterns

• Find all mounted filesystems

• Find cached files on each filesystem

• Iterate over all used or unused cached files in Least-Recently-Used order

filesystem=1s_list

s_files

filesystem=2s_list

s_files

filesystems

file=14f_list

f_fs_list

file=6f_list

f_fs_list

file=2f_list

f_fs_list

file=7f_list

f_fs_list

file=5f_list

f_fs_list

file_in_use

file_unused

11

Desired Properties of Data Representations

• Correctness

• Efficiency for access patterns

• Simple to reason about

• Easy to evolve

Specifying Combination of Shared Data Structures

• Separation Logic• First order logic+ reachability• Monadic 2nd order logic on trees• Graph grammars• …

fs file inuse

1 14 F

2 7 T

2 5 F

1 6 T

1 2 F

filesystems filesystem=1s_list

s_files

filesystem=2s_list

s_files

file=14f_list

f_fs_list

file=6f_list

f_fs_list

file=2f_list

f_fs_list

file=7f_list

f_fs_list

file=5f_list

f_fs_list

file_in_use

file_unused

{fs, file} {inuse}

fs file inuse

1 17 F

1 34 T

1 42 F

2 5 T

2 5 F

Filesystem Example

Filesystem ExampleContainers: View 1

fs:1 fs:2

file:14file:6

file:2 file:7 file:5

filesystem=1s_list

s_files

filesystem=2s_list

s_files

filesystems

file=14f_list

f_fs_list

file=6f_list

f_fs_list

file=2f_list

f_fs_list

file=7f_list

f_fs_list

file=5f_list

f_fs_list

file_in_use

file_unused

Filesystem ExampleContainers: View 2

filesystem=1s_list

s_files

filesystem=2s_list

s_files

filesystems

file=14f_list

f_fs_list

file=6f_list

f_fs_list

file=2f_list

f_fs_list

file=7f_list

f_fs_list

file=5f_list

f_fs_list

file_in_use

file_unused

fs: 2, file:7fs: 1,

file:14

fs: 2, file:5

inuse:T inuse:F

fs: 1, file:2

Filesystem ExampleContainers: Both Views

fs:1 fs:2

file:14file:6

file:2 file:7 file:5

filesystem=1s_list

s_files

filesystem=2s_list

s_files

filesystems

file=14f_list

f_fs_list

file=6f_list

f_fs_list

file=2f_list

f_fs_list

file=7f_list

f_fs_list

file=5f_list

f_fs_list

file_in_use

file_unused

fs: 2, file:7fs: 1,

file14

fs: 2, file:5

inuse:T inuse:F

fs file inuse

{fs, file} { inuse}

Decomposing Relations

• High-level shape descriptor• A rooted directed acyclic graph• The root represents the whole

relation• The sub-graphs rooted at each

node represent sub-relations• Edges are labeled with columns• Each edge maps a relation into a

sub-relation• Different types of edges for

different primitive data-structures

• Shared nodes correspond to shared sub-relations

fs file inuse

{fs, file} { inuse} inuse

fs, file

fs inuse

file

list

list

array

list

Filesystem Examplefs file inuse1 14 F2 7 T2 5 F1 6 T1 2 F

inuse

fs, file

fs inuse

filefile inuse14 F

6 T

2 F

file inuse7 T

5 F

inuseF

inuseT

inuseF

inuseT

inuseF

fs:1 fs:2

file:14 file:6 file:2 file:7 file:5

Memory Decomposition(Left)

inuse

fs, file

fs inuse

file

inuse:F inuse:T inuse:F Inuse:T Inuse:F

fs:1 fs:2

file:14 file:6 file:2 file:7 file:5

Filesystem Examplefs file inuse1 14 F2 7 T2 5 F1 6 T1 2 F

fs file2 71 6

fs file1 142 51 2

fs:2file:7

inuseT

inuseT

inuseF

inuseF

inuseF

fs:1file:6

fs:1file:14

fs:2file:5

fs:1file:2

inuse:T inuse:F

inuse

fs, file

fs inuse

file

Memory Decomposition(Right)

fs:2file:7

fs:1file:6

fs:1file:14

fs:2file:5

fs:1file:2

inuse:T inuse:F

inuse:F inuse:T inuse:F Inuse:T Inuse:F

inuse

fs, file

fs inuse

file

Decomposition Instance

fs:2file:7

fs:1file:6

fs:1file:14

fs:2file:5

fs:1file:2

inuse:T inuse:F

inuse:F inuse:T inuse:F Inuse:T Inuse:F

fs file inuse1 14 F

2 7 T

2 5 F

1 6 T

1 2 F

fs:1

file:14 file:6

file:2

file:7file:5

fs:2

fs file inuse

{fs, file} { inuse} inuse

fs, file

fs inuse

file

Shape Descriptor

fs:2file:7

fs:1file:6

fs:1file:14

fs:2file:5

fs:1file:2

inuse:T inuse:F

inuse:F inuse:T inuse:F Inuse:T Inuse:F

fs file inuse1 14 F

2 7 T

2 5 F

1 6 T

1 2 F

fs:1

file:14 file:6

file:2

file:7file:5

fs:2

fs file inuse

{fs, file} { inuse} inuse

fs, file

fs inuse

file

Memory State

fs:2file:7

fs:1file:6

fs:1file:14

fs:2file:5

fs:1file:2

inuse:T inuse:F

inuse:F inuse:T inuse:F Inuse:T Inuse:F

fs file inuse1 14 F

2 7 T

2 5 F

1 6 T

1 2 F

fs:1

file:14 file:6

file:2

file:7file:5

fs file inuse

{fs, file} { inuse} inuse

fs, file

fs inuse

filefs:2

list

list

array

list

s_list

f_fs_list f_fs_list f_fs_list

f_list f_list

f_list

Memory State

fs file inuse1 14 F2 7 T2 5 F1 6 T1 2 F

fs file inuse

{fs, file} { inuse} inuse

fs, file

fs inuse

file

list

list

array

list

filesystems

filesystem=1s_list

s_files

filesystem=2s_list

s_files

file=14f_list

f_fs_list

file=6f_list

f_fs_list

file=2f_list

f_fs_list

file=7f_list

f_fs_list

file=5f_list

f_fs_listfile_in_use

file_unused

Adequacy

A decomposition is adequate if it can represent every possible relation matching a relational specification

Adequacy

enforces sufficient conditions for adequacy

Not every decomposition is a good representation of a relation

Adequacy of Decompositions

• All columns are represented• Nodes are consistent with functional

dependencies– Columns bound to paths leading to a common

node must functionally determine each other

Respect Functional Dependencies

file

inuse

{file} {inuse}

Respect Functional Dependencies

file,fs

inuse

{file, fs} {inuse}

Adequacy and Sharing

fs, file

fs inuse

file

inuseColumns bound on a path to an object x must functionallydetermine columns bound on any other path to x

{fs, file}{inuse, fs, file}

Adequacy and Sharing

fs

fs inuse

file

inuseColumns bound on a path to an object x must functionallydetermine columns bound on any other path to x

{fs, file} {inuse, fs}

Decompositions and Aliasing• A decomposition is an

upper bound on the set of possible aliases

• Example: there are exactly two paths to any instance of node w

The RelC Compiler[PLDI’11]

RelationalSpecification

Programmer

Decomposition

Data structure designer

RelC C++

The RelC Compilerfs fileinuse{fs, file} {inuse}

foreach <fs, file, inuse> filesystems s.t. fs= 5 do …

RelC C++

inuse

fs, file

fs

list inuse

filelist

array

list

Relational Specification

Graph decomposition

Query Plans

foreach <fs, file, inuse> filesystems if inuse=T do …

fs, file

fs

list inuse

file

inuse

list

array

list

scan

scanCost proportional to the number of files

Query Plans

foreach <fs, file, inuse> filesystems if inuse=T do …

fs, file

fs

list inuse

file

inuse

list

array

list

access

scan

Cost proportional to the number of files in use

Removal and graph cutsremove <fs:1>

fs

list

file

inuse

list

array

list

fs file inuse1 14 F

2 7 T

2 5 F

1 6 T

1 2 F

filesystems fs:1s_list

s_files

fs:2s_list

s_files

file:14f_list

f_fs_list

file:6f_list

f_fs_list

f_listf_fs_list

file:7f_list

f_fs_list

file:5f_list

f_fs_list

inuse:T

inuse:F

file:2

Removal and graph cutsremove <fs:1>

fs file inuse1 14 F

2 7 T

2 5 F

1 6 T

1 2 F

filesystems fs:2s_list

s_files

file:7f_list

f_fs_list

file:5f_list

f_fs_list

inuse:T

inuse:F

fs

list

file

inuse

list

array

list

Abstraction Theorem

• If the programmer obeys the relational specification and the decomposition is adequate and if the individual containers are correct

• Then the generated low-level code maintains the relational abstraction

relation relationremove <fs:1>

memorystate

memorystate

low level code remove <fs:1>

40

Filesystem ExampleConcurrent

^

•Lock-based concurrent code is difficult to write and difficult to reason about•Hard to choose the

correct granularity of locking•Hard to use concurrent

containers correctly

41

Granularity of Locking

Fine-grained locking

Coarse-grained locking

More complex

Easier to reason aboutEasier to change

More scalable

Lower overheadLess complex

42

Concurrent Containers• Concurrent containers are hard to write• Excellent concurrent containers exist (e.g.,

Java’s ConcurrentHashMap and ConcurrentSkipListMap)

• Composing concurrent containers is tricky– It is not usually safe to update containers in paralel– Lock association is tricky

• In practice programmers often* use concurrent containers incorrectly

* 44% of the time [Shacham et al., 2011]

43

Concurrent Data Representation Synthesis[PLDI’12]

RelScala

Compiler

Concurrent Data Structures,Atomic Transactions

Relational Specification

Concurrent Decomposition

44

Concurrent Decompositions

• Concurrent decompositions describe how to represent a relation using concurrent containers and locks

• Concurrent decompositions specify: – the choice of containers– the number and placement of locks– which locks protect which data

= ConcurrentHashMap

= TreeMap

45

Concurrent Interface

Interface consisting of atomic operations:

46

Goal: Serializability

Thread 1

Thread 2

Time

Every schedule of operations is equivalent to some serial schedule

47

Goal: Serializability

Time

Thread 1

Thread 2

Every schedule of operations is equivalent to some serial schedule

48

Two-Phase Locking

Two phase locking protocol:• Well-locked: To perform a read or write, a thread must

hold the corresponding lock• Two-phase: All lock acquisitions must precede all lock

releases

Attach a lock to each piece of data

Theorem [Eswaran et al., 1976]: Well-locked, two-phase transactions are serializable

49

Two Phase Locking

Attach a lock to every edge

Problem 2: Too many locks

Decomposition Decomposition Instance

We’re done!

Problem 1: Can’t attach locks to container entries

Lock Placements

1. Attach locks to nodes

Decomposition Decomposition Instance

51

Coarse-Grained LockingDecomposition Decomposition Instance

52

Finer-Grained LockingDecomposition Decomposition Instance

53

Lock StripingDecomposition Decomposition Instance

54

Lock Placements: DominationDecomposition Decomposition Instance

Locks must dominate the edges they protect

55

Lock Placements: Path-ClosureAll edges on a path between an edge and its lock must share the same lock

56

Decompositions and Aliasing• A decomposition is an upper

bound on the set of possible aliases

• Example: there are exactly two paths to any instance of node w

• We can reason about the relationship between locks and heap cells soundly

57

Lock Ordering

Prevent deadlock via a topological order on locks

58

Queries and Deadlock

2. lookup(tv)

1. acquire(t)

3. acquire(v)

4. scan(vw)

Query plans must acquire the correct locks in the correct order

Example: find files on a particular filesystem

Autotuner

• Given a fixed set of primitive types– list, circular list, doubly-linked list, array, map, …

• A workload• Exhaustively enumerate all the adequate

decompositions up to certain size• The compiler can automatically pick the best

representation for the workload

Directed Graph Example (DFS)• Columns

src dst weight• Functional Dependencies

– {src, dst} {weight}

• Primitive data types– map, list

src

dst

weight

map

list

src

dst

dst

src

weight

map

list list

mapdst

src

weightm

aplist

dst

weight

map srcmap

weight

dst

listsrclist

61

Concurrent SynthesisFind optimal combination of

Decomposition Shape ContainerData Structures

ConcurrentHashMap

ConcurrentSkipListMap

CopyOnWriteArrayList

Array

HashMap

TreeMap

LinkedList

Lock Implementations

ReentrantReadWriteLock

ReentrantLock

Lock Striping Factors

Lock Placement

62

Concurrent Graph Benchmark

• Start with an empty graph• Each thread performs 5 x 105 random

operations• Distribution of operations a-b-c-d (a% find

successors, b% find predecessors, c% insert edge, d% remove edge)

• Plot throughput with varying number of threads

Based on Herlihy’s benchmark of concurrent maps

Black = handwritten,isomorphic to blue

Key

= ConcurrentHashMap= HashMap

...

63

Results: 35-35-20-1035% find successor, 35% find predecessor, 20% insert edge, 10% remove edge

(Some) Related Projects

• Relational synthesis: [Cohen & Campbell 1993], [Batory & Thomas 1996], [Smaragdakis & Batory 1997], [Batory et al. 2000] [Manevich, 2012] …

• Two-phase locking and Predicate Locking [Eswaran et al., 1976], Tree and DAG locking protocols [Attiya et al., 2010], Domination Locking [Golan-Gueta et al., 2011]

• Lock Inference for Atomic Sections: [McCloskey et al.,2006], [Hicks, 2006], [Emmi, 2007]

Summary

• Decompositions describe how to represent a relation using concurrent containers

• Lock placements capture different locking strategies

• Synthesis explores the combined space of decompositions and lock placements to find the best possible concurrent data structure implementations

The Big Idea

• Simpler code• Correctness by construction• Find better combinations of data structures

and locking

Benefits

Relational Specification

Low Level Concurrent Code

Transactional Objects with Foresight

Guy GuetaG. Ramalingam

Mooly SagivEran Yahav

Motivation

• Modern libraries provide atomic ADTs• Hidden synchronization complexity• Not composable– Sequences of operations are not necessarily

atomic

Transactional Objects with Foresight

• The client declares the intended use of an API via temporal specifications (foresight)

• The library utilizes the specification– Synchronize between operations which do not

serialize with foresight• Foresight can be automatically inferred with

sequential static analysis

Example: Positive CounterOperation A@mayUse(Inc);Inc()Inc()@mayUse()

Operation B@mayUse(Dec);Dec()Dec()@mayUse()

• Composite operations are not atomic• even when Inc/Dec are atomic

• Enforce atomicity using future information • Future information can be utilized for effective

synchronization

Possible Executions (initial value=0)

Main Results

• Define the notion of ADTs that utilize foresight information– Show that they are strict generalization of existing locking

mechanisms• Create a Java Map ADT which utilizes foresight • Static analysis for inferring conservative foresight

information• Performance is comparable to manual synchronization• Can be used for real composite operations

The RelC Compilerfs fileinuse{fs, file} {inuse}

foreach <fs, file, inuse> filesystems s.t. fs= 5 do …

RelC C++

inuse

fs, file

fs

list inuse

filelist

array

list

Relational Specification

Graph decomposition

top related