concurrent revisions: a deterministic concurrency model
Post on 09-Feb-2016
31 Views
Preview:
DESCRIPTION
TRANSCRIPT
Concurrent Revisions:A deterministic concurrency model.
Daan Leijen, Alexandro Baldassin, and Sebastian Burckhardt
Microsoft Research(OOPSLA 2010)
The concurrency elephant• Task/Data Parallel: TPL, X10, Cilk, StreamIt,
Cuda, OpenMP, etc.• Concurrent: Thread, Locks, Promises,
Transactions, etc.• Our focus: Concurrent interactive applications
with large shared data structures.
Application = Shared Data and Tasks
Shared Data
ReaderMutatorReader
Example: Office application• Save the document• React to keyboard input by the user• Perform a spellcheck in the background• Exchange updates with remote users
Mutator Reader
Spacewars!
About 15k lines of C# code, using DirectX. The original game is sequential
Example 1: read-write conflict Render task reads position of all game objects Physics task updates position of all game objects
=> Render task needs to see consistent snapshot
Example 2: write-write conflict Physics task updates position of all game objects Network task updates position of some objects
=> Network has priority over physics updates
Examples from SpaceWars Game
Conflicting tasks can not efficiently execute in parallel. pessimistic concurrency control (i.e. locks)• use locks to avoid parallelism where
there are (real or potential) conflicts optimistic concurrency control (i.e. TM)• speculate on absence of conflicts
rollback if there are real conflicts
either way: true conflicts kill parallelism.
Conventional Concurrency Control
Our Proposed Programming Model: Revisions and Isolation Types
• Deterministic Conflict Resolution, never roll-back• No restrictions on tasks (can be long-running, do I/O)• Full concurrent reading and writing of shared data• Clean semantics (see technical report)• Fast and space-efficient runtime implementation
RevisionA logical unit of
work that is forked and joined
Isolation TypeA type which implements
automatic copying/merging of versions on write-write conflict
What’s new
• Isolation: side effects are only visible when the revision is joined.
• Deterministic execution!
int x = 0;Task t = fork { x = 1;}assert(x==0 || x==1);join t;assert(x==1);
Versioned<int> x = 0;Revision r = rfork { x = 1;}assert(x==0);join r;assert(x==1);
Traditional Task Concurrent Revisions
Isolation types:declares shared data
fork revision:forks off a private copy
of the shared state
join revision:waits for the revision to
terminate and writes back changes into the main revision
isolation:Concurrent modifications
are not seen by others
No isolation:We see either 0 or 1
depending on the schedule
Puzzle time…int x = 0;int y = 0;
Task t = fork { if (x==0) y++;}if (y==0) x++;
join t;
Hard to read: let’s use a diagram instead…
Sequential consistencyint x = 0int y = 0
if (y==0) x++; if (x==0) y++;
assert( (x==0 && y==1) || (x==1 && y==0) || (x==1 && y==1));What are the possible values for x and y?
??
Transactional memoryint x = 0int y = 0
atomic { if (y==0) x++; }
atomic { if (x==0) y++; }
assert( (x==0 && y==1) || (x==1 && y==0));
??
Concurrent revisionsVersioned<int> x =
0Versioned<int> y =
0
if (y==0) x++; if (x==0) y++;
assert(x==1 && y==1);
Determinismonly 1 possible
result
Isolationy is always 0
??
Isolationand x is always 0
Conflict resolution
x = 0
x = 1
x = 2
assert(x==2)
x = 0
x = 1
x = 0
assert(x==0)
x = 0
x = 1
assert(x==1)
By default, on a write-write conflict (only), the modification in the child revision wins.
Versioned<int> x;
Custom conflict resolution
x = 0
x += 1
x += 2
assert(x==3)
merge(1,2,0) 3
Cumulative<int,(main,join,orig).main + join – orig> x;
1
2
0
Demo class Sample{ [Versioned] int i = 0;
public void Run() { var r = CurrentRevision.Fork(() => { i += 1; });
i += 2;
CurrentRevision.Join(r); Console.WriteLine("i = " + i); }}
Demo: Sandboxclass Sandbox{ [Versioned] int i = 0;
public void Run() { var r = CurrentRevision.Branch("FlakyCode"); try { r.Run(() => { i = 1; throw new Exception("Oops"); }); CurrentRevision.Merge(r); } catch { CurrentRevision.Abandon(r); } Console.WriteLine("\n i = " + i); }}
Fork a revision without forking an associated task/thread
Run code in a certain revision
Merge changes in a revision into the main one
Abandon a revision and don’t merge its changes.
A Software engineering perspective• Transactional memory:
Code centric: put “atomic” in the code Granularity: • too broad: too many conflicts and no parallel
speedup• too small: potential races and incorrect code
• Concurrent revisions: Data centric: put annotations on the data Granularity: group data that have mutual constraints
together, i.e. if (x + y > 0) should hold, then x and y should be versioned together.
• For each versioned object, maintain multiple copies Map revision ids to versions `mostly’ lock-free array
• New copies are allocated lazily Don’t copy on fork… copy on first write after fork
• Old copies are released on join No space leak
Current Implementation: C# library
Revision Value
1 0
40 2
45 7
Full algorithm in the paper…
SpaceWars Game
Shared State
Parallel Collision DetectionParallel Collision DetectionParallel Collision Detection
Graphics Card
Network Connection
Play Sounds
Render Screen
Process InputsAutosave
Send
Receive
Disk
Key-board
SimulatePhysics
Sequential Game Loop:
Revision Diagram for Parallelized Game Loop
Coll.
Det
. 1
Coll.
Det
. 2
Coll.
Det
. 3
Coll.
Det
. 4Re
nder
Phys
ics
netw
ork
auto
save
(lo
ng ru
nnin
g)
“Problem Example 1” is solved
Render task reads position of all game objects
Physics task updates position of all game objects
No interference! Coll.
Det
. 1
Coll.
Det
. 2
Coll.
Det
. 3
Coll.
Det
. 4Re
nder
Phys
ics
netw
ork
auto
save
(lo
ng ru
nnin
g)
“Problem Example 2” is solved.
Physics task updates position of all game objects
Network task updates position of some objects
Network updates have priority over physics updates
Order of joins establishes precedence!
Coll.
Det
. 1
Coll.
Det
. 2
Coll.
Det
. 3
Coll.
Det
. 4Re
nder
Phys
ics
netw
ork
auto
save
(lo
ng ru
nnin
g)
Autosave now perfectly unnoticeable in background Overall Speed-Up: 3.03x on four-core
(almost completely limited by graphics card)
Results Physics task Render
Collision detection
Overhead:How much does all the copying and the indirection cost?
Only a 5% slowdown in the sequential case Some individual tasks
slow down much more(i.e. physics simulation)
Revisions and Isolation Types simplify the parallelization of applications with tasks that Exhibit conflicting accesses to shared data Have unpredictable latency Have unpredictable data access pattern May perform I/O that can not be rolled back
Revisions and Isolation Types are easy to reason about (determinism, isolation) have low-enough overhead for many applications
Conclusion
Questions?
• daan@microsoft.com• sburckha@microsoft.com
• External download available soon
int x = 0;int y = 0;task t = fork { if (x==0) y++;}if (y==0) x++;join t;
int x = 0;int y = 0;task t = fork { atomic { if (x==0) y++; }}atomic { if (y==0) x++; }join t;
versioned<int> x = 0;versioned<int> y = 0;revision r = rfork { if (x==0) y++;}if (y==0) x++;join r;
Sequential Consistency
Transactional Memory
Concurrent Revisions
assert( (x==0 && y==1) || (x==1 && y==0) || (x==1 && y==1));
assert( (x==0 && y==1) || (x==1 && y==0));
assert(x==1 && y==1);
x = 0
x += 1
x += 2
merge(1,2,0) 3
x += 3
assert( x==6 )
merge(3,5,2) 6
1 2
0
3 5
2
By construction, there is no ‘global’ state: just local state for
each revision
State is simply a (partial) function from a location to a
value
Operational SemanticsFor some revision r, with snapshot and local modifications and an
expression context with hole (x.e) v
the state is a composition of the root snapshot and
local modifications
On a fork, the snapshot of the new
revision r’ is the current state: ::
On a join, the writes of the joinee r’ take priority over the writes of the current
revision: ::’
Custom merge: per location (type)
On a join, using a merge function.
No conflict if a location was not
written in the joinee
No conflict if a location was unmodified in the current revision, use
the value of the joinee
Conflict otherwise, use a location/type specific
merge function
Standard merges:
What is a conflict?• Merge is only called if:
(1) write in child, and (2) modification in main revision:
Cumulative<int> x = 0
x += 2
x += 3
assert( x = 5 )
0
2 5
2
No conflict (merge function
is not called)
No conflict (merge function
is not called)
0
2
Merging with failure
On fail, we just ignore any writes in the joinee
Snapshot isolation• Widely used in databases, for example Oracle
and Microsoft SQL • In essence, in snapshot isolation a concurrent
transaction can only complete in the absence of write-write conflicts.
• Our calculus generalizes snapshot isolation: We support arbitrary nesting We allow custom merge functions to resolve
write-write conflicts deterministically
Snapshot isolationWe can succinctly model snapshot isolation as:• Disallow nesting• Use the default merge:
Some versions of snapshot isolation do not treat silent writes in a transaction as a conflict:
Sequential merges• We can view each location as an abstract data
types (i.e. object) with certain operations (i.e. methods).
• If a merge function always behaves as if concurrent operations for those objects are sequential, we call it a sequential merge.
• Such objects always behave as if the operations in the joinee are all done sequentially at the join point.
Sequential merges• A merge is sequential if:
merge(uw1(o), uw2(o), u(o))=
uw1w2(o)
• And uw1w2(o)
u
w1 w2
merge(uw1(o), uw2(o), u(o))
x = o
Abelian merges• For any abstract data type that forms an
abelian group (associative, commutative, with inverses) with neutral element 0 and an operation , the following merge is sequential:
merge(v,v’,v0) = v v’ v0
• This holds for example for additive integers and additive sets.
top related