Concurrent Queues and Stacks
Companion slides for The Art of Multiprocessor Programming
by Maurice Herlihy & Nir Shavit
Art of Multiprocessor Programming 22
The Five-Fold Path
• Coarse-grained locking
• Fine-grained locking
• Optimistic synchronization
• Lazy synchronization
• Lock-free synchronization
Another Fundamental Problem
• We told you about – Sets implemented by linked lists
• Next: queues
• Next: stacks
Queues & Stacks
• pool of items
Queues
• enq() / deq()
• Total order
• First in, first out
Stacks
• push() / pop()
• Total order
• Last in, first out
Bounded
• Fixed capacity
• Good when resources are an issue
Unbounded
• Unlimited capacity
• Often more convenient
…
Blocking
• Block on attempt to remove from empty stack or queue
• Block on attempt to add to full bounded stack or queue

Non-Blocking
• Throw exception on attempt to remove from empty stack or queue
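The difference between the two behaviors can be seen with a library queue. The sketch below is only an illustration and uses java.util.concurrent.ArrayBlockingQueue as a stand-in for the queues built later in this lecture. The class name BlockingVsNonBlocking and its method names are hypothetical, and the add() case (throwing on a full bounded queue) goes slightly beyond what the slide states.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;

class BlockingVsNonBlocking {
    // Non-blocking style: remove() on an empty queue throws immediately.
    static boolean removeThrowsOnEmpty() {
        ArrayBlockingQueue<String> q = new ArrayBlockingQueue<>(1);
        try {
            q.remove();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    // Non-blocking style: add() on a full bounded queue throws immediately.
    static boolean addThrowsOnFull() {
        ArrayBlockingQueue<String> q = new ArrayBlockingQueue<>(1);
        q.add("a");
        try {
            q.add("b");
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Blocking style: take() sleeps until a producer calls put().
    static String takeWaitsForPut() {
        ArrayBlockingQueue<String> q = new ArrayBlockingQueue<>(1);
        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(50);   // give the consumer time to block (illustrative only)
                q.put("item");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            String item = q.take();
            producer.join();
            return item;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```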
This Lecture
• Queue
  – Bounded, blocking, lock-based
  – Unbounded, non-blocking, lock-free
• Stack
  – Unbounded, non-blocking, lock-free
  – Elimination-backoff algorithm
Queue: Concurrency
• enq(x) works at the tail; y = deq() works at the head
• enq() and deq() work at different ends of the object

Concurrency
• Challenge: what if the queue is empty or full?
Bounded Queue
• head refers to the sentinel node; tail refers to the last node
• The node after the sentinel holds the first actual item
• deqLock: lock out other deq() calls
• enqLock: lock out other enq() calls

Not Done Yet
• Need to tell whether queue is full or empty
• size field: in the example, max size is 8 items and size is 1
• Incremented by enq()
• Decremented by deq()
Enqueuer
• Lock enqLock
• Read size: OK (1 of 8, not full)
• No need to lock tail
• Enqueue Node
• getAndIncrement() on size (1 becomes 2)
• Release lock
• If queue was empty, notify waiting dequeuers

Unsuccessful Enqueuer
• Read size: 8 of 8. Uh-oh …
Dequeuer
• Lock deqLock
• Read sentinel’s next field: OK
• Read value
• Make first Node new sentinel
• Decrement size (2 becomes 1)
• Release deqLock

Unsuccessful Dequeuer
• Read sentinel’s next field with size 0: uh-oh
Bounded Queue
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class BoundedQueue<T> {
  ReentrantLock enqLock, deqLock;                 // enq & deq locks
  Condition notEmptyCondition, notFullCondition;
  AtomicInteger size;
  Node head;
  Node tail;
  int capacity;

  // The slide lists the initializations without a constructor;
  // wrapping them in one is an assumption made so the class compiles.
  public BoundedQueue(int capacity) {
    this.capacity = capacity;
    size = new AtomicInteger(0);
    head = new Node(null);                        // sentinel
    tail = head;
    enqLock = new ReentrantLock();
    notFullCondition = enqLock.newCondition();
    deqLock = new ReentrantLock();
    notEmptyCondition = deqLock.newCondition();
  }
}
Digression: Monitor Locks
• Java synchronized objects and ReentrantLocks are monitors
• Allow blocking on a condition rather than spinning
• Threads:
  – acquire and release lock
  – wait on a condition
The Java Lock Interface
public interface Lock {
  void lock();                                            // acquire lock
  void lockInterruptibly() throws InterruptedException;   // never mind what this method does
  boolean tryLock();                                      // try for lock, but not too hard
  boolean tryLock(long time, TimeUnit unit) throws InterruptedException;
  Condition newCondition();                               // create condition to wait on
  void unlock();                                          // release lock
}
Lock Conditions
public interface Condition {
  void await() throws InterruptedException;       // release lock and wait on condition
  boolean await(long time, TimeUnit unit) throws InterruptedException;
  …
  void signal();                                  // wake up one waiting thread
  void signalAll();                               // wake up all waiting threads
}
Await
q.await()
• Releases lock associated with q
• Sleeps (gives up processor)
• Awakens (resumes running)
• Reacquires lock & returns

Signal
q.signal();
• Awakens one waiting thread
  – Which will reacquire lock

Signal All
q.signalAll();
• Awakens all waiting threads
  – Which will each reacquire lock
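The await/signalAll protocol above can be sketched with a small gate that threads wait on until it opens. The class Gate and its methods are hypothetical names chosen for illustration. The sketch uses awaitUninterruptibly(), a real Condition method, instead of await() only to avoid checked exceptions in a short example.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class Gate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition opened = lock.newCondition();
    private boolean isOpen = false;   // guarded by lock

    // Blocks until some thread calls open().
    void awaitOpen() {
        lock.lock();
        try {
            // Re-test in a loop: waking up does not guarantee the condition holds.
            while (!isOpen) {
                opened.awaitUninterruptibly();   // releases lock, sleeps, reacquires lock
            }
        } finally {
            lock.unlock();
        }
    }

    // Opens the gate and wakes every waiting thread.
    void open() {
        lock.lock();
        try {
            isOpen = true;
            opened.signalAll();
        } finally {
            lock.unlock();
        }
    }
}
```

A condition must be awaited and signaled while holding the lock that created it, which is why both methods bracket their work with lock() and unlock().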
A Monitor Lock
• [Figure: a critical section entered with lock() and left with unlock(), plus a waiting room]

Unsuccessful Deq
• deq() calls lock(), finds the queue empty ("Oh no, empty!"), and calls await(), moving to the waiting room

Another One
• A second deq() does the same

Enqueuer to the Rescue
• enq() calls lock(), then signalAll(), then unlock()
• The waiting threads wake up ("Yawn!")

Monitor Signalling
• Awakened thread might still lose lock to outside contender …

Dequeuers Signalled
• One awakened dequeuer gets the item ("Found it")
• The other finds the queue still empty ("Still empty!")

Dollar Short + Day Late
• The dequeuer that found the queue still empty must wait again
Java Synchronized Methods
public class Queue<T> {
  int head = 0, tail = 0;
  T[] items = (T[]) new Object[QSIZE];
  // Each object has an implicit lock with an implicit condition
  public synchronized T deq() throws InterruptedException {   // lock on entry, unlock on return
    while (tail - head == 0)
      wait();                          // wait on implicit condition
    T result = items[head % QSIZE];
    head++;
    notifyAll();                       // signal all threads waiting on condition
    return result;
  }
  …
}
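The slide shows only deq(). A complete version might look like the sketch below, where enq() is written by symmetry with deq() and is an assumption rather than code from the slides. The class name SyncQueue and the constructor parameter are also hypothetical.

```java
class SyncQueue<T> {
    private final Object[] items;
    private int head = 0, tail = 0;   // guarded by this object's implicit lock

    SyncQueue(int capacity) {
        items = new Object[capacity];
    }

    // Blocks while the queue is full, then appends x.
    public synchronized void enq(T x) throws InterruptedException {
        while (tail - head == items.length) {
            wait();                    // wait on the implicit condition
        }
        items[tail % items.length] = x;
        tail++;
        notifyAll();                   // wake dequeuers (and any waiting enqueuers)
    }

    // Blocks while the queue is empty, then removes the oldest item.
    @SuppressWarnings("unchecked")
    public synchronized T deq() throws InterruptedException {
        while (tail - head == 0) {
            wait();
        }
        T result = (T) items[head % items.length];
        head++;
        notifyAll();                   // wake enqueuers (and any waiting dequeuers)
        return result;
    }
}
```

Because an object has a single implicit condition, enqueuers and dequeuers share one waiting room, so notifyAll() is used and every waiter re-tests its own condition in a loop.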
(Pop!) The Bounded Queue

Bounded Queue Fields
• enqLock, deqLock: enq & deq locks
• notFullCondition: enq lock’s associated condition
• notEmptyCondition: deq lock’s associated condition
• size: 0 to capacity
• head, tail: Head and Tail
Enq Method Part One
public void enq(T x) throws InterruptedException {
  boolean mustWakeDequeuers = false;
  enqLock.lock();                          // lock and unlock enq lock
  try {
    while (size.get() == capacity)         // wait while queue is full …
      notFullCondition.await();            // when await() returns, you might still fail the test!
    Node e = new Node(x);                  // add new node
    tail.next = e;
    tail = tail.next;
    if (size.getAndIncrement() == 0)       // if queue was empty, wake frustrated dequeuers
      mustWakeDequeuers = true;
  } finally {
    enqLock.unlock();
  }
  …
}

Be Afraid
• After the loop: how do we know the queue won’t become full again?
• Other enqueuers are locked out by enqLock, and dequeuers can only make the queue smaller
Beware Lost Wake-Ups
• Enqueuer calls lock() and enq(): queue was empty, so it calls signal(), then unlock()
• One waiting dequeuer wakes up ("Yawn!")

Lost Wake-Up
• A second enqueuer calls lock(), enq(), unlock(): queue not empty, so no need to signal
• The awakened dequeuer takes an item ("Found it")

What’s Wrong Here?
• Another dequeuer is still in the waiting room although an item remains ("Still waiting …!")

Solution to Lost Wakeup
• Always use
  – signalAll() and notifyAll()
• Not
  – signal() and notify()
Enq Method Part Deux
public void enq(T x) {
  …
  if (mustWakeDequeuers) {                 // are there dequeuers to be signaled?
    deqLock.lock();                        // lock and unlock deq lock
    try {
      notEmptyCondition.signalAll();       // signal dequeuers that queue is no longer empty
    } finally {
      deqLock.unlock();
    }
  }
}
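Putting both parts of enq() together with a matching deq() gives a sketch like the one below. The deq() method is not shown on these slides and is written here by symmetry with enq(), so treat it as an assumption. The class name BoundedQueueSketch and the nested Node class are hypothetical, and awaitUninterruptibly() replaces await() only to keep the example free of checked exceptions.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class BoundedQueueSketch<T> {
    private static final class Node<T> {
        final T value;
        volatile Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final ReentrantLock enqLock = new ReentrantLock();
    private final ReentrantLock deqLock = new ReentrantLock();
    private final Condition notFullCondition = enqLock.newCondition();
    private final Condition notEmptyCondition = deqLock.newCondition();
    private final AtomicInteger size = new AtomicInteger(0);
    private final int capacity;
    private Node<T> head;   // sentinel, guarded by deqLock
    private Node<T> tail;   // last node, guarded by enqLock

    BoundedQueueSketch(int capacity) {
        this.capacity = capacity;
        head = new Node<>(null);
        tail = head;
    }

    public void enq(T x) {
        boolean mustWakeDequeuers = false;
        enqLock.lock();
        try {
            while (size.get() == capacity) {
                notFullCondition.awaitUninterruptibly();
            }
            Node<T> e = new Node<>(x);
            tail.next = e;
            tail = e;
            if (size.getAndIncrement() == 0) {
                mustWakeDequeuers = true;   // queue was empty before this enq
            }
        } finally {
            enqLock.unlock();
        }
        if (mustWakeDequeuers) {
            deqLock.lock();
            try {
                notEmptyCondition.signalAll();
            } finally {
                deqLock.unlock();
            }
        }
    }

    public T deq() {
        T result;
        boolean mustWakeEnqueuers = false;
        deqLock.lock();
        try {
            while (size.get() == 0) {
                notEmptyCondition.awaitUninterruptibly();
            }
            result = head.next.value;
            head = head.next;               // first node becomes the new sentinel
            if (size.getAndDecrement() == capacity) {
                mustWakeEnqueuers = true;   // queue was full before this deq
            }
        } finally {
            deqLock.unlock();
        }
        if (mustWakeEnqueuers) {
            enqLock.lock();
            try {
                notFullCondition.signalAll();
            } finally {
                enqLock.unlock();
            }
        }
        return result;
    }
}
```

Each method releases its own lock before taking the other one to signal, so no thread ever holds both locks at once.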
The Enq() & Deq() Methods
• Share no locks
  – That’s good
• But do share an atomic counter
  – Accessed on every method call
  – That’s not so good
• Can we alleviate this bottleneck?

Split the Counter
• The enq() method
  – Increments only
  – Cares only if value is capacity
• The deq() method
  – Decrements only
  – Cares only if value is zero

Split Counter
• Enqueuer increments enqSize
• Dequeuer decrements deqSize
• When enqueuer runs out
  – Locks deqLock
  – Computes size = enqSize - deqSize
• Intermittent synchronization
  – Not with each method call
  – Need both locks! (careful …)
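One way the split counter could work is sketched below. This is not code from the slides. To stay short it uses non-blocking tryEnq() and tryDeq() methods (hypothetical names) that return false or null instead of waiting. It also assumes a convention in which the dequeuer counts its dequeues upward in deqCount, so the reconciled size is enqSize - deqCount. Emptiness is detected from the sentinel's next field rather than from a counter.

```java
import java.util.concurrent.locks.ReentrantLock;

class SplitCounterQueue<T> {
    private static final class Node<T> {
        final T value;
        volatile Node<T> next;   // written under enqLock, read under deqLock
        Node(T value) { this.value = value; }
    }

    private final ReentrantLock enqLock = new ReentrantLock();
    private final ReentrantLock deqLock = new ReentrantLock();
    private final int capacity;
    private int enqSize = 0;     // guarded by enqLock: upper bound on the true size
    private int deqCount = 0;    // guarded by deqLock: dequeues not yet reconciled
    private Node<T> head;        // sentinel, guarded by deqLock
    private Node<T> tail;        // last node, guarded by enqLock

    SplitCounterQueue(int capacity) {
        this.capacity = capacity;
        head = new Node<>(null);
        tail = head;
    }

    // Returns false if the queue is really full.
    public boolean tryEnq(T x) {
        enqLock.lock();
        try {
            if (enqSize == capacity) {
                // Enqueuer "ran out": reconcile with the dequeuers' counter.
                deqLock.lock();          // both locks held, always in this order
                try {
                    enqSize -= deqCount;
                    deqCount = 0;
                } finally {
                    deqLock.unlock();
                }
                if (enqSize == capacity) {
                    return false;
                }
            }
            Node<T> e = new Node<>(x);
            tail.next = e;
            tail = e;
            enqSize++;
            return true;
        } finally {
            enqLock.unlock();
        }
    }

    // Returns null if the queue is empty.
    public T tryDeq() {
        deqLock.lock();
        try {
            Node<T> first = head.next;
            if (first == null) {
                return null;
            }
            head = first;                // first node becomes the new sentinel
            deqCount++;
            return first.value;
        } finally {
            deqLock.unlock();
        }
    }
}
```

The two sides touch a shared counter only when the enqueuer believes the queue is full, and the fixed lock order (enqLock, then deqLock) avoids deadlock because dequeuers never take enqLock.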
A Lock-Free Queue
• head refers to the sentinel node; tail refers to the last node

Compare and Set
• CAS

Enqueue
• enq() creates a new node

Logical Enqueue
• CAS the last node’s next field to the new node

Physical Enqueue
• CAS tail to the new node
Enqueue
• These two steps are not atomic
• The tail field refers to either
  – Actual last Node (good)
  – Penultimate Node (not so good)
• Be prepared!

Enqueue
• What do you do if you find
  – A trailing tail?
• Stop and help fix it
  – If tail node has non-null next field
  – CAS the queue’s tail field to tail.next
• As in the universal construction

When CASs Fail
• During logical enqueue
  – Abandon hope, restart
  – Still lock-free (why?)
• During physical enqueue
  – Ignore it (why?)
Dequeuer
• Read value
• Make first Node new sentinel (CAS on head)
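The enqueue and dequeue steps described above can be sketched as follows. This follows the well-known two-CAS design but is an illustrative sketch, not the slides' own code. The class name LockFreeQueueSketch is hypothetical, and throwing java.util.NoSuchElementException on an empty queue is an assumed choice of exception. The sketch relies on garbage collection, so it sidesteps the memory-reuse issue.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicReference;

class LockFreeQueueSketch<T> {
    private static final class Node<T> {
        final T value;
        final AtomicReference<Node<T>> next = new AtomicReference<>();
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> head;
    private final AtomicReference<Node<T>> tail;

    LockFreeQueueSketch() {
        Node<T> sentinel = new Node<>(null);
        head = new AtomicReference<>(sentinel);
        tail = new AtomicReference<>(sentinel);
    }

    public void enq(T x) {
        Node<T> node = new Node<>(x);
        while (true) {
            Node<T> last = tail.get();
            Node<T> next = last.next.get();
            if (last != tail.get()) {
                continue;                            // tail moved, re-read
            }
            if (next == null) {
                // Logical enqueue: link the new node after the last node.
                if (last.next.compareAndSet(null, node)) {
                    // Physical enqueue: swing tail. A failure can be ignored,
                    // because it means another thread already helped.
                    tail.compareAndSet(last, node);
                    return;
                }
                // Logical CAS failed: some other enqueuer succeeded, so restart.
            } else {
                // Trailing tail: help fix it, then retry.
                tail.compareAndSet(last, next);
            }
        }
    }

    public T deq() {
        while (true) {
            Node<T> first = head.get();
            Node<T> last = tail.get();
            Node<T> next = first.next.get();
            if (first != head.get()) {
                continue;                            // head moved, re-read
            }
            if (first == last) {
                if (next == null) {
                    throw new NoSuchElementException("queue is empty");
                }
                tail.compareAndSet(last, next);      // help a trailing tail
            } else {
                T value = next.value;                // read value
                if (head.compareAndSet(first, next)) {
                    return value;                    // first node is the new sentinel
                }
            }
        }
    }
}
```

A failed logical CAS implies that some other enqueuer's CAS succeeded, which is why restarting keeps the algorithm lock-free.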
Memory Reuse?
• What do we do with nodes after we dequeue them?
• Java: let garbage collector deal?
• Suppose there is no GC, or we prefer not to use it?
Dequeuer
• After the CAS on head, the old sentinel node can be recycled
Simple Solution
• Each thread has a free list of unused queue nodes
• Allocate node: pop from list
• Free node: push onto list
• Deal with underflow somehow …
Why Recycling is Hard
• A thread wants to redirect head from the gray node to the red node, then falls asleep (zzz…)

Both Nodes Reclaimed
• While it sleeps, both nodes are dequeued and returned to the free pool

One Node Recycled
• One node is recycled back into the queue, and the sleeper wakes up ("Yawn!")

Why Recycling is Hard
• The thread performs its CAS on head: "OK, here I go!"

Recycle FAIL
• The CAS succeeds, but head now refers to a node in the free pool: "zOMG what went wrong?"
The Dreaded ABA Problem
• Head reference has value A
• Thread reads value A

Dreaded ABA continued
• Thread sleeps (zzz)
• Head reference has value B
• Node A freed

Dreaded ABA continued
• Head reference has value A again
• Node A recycled and reinitialized
• Thread wakes up ("Yawn!")

Dreaded ABA continued
• CAS succeeds because references match, even though reference’s meaning has changed
The Dreaded ABA FAIL
• Is a result of CAS() semantics
  – I blame Sun, Intel, AMD, …
• Not with Load-Locked/Store-Conditional
  – Good for IBM?

Dreaded ABA – A Solution
• Tag each pointer with a counter
• Unique over lifetime of node
• Pointer size vs word size issues
• Overflow?
  – Don’t worry be happy?
  – Bounded tags?
• AtomicStampedReference class
Atomic Stamped Reference
• AtomicStampedReference class
  – java.util.concurrent.atomic package
• Holds a reference (address) together with an integer stamp (S)
• Can get reference & stamp atomically
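The effect of the stamp can be seen in a small single-threaded simulation of the A, B, A sequence. The class AbaDemo and its method names are hypothetical. The interleaving of a sleeping thread is imitated by performing the intervening updates inline, which is an assumption made to keep the example deterministic.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

class AbaDemo {
    // Plain CAS: succeeds after A -> B -> A because only the reference is compared.
    static boolean plainCasAfterAba() {
        Object a = new Object();
        Object b = new Object();
        AtomicReference<Object> head = new AtomicReference<>(a);
        Object seen = head.get();             // "slow thread" reads A
        head.set(b);                          // meanwhile: A replaced by B
        head.set(a);                          // meanwhile: A recycled and reinstalled
        return head.compareAndSet(seen, b);   // references match, CAS succeeds
    }

    // Stamped CAS: fails after A -> B -> A because the stamp has moved on.
    static boolean stampedCasAfterAba() {
        Object a = new Object();
        Object b = new Object();
        AtomicStampedReference<Object> head = new AtomicStampedReference<>(a, 0);
        int[] stampHolder = new int[1];
        Object seen = head.get(stampHolder);  // reads reference and stamp atomically
        int seenStamp = stampHolder[0];
        head.set(b, seenStamp + 1);           // each update advances the stamp
        head.set(a, seenStamp + 2);
        return head.compareAndSet(seen, b, seenStamp, seenStamp + 1);
    }
}
```

The stamped version rejects the stale update even though the reference is A again, because the expected stamp 0 no longer matches the current stamp 2.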
Concurrent Stack
• Methods
  – push(x)
  – pop()
• Last-in, First-out (LIFO) order
• Lock-Free!
Empty Stack
• [Figure: Top with no nodes]

Push
• New node’s next field refers to the current top node
• CAS swings Top to the new node

Pop
• CAS swings Top to the top node’s successor
• The thread whose CAS succeeds takes the node ("mine!")
Lock-free Stack
public class LockFreeStack<T> {
  private AtomicReference<Node> top = new AtomicReference<Node>(null);
  public boolean tryPush(Node node) {          // tryPush attempts to push a node
    Node oldTop = top.get();                   // read top value
    node.next = oldTop;                        // current top will be new node’s successor
    return top.compareAndSet(oldTop, node);    // try to swing top, return success or failure
  }
  public void push(T value) {                  // push calls tryPush
    Node node = new Node(value);               // create new node
    while (true) {
      if (tryPush(node)) {
        return;
      } else
        backoff.backoff();                     // if tryPush() fails, back off before retrying
    }
  }
}
Lock-free Stack
• Good
  – No locking
• Bad
  – Without GC, fear ABA
  – Without backoff, huge contention at top
  – In any case, no parallelism
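The slides show only the push side. A fuller sketch with a matching pop() is given below. The tryPop() and pop() methods are written by symmetry with tryPush() and push() and are assumptions, as are the class name LockFreeStackSketch, the delay constants, and the simple randomized exponential backoff. Throwing NoSuchElementException on an empty stack is likewise an assumed choice. The sketch relies on garbage collection, so ABA is not an issue here.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.LockSupport;

class LockFreeStackSketch<T> {
    private static final int MIN_DELAY_NANOS = 1_000;       // illustrative values
    private static final int MAX_DELAY_NANOS = 1_000_000;

    private static final class Node<T> {
        final T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> top = new AtomicReference<>();

    private boolean tryPush(Node<T> node) {
        Node<T> oldTop = top.get();
        node.next = oldTop;                        // current top becomes successor
        return top.compareAndSet(oldTop, node);    // try to swing top
    }

    public void push(T value) {
        Node<T> node = new Node<>(value);
        int limit = MIN_DELAY_NANOS;
        while (!tryPush(node)) {
            limit = backoff(limit);                // contention: back off, then retry
        }
    }

    // Returns the popped node, or null if the CAS lost a race.
    private Node<T> tryPop() {
        Node<T> oldTop = top.get();
        if (oldTop == null) {
            throw new NoSuchElementException("stack is empty");
        }
        return top.compareAndSet(oldTop, oldTop.next) ? oldTop : null;
    }

    public T pop() {
        int limit = MIN_DELAY_NANOS;
        while (true) {
            Node<T> node = tryPop();
            if (node != null) {
                return node.value;
            }
            limit = backoff(limit);
        }
    }

    // Sleeps for a random time up to limit, then returns a doubled (capped) limit.
    private static int backoff(int limit) {
        LockSupport.parkNanos(ThreadLocalRandom.current().nextInt(limit) + 1L);
        return Math.min(2 * limit, MAX_DELAY_NANOS);
    }
}
```

Every operation still contends for the single top reference, which is the lack of parallelism noted above.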
Big Question
• Are stacks inherently sequential?
• Reasons why
  – Every pop() call fights for top item
• Reasons why not
  – Stay tuned …
Elimination-Backoff Stack
• How to
  – “turn contention into parallelism”
• Replace familiar
  – exponential backoff
• With alternative
  – elimination-backoff
Observation
• Push() and Pop() on a linearizable stack
• After an equal number of pushes and pops, stack stays the same

Idea: Elimination Array
• Push() and Pop() each pick a location in the elimination array at random

Push Collides With Pop
• Both calls continue
• No need to access stack

No Collision
• If no collision, access stack
• If pushes collide or pops collide, access stack
Elimination-Backoff Stack
• Lock-free stack + elimination array
• Access lock-free stack
  – If uncontended, apply operation
  – If contended, back off to elimination array and attempt elimination
Elimination-Backoff Stack
• Push() and Pop() attempt a CAS on Top
• If CAS fails, back off

Dynamic Range and Delay
• Pick random range and max waiting time based on level of contention encountered
Linearizability
• Un-eliminated calls
  – linearized as before
• Eliminated calls
  – linearize pop() immediately after matching push()
• Combination is a linearizable stack
Un-Eliminated Linearizability
• [Figure: timeline with push(v1) and pop(v1); linearizable]

Eliminated Linearizability
• [Figure: timeline with push(v1), push(v2), pop(v2), pop(v1) and a collision point; linearizable]
• Red calls are eliminated
Backoff Has Dual Effect
• Elimination introduces parallelism
• Backoff to array cuts contention on lock-free stack
• Elimination in array cuts down number of threads accessing lock-free stack
Elimination Array
public class EliminationArray<T> {
  private static final int duration = ...;
  private static final int timeUnit = ...;
  Exchanger<T>[] exchanger;                    // an array of Exchangers
  public EliminationArray(int capacity) {
    exchanger = (Exchanger<T>[]) new Exchanger[capacity];
    for (int i = 0; i < capacity; i++)
      exchanger[i] = new Exchanger<T>();
    …
  }
  …
}
Digression: A Lock-Free Exchanger
public class Exchanger<T> {
  // atomically modifiable reference + status
  AtomicStampedReference<T> slot = new AtomicStampedReference<T>(null, 0);
Atomic Stamped Reference
• AtomicStampedReference class
  – java.util.concurrent.atomic package
• In C or C++: [Figure: one word holding the reference (address) and the stamp (S)]

Extracting Reference & Stamp
public T get(int[] stampHolder);
• Returns reference to object of type T
• Returns stamp at array index 0!
Exchanger Status
enum Status {EMPTY, WAITING, BUSY};
• EMPTY: nothing yet
• WAITING: one thread is waiting for rendez-vous
• BUSY: other threads busy with rendez-vous
The Exchange
public T Exchange(T myItem, long nanos) throws TimeoutException {   // item and timeout
  long timeBound = System.nanoTime() + nanos;
  int[] stampHolder = {EMPTY};                   // array holds status
  while (true) {                                 // loop until timeout
    if (System.nanoTime() > timeBound)
      throw new TimeoutException();
    T herItem = slot.get(stampHolder);           // get other’s item and status
    int stamp = stampHolder[0];
    switch (stamp) {                             // an Exchanger has three possible states
      case EMPTY: … // slot is free
      case WAITING: … // someone waiting for me
      case BUSY: … // others exchanging
    }
  }
}
Art of Multiprocessor Programming 169169
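The slides do not show how slot is declared. A minimal sketch, assuming it is a java.util.concurrent.atomic.AtomicStampedReference (which matches the slot.get(stampHolder) call) and assuming the values 0, 1, 2 for the three states, shows how one atomic read yields both the item and the status. The class name StampedSlotDemo and the describe helper are hypothetical.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

class StampedSlotDemo {
    // Assumed encoding of the three exchanger states (not given in the slides).
    static final int EMPTY = 0, WAITING = 1, BUSY = 2;

    // Reads the slot the way exchange() does: a single atomic get() returns the
    // item and deposits the status (the stamp) in stampHolder[0].
    static String describe(AtomicStampedReference<String> slot) {
        int[] stampHolder = {EMPTY};
        String item = slot.get(stampHolder);
        switch (stampHolder[0]) {
            case EMPTY:   return "EMPTY:" + item;
            case WAITING: return "WAITING:" + item;
            case BUSY:    return "BUSY:" + item;
            default:      return "impossible";
        }
    }
}
```

Because the item and the status are read together, a thread never acts on an item paired with a stale status.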
Lock-free Exchanger
(Figure: a slot holding an item and a status.)
• A thread installs its item in the slot with a CAS
• In search of partner …
• Still waiting …
• A second thread tries to exchange its item and set the status to BUSY, also with a CAS
• Partner showed up, so the waiting thread takes the item and resets the slot to EMPTY
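The slot's life cycle (EMPTY, then WAITING, then BUSY, then back to EMPTY) can be traced in a single thread. This is only a sketch: it assumes the slot is an AtomicStampedReference, assumes the state values 0, 1, 2, and plays the roles of both threads in sequence. The class name SlotTransitions is hypothetical.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

class SlotTransitions {
    static final int EMPTY = 0, WAITING = 1, BUSY = 2; // assumed encoding

    static String walkThrough() {
        AtomicStampedReference<String> slot = new AtomicStampedReference<>(null, EMPTY);
        String a = "A", b = "B", c = "C";

        // First thread: install item A and announce that it is waiting.
        boolean installed = slot.compareAndSet(null, a, EMPTY, WAITING);

        // Second thread: saw (A, WAITING), so swap in B and mark the slot BUSY.
        boolean swapped = slot.compareAndSet(a, b, WAITING, BUSY);

        // A latecomer that also saw (A, WAITING) loses: the slot has changed.
        boolean late = slot.compareAndSet(a, c, WAITING, BUSY);

        // First thread: partner showed up, take its item and reset to EMPTY.
        int[] stampHolder = {EMPTY};
        String taken = slot.get(stampHolder);
        slot.set(null, EMPTY);

        return installed + "," + swapped + "," + late + "," + taken + "," + slot.getStamp();
    }
}
```

The plain set at the end needs no CAS: while the status is BUSY, no other thread attempts to modify the slot.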
Exchanger State EMPTY
case EMPTY: // slot is free
  if (slot.compareAndSet(herItem, myItem, EMPTY, WAITING)) {
    while (System.nanoTime() < timeBound) {
      herItem = slot.get(stampHolder);
      if (stampHolder[0] == BUSY) {
        slot.set(null, EMPTY);
        return herItem;
      }
    }
    if (slot.compareAndSet(myItem, null, WAITING, EMPTY)) {
      throw new TimeoutException();
    } else {
      herItem = slot.get(stampHolder);
      slot.set(null, EMPTY);
      return herItem;
    }
  }
  break;
• Try to insert myItem and change state to WAITING
• Spin until either myItem is taken or timeout
• myItem was taken, so return herItem that was put in its place
• Otherwise we ran out of time, so try to reset status to EMPTY and time out
• If reset failed, someone showed up after all, so take that item
• Clear slot and take that item
• If initial CAS failed, then someone else changed status from EMPTY to WAITING, so retry from start
Art of Multiprocessor Programming © Herlihy-Shavit 2007
States WAITING and BUSY
      case WAITING: // someone waiting for me
        if (slot.compareAndSet(herItem, myItem, WAITING, BUSY))
          return herItem;
        break;
      case BUSY: // others in middle of exchanging
        break;
      default: // impossible
        break;
    }
  }
}
}
• Someone is waiting to exchange, so try to CAS my item in and change state to BUSY
• If successful, return other's item; otherwise someone else took it, so try again from start
• If BUSY, other threads are exchanging, so start again
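Putting the EMPTY, WAITING and BUSY cases together gives a complete class that can be run. The sketch below assumes the slot is an AtomicStampedReference (using its real compareAndSet method where the slides write CAS) and assumes the state values 0, 1, 2. The ExchangerDemo helper is hypothetical and only pairs two threads on one exchanger.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicStampedReference;

class LockFreeExchanger<T> {
    static final int EMPTY = 0, WAITING = 1, BUSY = 2; // assumed encoding
    private final AtomicStampedReference<T> slot = new AtomicStampedReference<>(null, EMPTY);

    public T exchange(T myItem, long nanos) throws TimeoutException {
        long timeBound = System.nanoTime() + nanos;
        int[] stampHolder = {EMPTY};
        while (true) {
            if (System.nanoTime() > timeBound)
                throw new TimeoutException();
            T herItem = slot.get(stampHolder);
            switch (stampHolder[0]) {
                case EMPTY: // slot is free: offer my item and wait for a partner
                    if (slot.compareAndSet(herItem, myItem, EMPTY, WAITING)) {
                        while (System.nanoTime() < timeBound) {
                            herItem = slot.get(stampHolder);
                            if (stampHolder[0] == BUSY) { // partner swapped in an item
                                slot.set(null, EMPTY);
                                return herItem;
                            }
                        }
                        if (slot.compareAndSet(myItem, null, WAITING, EMPTY)) {
                            throw new TimeoutException(); // nobody came
                        } else { // partner arrived at the last moment
                            herItem = slot.get(stampHolder);
                            slot.set(null, EMPTY);
                            return herItem;
                        }
                    }
                    break;
                case WAITING: // someone is waiting: swap my item for theirs
                    if (slot.compareAndSet(herItem, myItem, WAITING, BUSY))
                        return herItem;
                    break;
                default: // BUSY: others are exchanging, retry
                    break;
            }
        }
    }
}

class ExchangerDemo {
    // Two threads meet on one exchanger; each should receive the other's item.
    static String[] swap() {
        LockFreeExchanger<String> ex = new LockFreeExchanger<>();
        String[] got = new String[2];
        Thread partner = new Thread(() -> {
            try { got[1] = ex.exchange("b", 5_000_000_000L); }
            catch (TimeoutException e) { got[1] = "timeout"; }
        });
        partner.start();
        try {
            got[0] = ex.exchange("a", 5_000_000_000L);
            partner.join();
        } catch (TimeoutException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return got;
    }
}
```

With only two threads, whichever arrives first takes the EMPTY branch and the other takes the WAITING branch, so the outcome does not depend on scheduling.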
The Exchanger Slot
• Exchanger is lock-free
• Because the only way an exchange can fail is if others repeatedly succeed or no one shows up
• The slot we need does not require symmetric exchange
Back to the Stack: the Elimination Array
public class EliminationArray<T> {
  …
  public T visit(T value, int range) throws TimeoutException {
    int slot = random.nextInt(range);
    long nanodur = convertToNanos(duration, timeUnit);
    return exchanger[slot].exchange(value, nanodur);
  }
}
• Visit the elimination array with a fixed value and range
• Pick a random array entry
• Exchange value or time out
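The elided parts of EliminationArray are not shown in the slides. As a rough, runnable sketch of visit, the version below swaps in the library class java.util.concurrent.Exchanger for the lock-free exchanger developed above; that substitution, the constructor parameters (capacity, duration, timeUnit), and the demo helper are all assumptions made for illustration.

```java
import java.util.concurrent.Exchanger;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class EliminationArraySketch<T> {
    private final Exchanger<T>[] exchanger;
    private final long duration;
    private final TimeUnit timeUnit;

    @SuppressWarnings("unchecked")
    EliminationArraySketch(int capacity, long duration, TimeUnit timeUnit) {
        exchanger = (Exchanger<T>[]) new Exchanger[capacity];
        for (int i = 0; i < capacity; i++)
            exchanger[i] = new Exchanger<>();
        this.duration = duration;
        this.timeUnit = timeUnit;
    }

    // Pick a random entry in [0, range) and try to exchange there until timeout.
    T visit(T value, int range) throws TimeoutException {
        int slot = ThreadLocalRandom.current().nextInt(range);
        try {
            return exchanger[slot].exchange(value, duration, timeUnit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new TimeoutException("interrupted while waiting for a partner");
        }
    }

    // Pairs one pusher (offers "x") with one popper (offers null) on one slot.
    static String[] demo() {
        EliminationArraySketch<String> array =
            new EliminationArraySketch<>(1, 5, TimeUnit.SECONDS);
        String[] got = {"unset", "unset"};
        Thread popper = new Thread(() -> {
            try { got[1] = array.visit(null, 1); }
            catch (TimeoutException e) { got[1] = "timeout"; }
        });
        popper.start();
        try {
            got[0] = array.visit("x", 1);
            popper.join();
        } catch (TimeoutException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return got;
    }
}
```

In the demo the pusher receives null and the popper receives "x", which is exactly the signal each side uses to decide that elimination succeeded.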
Elimination Stack Push
public void push(T value) {
  ...
  while (true) {
    if (tryPush(node)) {
      return;
    } else try {
      T otherValue = eliminationArray.visit(value, policy.range);
      if (otherValue == null) {
        return;
      }
    } catch (TimeoutException ex) {
      // no partner showed up: retry push() on the lock-free stack
    }
  }
}
• First, try to push
• If I failed, back off and try to eliminate
• Value pushed and range to try
• Only pop() leaves null, so elimination was successful
• Otherwise, retry push() on lock-free stack
Elimination Stack Pop
public T pop() {
  ...
  while (true) {
    if (tryPop()) {
      return returnNode.value;
    } else try {
      T otherValue = eliminationArray.visit(null, policy.range);
      if (otherValue != null) {
        return otherValue;
      }
    } catch (TimeoutException ex) {
      // no partner showed up: retry pop() on the lock-free stack
    }
  }
}
• If value not null, other thread is a push(), so elimination succeeded
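The push() and pop() slides elide the node handling and the backoff policy. The following self-contained sketch fills those gaps with assumptions: a CAS-based linked stack for tryPush and tryPop (here tryPop returns the node, or null when its CAS loses a race), java.util.concurrent.Exchanger standing in for the lock-free exchanger, a fixed RANGE and TIMEOUT_MICROS in place of policy.range, NoSuchElementException for an empty stack, and non-null pushed values. None of these choices come from the slides.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.Exchanger;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

class EliminationStackSketch<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private static final int RANGE = 4;            // hypothetical fixed range
    private static final long TIMEOUT_MICROS = 50; // hypothetical exchange timeout
    private final AtomicReference<Node<T>> top = new AtomicReference<>();
    private final Exchanger<T>[] exchangers;

    @SuppressWarnings("unchecked")
    EliminationStackSketch() {
        exchangers = (Exchanger<T>[]) new Exchanger[RANGE];
        for (int i = 0; i < RANGE; i++)
            exchangers[i] = new Exchanger<>();
    }

    private boolean tryPush(Node<T> node) {
        Node<T> oldTop = top.get();
        node.next = oldTop;
        return top.compareAndSet(oldTop, node);
    }

    // Returns the popped node, or null if the CAS lost a race.
    private Node<T> tryPop() {
        Node<T> oldTop = top.get();
        if (oldTop == null)
            throw new NoSuchElementException("empty stack");
        return top.compareAndSet(oldTop, oldTop.next) ? oldTop : null;
    }

    private T visit(T value) throws TimeoutException {
        int slot = ThreadLocalRandom.current().nextInt(RANGE);
        try {
            return exchangers[slot].exchange(value, TIMEOUT_MICROS, TimeUnit.MICROSECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new TimeoutException();
        }
    }

    public void push(T value) { // value is assumed to be non-null
        Node<T> node = new Node<>(value);
        while (true) {
            if (tryPush(node)) return;
            try {
                if (visit(value) == null) return; // only pop() offers null
            } catch (TimeoutException ex) {
                // no partner: retry on the lock-free stack
            }
        }
    }

    public T pop() {
        while (true) {
            Node<T> node = tryPop();
            if (node != null) return node.value;
            try {
                T otherValue = visit(null);
                if (otherValue != null) return otherValue; // met a push()
            } catch (TimeoutException ex) {
                // no partner: retry on the lock-free stack
            }
        }
    }
}
```

Without contention every tryPush and tryPop succeeds on the first CAS, so the elimination array is visited only when threads collide at the top of the stack.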
Summary
• We saw both lock-based and lock-free implementations of queues and stacks
• Don't be quick to declare a data structure inherently sequential
– Linearizable stack is not inherently sequential (though it is in worst case)
• ABA is a real problem, pay attention
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
• You are free:
– to Share: to copy, distribute and transmit the work
– to Remix: to adapt the work
• Under the following conditions:
– Attribution. You must attribute the work to “The Art of Multiprocessor Programming” (but not in any way that suggests that the authors endorse you or your use of the work).
– Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.
• For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to
– http://creativecommons.org/licenses/by-sa/3.0/.
• Any of the above conditions can be waived if you get permission from the copyright holder.
• Nothing in this license impairs or restricts the author's moral rights.