How to Implement Any Concurrent Data Structure. Marcos K. Aguilera (VMware), jointly with Irina Calciu, Siddhartha Sen, and Mahesh Balakrishnan.


Page 1: how to implement any concurrent data structure · effort in 2012–2014 The Future(s) of Shared Data Structures Alex Kogan and Maurice Herlihy PODC 2014 Concurrent Updates with RCU:

how to implement any concurrent data structure

marcos k. aguilera

vmware

jointly with

irina calciu

siddhartha sen

mahesh balakrishnan

Page 2:

Where to find more information about this work

How to Implement Any Concurrent Data Structure. By Irina Calciu, Siddhartha Sen, Mahesh Balakrishnan, Marcos K. Aguilera. Communications of the ACM, 2018.

Black-box Concurrent Data Structures for NUMA Architectures. Irina Calciu, Siddhartha Sen, Mahesh Balakrishnan, Marcos K. Aguilera. ASPLOS, 2017.

Page 3:

concurrent data structures are everywhere

kernel

application libraries

applications

Page 4:

but efficient ones are hard to design

locks

transactional memory

lock-free and wait-free

Page 5:

effort in 2012–2014

The Future(s) of Shared Data Structures. Alex Kogan and Maurice Herlihy. PODC 2014.

Concurrent Updates with RCU: Search Tree as an Example. Maya Arbel and Hagit Attiya. PODC 2014.

Dynamic-Sized Nonblocking Hash Tables. Yujie Liu, Kunlong Zhang and Michael Spear. PODC 2014.

Efficient Lock-free Binary Search Trees. Bapi Chatterjee, Nhan Nguyen and Philippas Tsigas. PODC 2014.

The Amortized Complexity of Non-blocking Binary Search Trees. Faith Ellen, Panagiota Fatourou, Joanna Helga and Eric Ruppert. PODC 2014.

The Adaptive Priority Queue with Elimination and Combining. Irina Calciu, Hammurabi Mendes and Maurice Herlihy. DISC 2014.

Solo-fast Universal Constructions for Deterministic Abortable Objects. Claire Capdevielle, Colette Johnen and Alessia Milani. DISC 2014.

On Deterministic Abortable Objects. Vassos Hadzilacos and Sam Toueg. PODC 2013.

Leaplist: Lessons Learned in Designing TM-Supported Range Queries. Hillel Avni, Nir Shavit, and Adi Suissa. PODC 2013.

The SkipTrie: Low-Depth Concurrent Search without Rebalancing. Rotem Oshman and Nir Shavit. PODC 2013.

Pragmatic Primitives for Non-blocking Data Structures. Trevor Brown, Faith Ellen, and Eric Ruppert. PODC 2013.

Lock-Free Data Structure Iterators. Erez Petrank and Shahar Timnat. DISC 2013.

Practical Non-blocking Unordered Lists. Kunlong Zhang, Yujiao Zhao, Yajun Yang, Yujie Liu and Michael Spear. DISC 2013.

Atomic snapshots in expected $O(\log^3 n)$ steps using randomized helping. James Aspnes and Keren Censor-Hillel. DISC 2013.

An Optimal Implementation of Fetch-and-Increment. Faith Ellen and Philipp Woelfel. DISC 2013.

On the Time and Space Complexity of Randomized Test-And-Set. George Giakkoupis and Philipp Woelfel. PODC 2012.

Universal Constructions that Ensure Disjoint-Access Parallelism and Wait-Freedom. Faith Ellen, Panagiota Fatourou, Eleftherios Kosmas, Alessia Milani, and Corentin Travers. PODC 2012.

Faster than Optimal Snapshots (for a While). James Aspnes, Hagit Attiya, Keren Censor-Hillel, and Faith Ellen. PODC 2012.

Strongly Linearizable Implementations: Possibilities and Impossibilities. Maryam Helmi, Lisa Higham, and Philipp Woelfel. PODC 2012.

CBTree: A Practical Concurrent Self-Adjusting Search Tree. Yehuda Afek, Haim Kaplan, Boris Korenfeld, Adam Morrison, Robert E. Tarjan. DISC 2012.

Efficient Fetch-and-Increment. Faith Ellen, Vijaya Ramachandran, Philipp Woelfel. DISC 2012.

Page 6:

problems with concurrent data structure design

herculean effort for each data structure

rigid designs

an even greater problem…

Page 7:

problems with concurrent data structure design

herculean effort for each data structure

rigid designs

an even greater problem… new hardware architectures

Page 8:

our options?

1. underutilize the system

2. develop new data structures… for each new architecture

3. we think there is a better way

Page 9:

architecture-aware black-box data structures

[diagram: sequential data structures feed into transformation 1, transformation 2, and transformation 3, which target architecture 1, architecture 2, and architecture 3 respectively]

Page 10:

architecture-aware black-box data structures

[diagram: sequential data structures feed into transformation 1, transformation 2, and transformation 3, which target architecture 1, architecture 2, and architecture 3 respectively]

FOCUS OF REST OF TALK: NUMA architecture

Page 11:

the NR algorithm

Page 12:

NUMA architecture: Non-Uniform Memory Access

❖ local access more efficient

[diagram: two nodes; each node has four cores, each core with its own cache, plus a shared cache and a local memory]

Page 13:

evaluation

Intel Xeon E7-4850 v3

56 cores, 4 nodes

2.2 GHz

512 GB RAM

L3 35 MB

L2 256 KB

L1 64 KB

Page 14:

[plot: skip list priority queue – 10% updates; y-axis: ops/us; x-axis: # threads (1 to 110)]

legend: (NR) Node Replication, (LF) Lock-free, (FC) Flat Combining, (FC+) FC + RWL, (RWL) Readers-Writer Lock, (SL) Spinlock

Page 15:

[plot: data structure in REDIS: 10% updates; y-axis: ops/us; x-axis: # threads (1 to 110)]

legend: (NR) Node Replication, (FC+) FC + RWL, (RWL) Readers-Writer Lock, (FC) Flat Combining, (SL) Spinlock

Page 16:

the transformation

given single-threaded

execute(op,parameters)

isReadOnly(op)

we produce multi-threaded

execute(op,parameters)

works well in NUMA servers
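
To make the interface on this slide concrete, here is a minimal sketch in Python. It wraps a single-threaded structure exposing `execute` and `is_read_only` (Python spellings of the slide's `execute(op,parameters)` and `isReadOnly(op)`) behind a single lock. This is only the simplest conceivable black-box baseline, not the NR algorithm, and the names `SeqCounterMap`, `LockedWrapper`, and `run_demo` are hypothetical.

```python
import threading


class SeqCounterMap:
    """Hypothetical single-threaded structure: a map from keys to counters."""

    def __init__(self):
        self._data = {}

    def execute(self, op, *params):
        # Sequential semantics only; no internal synchronization.
        if op == "incr":
            (key,) = params
            self._data[key] = self._data.get(key, 0) + 1
            return self._data[key]
        if op == "get":
            (key,) = params
            return self._data.get(key, 0)
        raise ValueError(f"unknown op: {op}")

    def is_read_only(self, op):
        return op == "get"


class LockedWrapper:
    """Baseline black-box transformation: serialize every call with one lock."""

    def __init__(self, seq):
        self._seq = seq
        self._lock = threading.Lock()

    def execute(self, op, *params):
        with self._lock:
            return self._seq.execute(op, *params)


def run_demo(num_threads=4, incrs_per_thread=1000):
    shared = LockedWrapper(SeqCounterMap())

    def worker():
        for _ in range(incrs_per_thread):
            shared.execute("incr", "hits")

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.execute("get", "hits")
```

This baseline ignores `is_read_only`; a readers-writer lock, or a scheme like the one in this talk, would use it to let read-only operations proceed in parallel.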

Page 17:

key ideas

1. replicate data structure across (NUMA) nodes

state machine approach with a shared log

2. provide efficient NUMA-aware log

large effort to optimize log
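
The state machine idea in item 1 can be sketched as follows. This is a single-threaded illustration under simplifying assumptions, not the talk's implementation: the log is a plain Python list, the replicated structure is a stack, and the names `Replica`, `catch_up`, and `demo` are hypothetical. The point it shows is that replicas replaying the same log in the same order end up in the same state, even if they catch up at different times.

```python
class Replica:
    """One copy of a sequential structure plus its position in the log."""

    def __init__(self):
        self.state = []    # the sequential structure: a list used as a stack
        self.applied = 0   # number of log entries already replayed

    def apply(self, entry):
        op, arg = entry
        if op == "push":
            self.state.append(arg)
        elif op == "pop":
            if self.state:
                self.state.pop()
        else:
            raise ValueError(op)

    def catch_up(self, log):
        # Replay every entry this replica has not seen yet, in log order.
        while self.applied < len(log):
            self.apply(log[self.applied])
            self.applied += 1


def demo():
    log = []                           # shared, totally ordered update operations
    replicas = [Replica(), Replica()]  # e.g. one per NUMA node

    log.append(("push", 1))
    log.append(("push", 2))
    replicas[0].catch_up(log)          # replica 0 is current, replica 1 lags

    log.append(("pop", None))
    log.append(("push", 3))
    for r in replicas:
        r.catch_up(log)
    return [r.state for r in replicas]
```

Determinism of the sequential operations is what makes this work: the log stores operations rather than data, and each replica reconstructs the state locally.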

Page 18:

the transformation

[diagram: two NUMA nodes, each with a local replica and two threads]

Page 19:

the transformation

[diagram: two NUMA nodes, each with a local replica, a local tail, and two threads; plus a shared log with its log tail]
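
One way to read the roles of the log tail and the per-node local tails in this diagram is sketched below. It is a simplified illustration, assuming a dictionary as the replicated structure and ordinary locks where the talk uses more refined techniques; `SharedLog`, `NodeReplica`, and `demo` are hypothetical names. An update appends to the shared log and then replays the log on its own node; a read first brings its node's replica up to the log tail it observed.

```python
import threading


class SharedLog:
    def __init__(self):
        self._entries = []
        self._lock = threading.Lock()  # assumption: a lock stands in for the real log

    def append(self, entry):
        with self._lock:
            self._entries.append(entry)
            return len(self._entries)  # new log tail

    def tail(self):
        with self._lock:
            return len(self._entries)

    def get(self, index):
        with self._lock:
            return self._entries[index]


class NodeReplica:
    def __init__(self, log):
        self.log = log
        self.data = {}                 # local replica: a sequential dict
        self.local_tail = 0            # log entries already applied on this node
        self._lock = threading.Lock()  # protects data and local_tail

    def _catch_up(self, upto):
        # Caller holds self._lock. Replay log entries [local_tail, upto).
        while self.local_tail < upto:
            key, value = self.log.get(self.local_tail)
            self.data[key] = value
            self.local_tail += 1

    def update(self, key, value):
        tail = self.log.append((key, value))
        with self._lock:
            self._catch_up(tail)       # apply everything up to our own entry

    def read(self, key):
        tail = self.log.tail()         # log tail observed when the read starts
        with self._lock:
            self._catch_up(tail)       # replica must be at least this fresh
            return self.data.get(key)


def demo():
    log = SharedLog()
    node0, node1 = NodeReplica(log), NodeReplica(log)
    node0.update("x", 1)
    node1.update("x", 2)
    stale_tail = node0.local_tail      # node0 has not replayed node1's update yet
    value = node0.read("x")            # the read forces node0 to catch up first
    return stale_tail, value, node0.local_tail, node1.local_tail
```

In `demo`, node 0 is one entry behind after node 1's update, and its read returns the newer value only because it replays the log up to the observed tail before answering.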

Page 20:

how to implement log?

key observation

coordination within node cheaper than across nodes

within node: we use flat combining

across nodes: we use lock-free appending to log
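
The within-node half of this design, flat combining, can be sketched roughly as follows. Threads publish their requests, and whichever thread acquires the combiner lock executes the whole pending batch on behalf of the others. This is a generic illustration of flat combining, not the talk's code: `FlatCombiner`, `apply_batch`, and `run_demo` are hypothetical, `apply_batch` is a placeholder for whatever the combiner does with a batch, and the lock-free cross-node append is not shown because standard Python has no compare-and-swap primitive.

```python
import threading


class FlatCombiner:
    """Threads publish requests; whoever holds the lock runs the whole batch."""

    def __init__(self, apply_batch):
        self._apply_batch = apply_batch         # takes a list of ops, returns results
        self._pending = []                      # published, not yet executed
        self._pending_lock = threading.Lock()   # protects _pending
        self._combiner_lock = threading.Lock()  # held by the current combiner

    def execute(self, op):
        request = {"op": op, "done": threading.Event(), "result": None}
        with self._pending_lock:
            self._pending.append(request)
        while not request["done"].is_set():
            if self._combiner_lock.acquire(blocking=False):
                try:
                    self._combine()
                finally:
                    self._combiner_lock.release()
            else:
                # Someone else is combining; wait briefly for them to serve us.
                request["done"].wait(timeout=0.001)
        return request["result"]

    def _combine(self):
        with self._pending_lock:
            batch, self._pending = self._pending, []
        if not batch:
            return
        results = self._apply_batch([r["op"] for r in batch])
        for request, result in zip(batch, results):
            request["result"] = result
            request["done"].set()


def run_demo(num_threads=8, ops_per_thread=200):
    total = [0]

    def apply_batch(ops):
        # Sequential code: only one combiner runs this at a time.
        results = []
        for amount in ops:
            total[0] += amount
            results.append(total[0])
        return results

    fc = FlatCombiner(apply_batch)
    seen = []
    seen_lock = threading.Lock()

    def worker():
        local = [fc.execute(1) for _ in range(ops_per_thread)]
        with seen_lock:
            seen.extend(local)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0], sorted(seen)
```

Because only the combiner touches the underlying structure, the sequential code needs no synchronization of its own, and requests from several threads are handled as one batch.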

Page 21:

correctness

linearizability [Herlihy Wing 1990]: each operation appears to take effect instantaneously at a point between its invocation and response
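
As a small worked illustration of this definition, the sketch below checks by brute force whether a history of operations on a single read/write register is linearizable. It looks for a sequential order that respects real-time precedence and the register's sequential semantics. The function name `is_linearizable` and the tuple layout of a history entry are assumptions made for this example, and the search is exponential, so it suits only tiny histories.

```python
from itertools import permutations


def is_linearizable(history, initial=0):
    """history: list of (invoke_time, response_time, op, arg, result) on one register."""

    def respects_real_time(order):
        # If a responded before b was invoked, a must come before b.
        for i, a in enumerate(order):
            for b in order[:i]:
                if a[1] < b[0]:  # a finished before b started, yet b is placed first
                    return False
        return True

    def legal_sequentially(order):
        value = initial
        for _, _, op, arg, result in order:
            if op == "write":
                value = arg
            elif op == "read":
                if result != value:
                    return False
            else:
                raise ValueError(op)
        return True

    return any(
        respects_real_time(order) and legal_sequentially(order)
        for order in permutations(history)
    )


# A read overlapping a write may return either the old or the new value.
overlapping_new = [(0, 3, "write", 1, None), (1, 2, "read", None, 1)]
overlapping_old = [(0, 3, "write", 1, None), (1, 2, "read", None, 0)]

# A read that starts after the write has responded must see the written value.
stale_read = [(0, 1, "write", 1, None), (2, 3, "read", None, 0)]
```

Here `is_linearizable` accepts both overlapping histories and rejects `stale_read`, since no point between the read's invocation and response can precede the completed write.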

Page 22:

whence performance comes

• trade memory + computation for less communication

• compact representation of operations

• limited cross-node synchronization and contention

• enable parallelism

• combiners across nodes

• readers within a node

• readers and the combiner on the same node

• leverage batching

Page 23:

conclusion

• fundamental changes in hardware

• exposed to software developers

• take-away: instead of individual data structures, let’s develop general architecture-aware techniques