A Dynamic Elimination-Combining Stack Algorithm

Gal Bar-Nissan, Danny Hendler and Adi Suissa
Department of Computer Science, BGU, January 2011
Presented by: Ilya Mirsky, 28.03.2011

Page 1: A Dynamic Elimination-Combining Stack Algorithm

A Dynamic Elimination-Combining Stack Algorithm
Gal Bar-Nissan, Danny Hendler and Adi Suissa
Department of Computer Science, BGU, January 2011

Presented by: Ilya Mirsky, 28.03.2011

Page 2: A Dynamic Elimination-Combining Stack Algorithm

Outline
    Concurrent programming terms
    Motivation
    Introduction
    DECS: The Algorithm
    DECS Performance evaluation
    NB-DECS
    Summary

Page 3: A Dynamic Elimination-Combining Stack Algorithm

Concurrent programming terms
    Locks (coarse- and fine-grained)
    Non-blocking algorithms
        Wait-freedom
        Lock-freedom
        Obstruction-freedom
    Linearizability
    Memory contention
    Latency

Page 5: A Dynamic Elimination-Combining Stack Algorithm

Motivation
    Concurrent stacks are widely used in parallel applications and operating systems.
    A simple implementation based on coarse-grained locking creates a “hot spot” at the central stack object and forms a sequential bottleneck.
    There is a need for a scalable concurrent stack that performs well under low, medium and high workloads, regardless of the ratio between operation types (push/pop).

Page 7: A Dynamic Elimination-Combining Stack Algorithm

Introduction
    Two key synchronization paradigms for the construction of scalable concurrent data structures are software combining and elimination.
    The most highly scalable concurrent stack algorithm previously known is the lock-free elimination-backoff stack (Hendler, Shavit, Yerushalmi), referred to here as the HSY stack.
    The HSY stack is highly efficient under low contention, as well as under high contention when the workload is symmetric.
    Unfortunately, when workloads are asymmetric, the performance of HSY deteriorates to that of a sequential stack.
    Flat-combining (by Hendler et al.) significantly outperforms HSY under low and medium contention, but it does not scale and even deteriorates at high contention levels.

Page 8: A Dynamic Elimination-Combining Stack Algorithm

Introduction - DECS
    DECS employs both combining and elimination mechanisms.
    It scales well for all workload types and outperforms other stack implementations.
    It maintains the simplicity and low overhead of the HSY stack.
    It uses a contention-reduction layer, the elimination-combining layer, as a backoff scheme for a central stack.
    A non-blocking implementation is also presented: NB-DECS, a lock-free variant of DECS in which threads that have waited too long may cancel their “combining contract” and retry their operation on the central stack.
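
As a rough illustration of the combining half of the layer, the sketch below shows how two colliding operations of the same type might be merged into a single multi-operation. It is not taken from the slides: the field names follow the MultiOp record on the data-structures slide, while the `initOp` and `combine` helpers and the use of raw pointers are assumptions made for this example.

```cpp
#include <cassert>

enum Op { PUSH, POP };

// Simplified, hypothetical MultiOp: only the fields needed for combining.
struct MultiOp {
    int id;
    Op op;
    int length;      // number of operations represented by this list
    MultiOp* next;   // next MultiOp in the combined list
    MultiOp* last;   // tail of the combined list
};

// Assumed helper: a fresh MultiOp represents one operation and is its own tail.
void initOp(MultiOp& m, int id, Op op) {
    m.id = id;
    m.op = op;
    m.length = 1;
    m.next = nullptr;
    m.last = &m;
}

// Assumed helper: the active collider absorbs the passive collider's list.
// Precondition: both lists carry operations of the same type.
void combine(MultiOp& active, MultiOp& passive) {
    assert(active.op == passive.op);
    active.last->next = &passive;      // append the passive list to the tail
    active.last = passive.last;        // the passive tail becomes the new tail
    active.length += passive.length;   // one thread now represents all of them
}
```

After `combine`, the active thread carries the whole list and can apply it to the central stack (or collide again), while the passive thread waits for its operation to be completed on its behalf.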

Page 9: A Dynamic Elimination-Combining Stack Algorithm

Introduction - DECS

[Animated figure, slides 9-16. Labels: Central Stack; Elimination-combining layer; “zzz…” (waiting threads); “Wake up!”.]

Page 18: A Dynamic Elimination-Combining Stack Algorithm

DECS - The Algorithm
The data structures

[Figure: the Central Stack and the Elimination-combining layer. The layer contains a Collision Array (example entries: 1, 6, 4) and a Locations Array.]

MultiOp
    int id;
    int op;
    int length;
    int cStatus;
    Cell cell;
    MultiOp next;
    MultiOp last;

Cell (drawn four times, as the cells of the central stack)
    Data data;
    Cell next;

Page 19: A Dynamic Elimination-Combining Stack Algorithm

DECS - The Algorithm

[Figure: operations arriving at the Central Stack: push(data1), push(data2), pop(). Thought bubble (shown twice): “I wish there was someone in a similar situation…”]

Page 20: A Dynamic Elimination-Combining Stack Algorithm

DECS - The Algorithm

multiOp tInfo = initMultiOp();

multiOp tInfo = initMultiOp(data);
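
The two calls above can be read against the record layout from the data-structures slide. The following is a minimal sketch of what `initMultiOp` might look like, not the authors' code: the explicit `threadId` parameter, the `std::string` payload type, the enum names and the use of `nullptr` for the cell drawn as EMPTY are all assumptions.

```cpp
#include <cassert>
#include <string>

enum Op { PUSH, POP };
enum CStatus { INIT, FINISHED };

using Data = std::string;   // assumption: the slides do not give the payload type

struct Cell {
    Data data;
    Cell* next;
};

struct MultiOp {
    int id;          // id of the owning thread
    Op op;           // PUSH or POP
    int length;      // number of operations in this (possibly combined) list
    CStatus cStatus; // INIT until a collider completes the operation
    Cell* cell;      // pushed cell, or the slot that receives a popped cell
    MultiOp* next;   // next MultiOp in a combined list
    MultiOp* last;   // tail of the combined list
};

// Pop variant (hypothetical signature): no cell yet, drawn as EMPTY on the slides.
MultiOp* initMultiOp(int threadId) {
    MultiOp* m = new MultiOp{threadId, POP, 1, INIT, nullptr, nullptr, nullptr};
    m->last = m;   // a single operation is its own tail
    return m;
}

// Push variant (hypothetical signature): wraps the data in a fresh cell.
MultiOp* initMultiOp(int threadId, const Data& data) {
    Cell* c = new Cell{data, nullptr};
    MultiOp* m = new MultiOp{threadId, PUSH, 1, INIT, c, nullptr, nullptr};
    m->last = m;
    return m;
}
```

The caller owns the returned records in this sketch; memory reclamation in a concurrent setting is deliberately left out.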

Page 21: A Dynamic Elimination-Combining Stack Algorithm

DECS - The Algorithm

[Figure: threads T. 2 and T. 6 meet through the Collision Array and the Locations Array. Collision array entries shown: 4, EMPTY, 6.]

MultiOp of T. 2: id = 2, op = POP, length = 1, cStatus = INIT, cell (EMPTY), next = NULL, last
MultiOp of T. 6: id = 6, op = PUSH, length = 1, cStatus = INIT, cell (data1), next = NULL, last

T. 6 (passive collider): “I’ll wait, maybe someone will arrive…”
T. 2 (active collider): “Yay, I can collide with thread 6!”
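
The meeting of an active and a passive collider can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm from the slides: the array sizes, the `EMPTY` marker value, the `initLayer` and `tryCollide` names, and the use of a single atomic exchange are all choices made for this example, and the follow-up checks a real implementation needs (for instance, that the partner is still available) are omitted.

```cpp
#include <atomic>
#include <cassert>

constexpr int EMPTY = -1;           // assumed marker for a free collision slot
constexpr int MAX_THREADS = 8;      // assumed size of the locations array
constexpr int COLLISION_SLOTS = 4;  // assumed size of the collision array

struct MultiOp { int id; };         // only the id is needed for this step

std::atomic<MultiOp*> locations[MAX_THREADS];
std::atomic<int> collision[COLLISION_SLOTS];

void initLayer() {
    for (auto& l : locations) l.store(nullptr);
    for (auto& c : collision) c.store(EMPTY);
}

// Publishes the caller's operation and announces it in one collision slot.
// Returns the MultiOp of the thread that was already there (the caller is
// then the active collider), or nullptr if the slot was empty (the caller
// is then a passive collider and waits). Choosing the slot, for example at
// random, is left to the caller.
MultiOp* tryCollide(MultiOp* mine, int slot) {
    locations[mine->id].store(mine);                  // publish my operation
    int other = collision[slot].exchange(mine->id);   // announce myself
    if (other == EMPTY) return nullptr;               // nobody here yet
    return locations[other].load();                   // found a partner
}
```

In the scenario on the slide, T. 6 would call `tryCollide` first and get `nullptr`, and T. 2 would then receive T. 6's record from the same slot.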

Page 22: A Dynamic Elimination-Combining Stack Algorithm

DECS - The Algorithm
Central Stack Functions
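
The listing for this slide did not survive in the transcript. As a stand-in, here is a generic Treiber-style central stack with one-shot `tryPush` and `tryPop` operations that report failure under contention, so that a caller could back off to the elimination-combining layer. The class and method names and the `int` payload are assumptions, and ABA protection and memory reclamation are not addressed.

```cpp
#include <atomic>
#include <cassert>

struct Cell {
    int data;     // assumption: int payload for brevity
    Cell* next;
};

class CentralStack {
    std::atomic<Cell*> top{nullptr};

public:
    // Makes a single CAS attempt. Returns false if another thread changed
    // the top in the meantime; the caller may then back off and retry.
    bool tryPush(Cell* c) {
        Cell* old = top.load();
        c->next = old;
        return top.compare_exchange_strong(old, c);
    }

    // Makes a single CAS attempt. Returns false on contention. On success,
    // out is the popped cell, or nullptr if the stack was empty.
    bool tryPop(Cell*& out) {
        Cell* old = top.load();
        if (old == nullptr) {
            out = nullptr;
            return true;
        }
        if (!top.compare_exchange_strong(old, old->next)) return false;
        out = old;
        return true;
    }
};
```

A failed attempt is the signal to visit the layer instead of spinning on the top pointer, which is what makes the layer act as a backoff scheme.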

Page 25: A Dynamic Elimination-Combining Stack Algorithm

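DECS - The Algorithm

[Figure: T. 6 sleeps (“zzz…”) while T. 2 inspects both MultiOp records; the Collision Array and the Locations Array are shown.]

MultiOp of T. 2: id = 2, op = POP, length = 1, cStatus = INIT, cell (EMPTY), next = NULL, last
MultiOp of T. 6: id = 6, op = PUSH, length = 1, cStatus = INIT, cell (data1), next = NULL, last

T. 2: “I see that T. 6 got PUSH, and I got POP - we can eliminate!”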
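
The elimination decided on by T. 2 can be sketched as below. The end state (length = 0 and cStatus = FINISHED for both records) matches what the following slides show; the `tryEliminate` name, the constructor and the direct hand-over of the cell pointer are assumptions made for this example.

```cpp
#include <atomic>
#include <cassert>
#include <string>

enum Op { PUSH, POP };
enum CStatus { INIT, FINISHED };

struct Cell { std::string data; };

struct MultiOp {
    int id;
    Op op;
    int length;
    std::atomic<CStatus> cStatus;
    Cell* cell;
    MultiOp(int id_, Op op_, Cell* cell_)
        : id(id_), op(op_), length(1), cStatus(INIT), cell(cell_) {}
};

// Called by the active collider. Returns true if the pair was eliminated,
// false if both operations have the same type (the combining case).
bool tryEliminate(MultiOp& active, MultiOp& passive) {
    if (active.op == passive.op) return false;
    MultiOp& pusher = (active.op == PUSH) ? active : passive;
    MultiOp& popper = (active.op == POP) ? active : passive;
    popper.cell = pusher.cell;        // the pop receives the pushed cell directly
    active.length = 0;                // nothing is left to apply
    passive.length = 0;
    active.cStatus.store(FINISHED);
    passive.cStatus.store(FINISHED);  // written last: releases the passive thread
    return true;
}
```

Neither operation touches the central stack, which is why a symmetric workload benefits so much from elimination.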

Page 26: A Dynamic Elimination-Combining Stack Algorithm

DECS - The Algorithm
Elimination-Combining Layer Functions

Page 27: A Dynamic Elimination-Combining Stack Algorithm

DECS - The Algorithm

[Figure, animated over slides 27-28: T. 6 sleeps (“zzz…”) while T. 2 performs the elimination (“Working…”, then “Done!”).]

Before:
MultiOp of T. 2: id = 2, op = POP, length = 1, cStatus = INIT, cell (EMPTY), next = NULL, last
MultiOp of T. 6: id = 6, op = PUSH, length = 1, cStatus = INIT, cell (data1), next = NULL, last

After:
MultiOp of T. 2: id = 2, op = POP, length = 0, cStatus = FINISHED, cell, next = NULL, last
MultiOp of T. 6: id = 6, op = PUSH, length = 0, cStatus = FINISHED, cell, next = NULL, last
(On slide 28 the EMPTY cell no longer appears; only data1 remains.)

Page 30: A Dynamic Elimination-Combining Stack Algorithm

DECS - The Algorithm

[Figure: T. 2 wakes the sleeping T. 6.]

T. 2: “Wake up man, I’ve done your job!”
T. 6: “Thank you T. 2, let’s go have a beer; I’m buying!”
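
The hand-off in this exchange amounts to the passive thread waiting on its own status field until the active thread marks the operation as finished. A minimal sketch, assuming a spin-wait on an atomic `cStatus` and a hypothetical `result` field for the value delivered to the sleeper:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

enum CStatus { INIT, FINISHED };

struct MultiOp {
    std::atomic<CStatus> cStatus{INIT};
    int result = 0;   // hypothetical slot for a value delivered by the partner
};

// Passive collider: waits until someone else has completed the operation.
int waitUntilFinished(MultiOp& mine) {
    while (mine.cStatus.load(std::memory_order_acquire) != FINISHED) {
        std::this_thread::yield();   // "zzz…"
    }
    return mine.result;
}

// Active collider: publishes the result first, then wakes the sleeper.
void finishFor(MultiOp& passive, int value) {
    passive.result = value;
    passive.cStatus.store(FINISHED, std::memory_order_release);  // "Wake up!"
}
```

In the deck's scenario the two functions run on different threads: T. 6 calls `waitUntilFinished` on its own record while T. 2 calls `finishFor` on it. The release store paired with the acquire load ensures the sleeper sees the result once it sees FINISHED. Because the wait is unbounded, a stalled partner blocks the sleeper forever, which is the blocking behaviour that NB-DECS removes.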

Page 34: A Dynamic Elimination-Combining Stack Algorithm

DECS Performance Evaluation
Hardware
    128-way UltraSPARC T2 Plus (T5140) server: a 2-chip system in which each chip contains 8 cores and each core multiplexes 8 hardware threads (2 × 8 × 8 = 128).
    Running the Solaris 10 OS.
    The cores in each CPU share the same L2 cache.
    C++ code compiled with GCC using the -O3 flag.
Compared against:
    Treiber stack
    The HSY elimination-backoff stack
    Flat-combining stack

Page 35: A Dynamic Elimination-Combining Stack Algorithm

DECS Performance Evaluation
Course of experiments
    Threads repeatedly apply operations on the stack for a fixed duration of 1 second, and the resulting throughput is measured while the level of concurrency varies from 1 to 128 threads.
    Throughput is measured on both symmetric and asymmetric workloads.
    Stacks are pre-populated with enough cells so that pop operations do not operate on an empty stack.
    Each data point is the average of 3 runs.
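
The measurement procedure described above can be sketched as a small harness. This is an illustration only, not the authors' benchmark: the mutex-protected `LockedStack` is a stand-in for the stack under test, and the function names, the pre-population size and the way a symmetric workload is generated are assumptions.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in stack (coarse-grained lock); any stack with push/pop would do.
class LockedStack {
    std::mutex m;
    std::vector<int> items;
public:
    void push(int v) { std::lock_guard<std::mutex> g(m); items.push_back(v); }
    bool pop(int& v) {
        std::lock_guard<std::mutex> g(m);
        if (items.empty()) return false;
        v = items.back();
        items.pop_back();
        return true;
    }
};

// One run: returns operations per second for the given number of threads.
double runOnce(int threads, std::chrono::milliseconds duration) {
    LockedStack stack;
    for (int i = 0; i < 100000; ++i) stack.push(i);   // pre-populate
    std::atomic<bool> stop{false};
    std::atomic<long long> totalOps{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&, t] {
            long long ops = 0;
            int v = 0;
            while (!stop.load()) {
                // Symmetric workload: each thread alternates push and pop.
                if ((ops + t) % 2 == 0) stack.push(t); else stack.pop(v);
                ++ops;
            }
            totalOps += ops;
        });
    }
    std::this_thread::sleep_for(duration);   // fixed measurement window
    stop.store(true);
    for (auto& w : workers) w.join();
    return totalOps.load() / std::chrono::duration<double>(duration).count();
}

// Each data point is the average of several runs (3 in the slides).
double averageThroughput(int threads, std::chrono::milliseconds duration,
                         int runs = 3) {
    double sum = 0;
    for (int r = 0; r < runs; ++r) sum += runOnce(threads, duration);
    return sum / runs;
}
```

Calling `averageThroughput(n, std::chrono::milliseconds(1000))` for n from 1 to 128 would follow the setup above; an asymmetric workload would change only the push/pop choice inside the worker loop.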

Page 36: A Dynamic Elimination-Combining Stack Algorithm

DECS Performance Evaluation

[Figures, slides 36-38: throughput plots for the symmetric, moderately-asymmetric and fully-asymmetric workloads. X-axis: number of threads.]

Page 40: A Dynamic Elimination-Combining Stack Algorithm

NB-DECS
    DECS is blocking.
    For some applications a non-blocking implementation may be preferable because it is more robust to thread failures.
    NB-DECS is a lock-free variant of DECS that allows threads that delegated their operations to another thread, and have waited too long, to cancel their “combining contracts” and retry their operations.
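
The slides do not show how a contract is cancelled. One plausible way to arrange it, given purely as an assumption, is to let the waiting thread and its partner race with a compare-and-swap on the status field, so that exactly one of them decides the outcome. The `CANCELLED` state, the function names and the spin bound are hypothetical.

```cpp
#include <atomic>
#include <cassert>

enum CStatus { INIT, FINISHED, CANCELLED };   // CANCELLED is hypothetical

struct MultiOp {
    std::atomic<CStatus> cStatus{INIT};
};

enum WaitResult { DONE_BY_OTHER, RETRY_ON_CENTRAL_STACK };

// Waiting side: waits for a bounded number of checks, then tries to cancel.
WaitResult waitOrCancel(MultiOp& mine, int maxSpins) {
    for (int i = 0; i < maxSpins; ++i) {
        if (mine.cStatus.load() == FINISHED) return DONE_BY_OTHER;
    }
    CStatus expected = INIT;
    if (mine.cStatus.compare_exchange_strong(expected, CANCELLED)) {
        return RETRY_ON_CENTRAL_STACK;   // contract cancelled, retry alone
    }
    return DONE_BY_OTHER;                // the partner finished just in time
}

// Partner side: may complete the operation only if it was not cancelled.
bool tryFinish(MultiOp& other) {
    CStatus expected = INIT;
    return other.cStatus.compare_exchange_strong(expected, FINISHED);
}
```

Because both transitions start from INIT and use a CAS, an operation is never both cancelled and completed, and a waiting thread is never stuck behind a partner that has stopped making progress.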

Page 42: A Dynamic Elimination-Combining Stack Algorithm

Summary
    DECS comprises a combining-elimination layer and therefore benefits from collisions between operations with reverse semantics as well as between operations with identical semantics.
    Empirical evaluation showed that DECS outperforms the best previously known stack algorithms on all workloads.
    NB-DECS
    The idea of a combining-elimination layer could be used to efficiently implement other concurrent data structures.