
1

DECS: A Dynamic Elimination-Combining Stack Algorithm

Gal Bar-Nissan, Danny Hendler, Adi Suissa

OPODIS 2011

2

Stack data-structure

We focus on the stack data-structure, which supports two operations:
◦ push(v) – adds a new element (with value v) to the top of the stack
◦ pop – removes the top element from the stack and returns it

3

Previous work – IBM/Treiber algorithm [1986]

Linked-list based
Shared top pointer

[Figure: push operation and pop operation on the linked list, with labels top, next, old and new]

Non-blocking algorithm
Poor scalability (essentially sequential)
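The push and pop loops of a Treiber-style stack can be sketched as follows. This is a minimal illustration, not the authors' code: Python has no hardware compare-and-swap, so the hypothetical `AtomicRef` class emulates one with a lock, and all class and method names are assumptions made for this example.

```python
import threading

class AtomicRef:
    """Emulated atomic reference (assumption: the lock stands in for a hardware CAS)."""
    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        return self._value

    def compare_and_set(self, expected, new):
        # Succeeds only if nobody changed the reference since it was read.
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class TreiberStack:
    def __init__(self):
        self.top = AtomicRef(None)   # the shared top pointer

    def push(self, value):
        while True:                  # retry until the CAS succeeds
            old = self.top.get()
            new = Node(value, old)
            if self.top.compare_and_set(old, new):
                return

    def pop(self):
        while True:
            old = self.top.get()
            if old is None:
                return None          # empty stack
            if self.top.compare_and_set(old, old.next):
                return old.value

stack = TreiberStack()
for v in (1, 2, 3):
    stack.push(v)
popped = [stack.pop() for _ in range(4)]
```

Every operation contends on the single `top` reference, which is why the slide calls the algorithm essentially sequential.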

4

Previous work – Flat-combining [Hendler, Incze, Shavit, Tzafrir, 2010]

A list of operations to be performed

Each thread adds its operation to the list
One of the threads acquires a global lock and performs the combined operation
Other threads spin and wait for their operation to be performed

[Figure: list of pending push and pop operations]

Minimizes synchronization
Blocking algorithm
Limited scalability (essentially sequential)
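The publish, combine and spin steps listed above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the published flat-combining implementation: the publication list is protected by its own small lock (`pending_lock`) rather than being lock-free, and all names are hypothetical.

```python
import threading
import time

class Request:
    def __init__(self, op, value=None):
        self.op = op            # "push" or "pop"
        self.value = value
        self.result = None
        self.done = False

class FlatCombiningStack:
    def __init__(self):
        self.items = []                         # sequential stack, touched only by the combiner
        self.pending = []                       # list of published operations
        self.pending_lock = threading.Lock()    # simplification: guards the list itself
        self.combiner_lock = threading.Lock()   # the global lock

    def _combine(self):
        # Take everything published so far and apply it in one pass.
        with self.pending_lock:
            batch, self.pending = self.pending, []
        for req in batch:
            if req.op == "push":
                self.items.append(req.value)
            else:
                req.result = self.items.pop() if self.items else None
            req.done = True

    def _apply(self, op, value=None):
        req = Request(op, value)
        with self.pending_lock:
            self.pending.append(req)            # publish the operation
        while not req.done:
            if self.combiner_lock.acquire(blocking=False):
                try:
                    self._combine()             # this thread became the combiner
                finally:
                    self.combiner_lock.release()
            else:
                time.sleep(0)                   # spin politely while someone else combines
        return req.result

    def push(self, value):
        self._apply("push", value)

    def pop(self):
        return self._apply("pop")

fc = FlatCombiningStack()
fc.push(1)
fc.push(2)
first = fc.pop()
```

Only the thread holding `combiner_lock` touches `items`, so the stack itself needs no further synchronization; the price is that waiting threads depend on the combiner making progress, which is the blocking behavior the slide notes.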

5

Previous work – Elimination Backoff (HSY) [Hendler, Shavit, Yerushalmi, 2004]

Eliminating reverse semantics operations

A thread attempts its operation:
1. On the central stack (IBM/Treiber algorithm)
2. Elimination Backoff – Eliminate with another thread

[Figure: threads T1, T2 and T3 with pop and push( ) operations, and the Central Stack]

Non-blocking algorithm
Provides parallelism if workloads are symmetric

6

Our contributions

DECS – A Dynamic Elimination-Combining Stack algorithm

Dynamically employs either of two techniques:
1. Elimination
2. Combining

A non-blocking version (NB-DECS)

7

DECS – Dynamic Elimination-Combining Stack

Employs IBM/Treiber's algorithm as a central stack

A thread attempts its operation:
1. On the central stack
2. Elimination-Combining Backoff – Eliminate or Combine with another thread

8

Elimination-Combining layer

1. A thread attempts its operation on the central stack
2. If that fails, it registers itself in a publication array
3. It then chooses a random index from the publication array, and looks for another thread

If no other thread is found, the thread waits

[Figure: T1 attempts op1 on the Central Stack, then appears in the publication array]
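The register-then-probe steps above can be illustrated with a small single-threaded model. It is only a sketch: the slot layout (one slot per thread id, holding a dictionary) and the names `register` and `probe` are assumptions made for this example, and a real implementation would update slots with atomic operations.

```python
import random

def register(publication, tid, op, value=None):
    """Publish this thread's pending operation in its own slot."""
    publication[tid] = {"tid": tid, "op": op, "value": value}

def probe(publication, tid, rng=random):
    """Pick one random slot and return another thread's record, or None."""
    idx = rng.randrange(len(publication))
    other = publication[idx]
    if other is None or other["tid"] == tid:
        return None        # nobody there (or only ourselves): the caller waits
    return other

publication = [None] * 4
register(publication, 0, "push", 7)
first_probe = probe(publication, 0)      # no other thread has registered yet
register(publication, 1, "pop")
```

After the second `register` call, a probe by thread 0 either misses (an empty slot or its own) or returns thread 1's record, at which point the two threads collide.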

9

Elimination-Combining layer (cont'd)

4. Another thread that fails operating on the central stack also registers in the array and tries to find another thread
5. If it finds another thread with a reverse semantics operation (op1 != op2), the operations are eliminated

[Figure: T2 attempts op2 on the Central Stack, then meets T1 in the publication array]

10

Elimination-Combining layer (cont'd)

6. If both threads have identical operation semantics (op1 == op2), one thread delegates its operation to the other thread (the delegate thread)

[Figure: T2 attempts op2 on the Central Stack, then meets T1 in the publication array]
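The choice between elimination and combining when two threads collide can be sketched as below. This is an illustrative model rather than the paper's Collide function: the `MultiOp` record, its fields, and the rule that the thread initiating the collision becomes the delegate are all assumptions.

```python
class MultiOp:
    """One thread's pending operation plus any operations delegated to it."""
    def __init__(self, tid, op, value=None):
        self.tid = tid
        self.op = op            # "push" or "pop"
        self.value = value      # value to push, or the slot for a popped value
        self.next = None        # next delegated operation in the chain
        self.last = self        # tail of the chain, for constant-time appends
        self.length = 1

def collide(active, passive):
    """Eliminate opposite operations; combine identical ones."""
    if active.op != passive.op:
        pusher, popper = (active, passive) if active.op == "push" else (passive, active)
        popper.value = pusher.value     # the pop returns the pushed value
        return "eliminated"
    # Same type: the passive thread's chain is appended to the active thread's,
    # so the active thread now acts as delegate for both.
    active.last.next = passive
    active.last = passive.last
    active.length += passive.length
    return "combined"

t1 = MultiOp("T1", "push", 10)
t2 = MultiOp("T2", "pop")
outcome_a = collide(t2, t1)     # op1 != op2

t3 = MultiOp("T3", "push", 20)
t4 = MultiOp("T4", "push", 30)
outcome_b = collide(t3, t4)     # op1 == op2
```

In the first collision neither operation touches the central stack; in the second, `t3` ends up holding a chain of two pushes that it can later apply together.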

11

Multi-Push

[Figure: T1 performs push on the Central Stack on behalf of T1, Ta and Tb]
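One way a delegate could push a whole chain of values with a single compare-and-swap is sketched here. It is a single-threaded model under assumptions: `cas_top` is a stand-in for an atomic CAS on the top pointer, the names are hypothetical, and the order in which combined values land on the stack is a choice made for this example.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class CentralStack:
    def __init__(self):
        self.top = None

    def cas_top(self, expected, new):
        # Single-threaded stand-in for an atomic compare-and-swap on top.
        if self.top is expected:
            self.top = new
            return True
        return False

    def to_list(self):
        out, node = [], self.top
        while node is not None:
            out.append(node.value)
            node = node.next
        return out

def multi_push(stack, values):
    """Link all values into a private chain, then splice it in with one CAS."""
    first = last = Node(values[0])
    for v in values[1:]:
        last.next = Node(v)
        last = last.next
    while True:
        old = stack.top
        last.next = old                 # chain tail points at the current top
        if stack.cas_top(old, first):   # one CAS publishes every element
            return

stack = CentralStack()
multi_push(stack, ["base"])
multi_push(stack, ["T1", "Ta", "Tb"])
contents = stack.to_list()
```

The chain is built privately, so the shared top pointer is touched once regardless of how many operations were combined.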

12

Multi-Pop

[Figure: T1 performs pop on the Central Stack on behalf of T1, Ta, Tb and Tc, removing M elements]

M = min{stack_size, multi_op_size}
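The formula M = min{stack_size, multi_op_size} can be illustrated with a single-threaded sketch. The names are hypothetical, `cas_top` stands in for an atomic compare-and-swap, and reporting an empty stack to the unserved pops as `None` is an assumption.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class CentralStack:
    def __init__(self):
        self.top = None

    def push(self, value):
        self.top = Node(value, self.top)

    def cas_top(self, expected, new):
        # Single-threaded stand-in for an atomic compare-and-swap on top.
        if self.top is expected:
            self.top = new
            return True
        return False

def multi_pop(stack, multi_op_size):
    """Remove M = min(stack_size, multi_op_size) elements with one CAS."""
    while True:
        old = stack.top
        node, values = old, []
        # Walk at most multi_op_size nodes; stop early if the stack runs out.
        while node is not None and len(values) < multi_op_size:
            values.append(node.value)
            node = node.next
        if stack.cas_top(old, node):
            # Pops beyond M find the stack empty.
            return values + [None] * (multi_op_size - len(values))

stack = CentralStack()
for v in (1, 2, 3):
    stack.push(v)
served = multi_pop(stack, 4)    # four combined pops, three elements available
```

Here `stack_size` is 3 and `multi_op_size` is 4, so M is 3: three pops receive values and the fourth is told the stack is empty.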

13

Multi-Eliminate

[Figure: T1 with a push multi-op (T1, Ta, Tb) meets T2 with a pop multi-op (T2, Tc, Td, Te); label: Retry!]
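The pairing performed when two multi-ops of opposite types meet can be sketched as follows. This is an illustrative model, not the paper's MultiEliminate function: the function signature and the decision to leave the tail of the longer list for a retry are assumptions.

```python
def multi_eliminate(pushes, pop_waiters):
    """Pair pushes with pops one-to-one; return what is left over.

    pushes      -- list of (thread id, value) for the push multi-op
    pop_waiters -- list of thread ids for the pop multi-op
    """
    k = min(len(pushes), len(pop_waiters))
    # Each of the first k pops receives the value of the matching push.
    results = {pop_waiters[i]: pushes[i][1] for i in range(k)}
    # Whatever remains on the longer side must be retried.
    return results, pushes[k:], pop_waiters[k:]

pushes = [("T1", 10), ("Ta", 20), ("Tb", 30)]
pops = ["T2", "Tc", "Td", "Te"]
results, leftover_pushes, leftover_pops = multi_eliminate(pushes, pops)
```

With three pushes and four pops, three pairs are eliminated without touching the central stack and one pop remains to be retried.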

14

Data-structures

Push & Pop operations

MultiPop function

17

Collide function

18

ActiveCollide, Combine functions

19

MultiEliminate function

20

PassiveCollide

21

Experimental Evaluation

Evaluated on an UltraSPARC T2+: an 8-core CPU (each core with 8 hardware threads), 64 hardware threads in total

Compared DECS with:
◦ Treiber (with exponential backoff)
◦ HSY (elimination backoff) algorithm
◦ Flat-Combining (FC) stack

22

Symmetric workload

50% push – 50% pop

[Figure: Throughput as a function of Threads]

23

Moderately Asymmetric

75% push – 25% pop

[Figure: Throughput as a function of Threads]

24

Fully Asymmetric

100% push – 0% pop

[Figure: Throughput as a function of Threads]

25

DECS summary

Scalable
Provides parallelism even for asymmetric workloads
Blocking

26

Non-blocking DECS

A non-blocking algorithm is more robust to thread failures

Similar to DECS, but threads that delegate an operation do not wait indefinitely

A thread stops waiting by signaling its delegate thread

27

NB-DECS - example

A thread may stop waiting after some timeout

[Figure: T1 with a push multi-op (T1, Ta, Tb) and the Central Stack; X marks appear in the diagram]

28

NB-DECS - overhead

1. Test-and-set validation of each popped element from the central stack
2. Elements must be popped from the central stack one-by-one
3. Test-and-set validation on eliminated operations
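A test-and-set flag can arbitrate between a waiting thread that gives up and a delegate that is about to perform its operation. The sketch below is an assumption about how such validation might look, not the NB-DECS code: the `Operation` class and both helper functions are hypothetical, and the lock only emulates an atomic test-and-set.

```python
import threading

class Operation:
    def __init__(self, tid):
        self.tid = tid
        self._taken = False
        self._lock = threading.Lock()   # emulates an atomic test-and-set

    def test_and_set(self):
        """Set the flag and return its previous value."""
        with self._lock:
            previous = self._taken
            self._taken = True
            return previous

def stop_waiting(op):
    """Called by a waiting thread after its timeout; True if it withdrew in time."""
    return not op.test_and_set()

def delegate_apply(ops):
    """Return the thread ids whose operations the delegate may still perform."""
    return [op.tid for op in ops if not op.test_and_set()]

ops = [Operation("T1"), Operation("Ta"), Operation("Tb")]
withdrew = stop_waiting(ops[1])     # Ta times out before the delegate reaches it
applied = delegate_apply(ops)       # the delegate claims the remaining operations
too_late = stop_waiting(ops[2])     # Tb's timeout loses the race
```

Whichever side sets the flag first wins, so an operation is never both withdrawn and performed. The extra flag check on every element is the kind of per-operation cost the overhead list describes.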

29

Symmetric workload

50% push – 50% pop

[Figure: Throughput as a function of Threads]

30

Moderately Asymmetric

75% push – 25% pop

[Figure: Throughput as a function of Threads]

31

Moderately Asymmetric

25% push – 75% pop

[Figure: Throughput as a function of Threads]
