MAMA: Mostly Automatic Management of Atomicity

Christian DeLozier, Joseph Devietti, Milo M. K. Martin, University of Pennsylvania, March 2nd, 2014


Page 1: MAMA: Mostly Automatic Management of Atomicity

MAMA: Mostly Automatic Management of Atomicity

Christian DeLozier, Joseph Devietti, Milo M. K. Martin

University of Pennsylvania

March 2nd, 2014

Page 2: MAMA: Mostly Automatic Management of Atomicity


Start with a serial problem

Christian DeLozier - 2014

Page 3: MAMA: Mostly Automatic Management of Atomicity

Find and express the parallelism

Page 4: MAMA: Mostly Automatic Management of Atomicity

Coordinate the parallel execution (synchronization)

Page 5: MAMA: Mostly Automatic Management of Atomicity

Don’t mess up!

Page 6: MAMA: Mostly Automatic Management of Atomicity

Is there another way to do this?

• Programmer currently has to:
  1. Express the parallelism (Hard)
  2. Coordinate the parallelism (Hard)

• Alternative:
  1. Programmer expresses the parallelism
  2. Machine handles coordination

Page 7: MAMA: Mostly Automatic Management of Atomicity

Coordinating Parallel Execution

• Atomicity vs. Ordering
  • Types of concurrency bugs [Lu et al., ASPLOS 2008]
  • Atomicity: Locks, transactions
  • Ordering: Barriers, fork/join, blocking on a queue, etc.

• Atomicity constraints are more common than ordering constraints

• Difficult to infer ordering constraints
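The difference between the two kinds of constraint can be illustrated with a minimal Java sketch. The `Account` and `AtomicityVsOrdering` classes are hypothetical and do not come from the talk. The lock inside `withdraw` expresses an atomicity constraint (the check and the update must not interleave with another withdrawal), while the `join` calls express an ordering constraint (the final read must come after all workers finish).

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class, not taken from the talk.
final class Account {
    private final ReentrantLock lock = new ReentrantLock();
    private int balance;

    Account(int initial) {
        balance = initial;
    }

    // Atomicity constraint: the balance check and the update run under one
    // lock, so no other withdrawal can interleave between them.
    boolean withdraw(int amount) {
        lock.lock();
        try {
            if (balance < amount) {
                return false;
            }
            balance -= amount;
            return true;
        } finally {
            lock.unlock();
        }
    }

    int balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }
}

final class AtomicityVsOrdering {
    static int run() {
        Account account = new Account(100);
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 50; j++) {
                    account.withdraw(1);
                }
            });
            workers[i].start();
        }
        // Ordering constraint: read the balance only after every worker is done.
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return account.balance();
    }
}
```

With four workers each attempting 50 withdrawals of 1 from a balance of 100, exactly 100 withdrawals succeed, so `run()` returns 0.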

Page 8: MAMA: Mostly Automatic Management of Atomicity

Mostly Automatic Management of Atomicity

• Toward automatically providing atomicity for parallel programs

• Program either executes atomically – or deadlocks

• Protect every shared variable with its own lock

• Restore progress and performance when necessary (with help from the programmer)

Page 9: MAMA: Mostly Automatic Management of Atomicity

Related Work

• Automatic Parallelization
  • [Bernstein, IEEE Transactions 1966]
  • …

• Data Centric Synchronization
  • [Vaziri et al., POPL 2006]
  • [Ceze et al., HPCA 2007]

• Transactional Memory
  • [Herlihy and Moss, ISCA 1993]
  • …

Page 10: MAMA: Mostly Automatic Management of Atomicity

Lock-Based Atomic Sections

• What lock do we acquire?

• When do we acquire the lock?

• When should we release the lock?

Page 11: MAMA: Mostly Automatic Management of Atomicity

What lock do we acquire?

• Associate a lock with each variable

• Trade-off between parallelism and overhead

• Coarse-grained vs. Fine-grained
  • Coarse-grained: 1 lock per object, 1 lock per array
  • Fine-grained: 1 lock per field, 1 lock per array element

• Mutex vs. Reader-writer lock
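As a rough illustration of the fine-grained, reader-writer end of this trade-off, the Java sketch below gives each field its own `ReentrantReadWriteLock`. The `FineGrainedPoint` class and its `probeWhileReadingX` helper are hypothetical names introduced here, not part of the talk. The probe shows that a second reader of the same field is admitted, a writer of that field is not, and a writer of a different field is unaffected.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical class used only for illustration.
final class FineGrainedPoint {
    // Fine-grained: one reader-writer lock per field instead of one per object.
    final ReentrantReadWriteLock xLock = new ReentrantReadWriteLock();
    final ReentrantReadWriteLock yLock = new ReentrantReadWriteLock();
    private int x;
    private int y;

    int readX() {
        xLock.readLock().lock();
        try {
            return x;
        } finally {
            xLock.readLock().unlock();
        }
    }

    void writeY(int value) {
        yLock.writeLock().lock();
        try {
            y = value;
        } finally {
            yLock.writeLock().unlock();
        }
    }

    // Tries to take a lock without blocking and releases it again on success.
    private static boolean tryOnce(Lock lock) {
        boolean acquired = lock.tryLock();
        if (acquired) {
            lock.unlock();
        }
        return acquired;
    }

    // While the calling thread holds the read lock on x, a second thread probes:
    // [0] read lock on x, [1] write lock on x, [2] write lock on y.
    static boolean[] probeWhileReadingX() {
        FineGrainedPoint p = new FineGrainedPoint();
        boolean[] result = new boolean[3];
        p.xLock.readLock().lock();
        try {
            Thread other = new Thread(() -> {
                result[0] = tryOnce(p.xLock.readLock());
                result[1] = tryOnce(p.xLock.writeLock());
                result[2] = tryOnce(p.yLock.writeLock());
            });
            other.start();
            try {
                other.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        } finally {
            p.xLock.readLock().unlock();
        }
        return result;
    }
}
```

A single coarse-grained lock per object would have blocked the write to `y` as well; the price of the finer granularity is one lock object per field.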

Page 12: MAMA: Mostly Automatic Management of Atomicity

MAMA Prototype

• Uses fine-grained locking
  • More parallelism
  • Especially for arrays
  • Optimization: Divide arrays into N chunks, 1 lock per chunk

• Uses reader-writer locks
  • More parallelism
  • Read sharing is common
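The array chunking optimization can be sketched as a mapping from element index to lock. This is only an assumed layout: the `ChunkedArrayLocks` class, its method names, and the choice of equal-sized contiguous chunks are illustrative and are not described in the slides.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper; the prototype's real data structure is not shown in the talk.
final class ChunkedArrayLocks {
    private final ReentrantReadWriteLock[] locks;
    private final int arrayLength;
    private final int chunkSize;

    ChunkedArrayLocks(int arrayLength, int numChunks) {
        if (arrayLength <= 0 || numChunks <= 0) {
            throw new IllegalArgumentException("lengths must be positive");
        }
        int n = Math.min(numChunks, arrayLength);
        this.arrayLength = arrayLength;
        // Ceiling division so every index maps to a chunk below n.
        this.chunkSize = (arrayLength + n - 1) / n;
        this.locks = new ReentrantReadWriteLock[n];
        for (int i = 0; i < n; i++) {
            locks[i] = new ReentrantReadWriteLock();
        }
    }

    // Contiguous elements share a chunk, and therefore a lock.
    int chunkOf(int index) {
        if (index < 0 || index >= arrayLength) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return index / chunkSize;
    }

    ReentrantReadWriteLock lockFor(int index) {
        return locks[chunkOf(index)];
    }

    int lockCount() {
        return locks.length;
    }
}
```

For a 10-element array split into 4 chunks, the chunk size is 3, so indices 0 to 2 share one lock and index 9 falls in the last chunk. Fewer chunks mean less memory for locks but more false contention between threads touching neighbouring elements.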

Page 13: MAMA: Mostly Automatic Management of Atomicity

Lock-Based Atomic Sections

• What lock do we acquire?
  • One reader-writer lock per variable (fine-grained)

• When do we acquire the lock?
  • Acquire before the first dynamic access

• When should we release the lock?

Page 14: MAMA: Mostly Automatic Management of Atomicity

When should we release the lock?

• Simple case: After the owning thread has exited

[Figure: timeline for threads T1 and T2. T1: Write A, Exit. T2: Write A, shown once T1 has exited.]

Page 15: MAMA: Mostly Automatic Management of Atomicity

When should we release the lock?

• When the owning thread is waiting for another thread to make progress (e.g. join, barrier)

[Figure: timeline for threads T1 and T2 with the events Spawn T2, Write A, Join T2 and Joined on T1, and Write A, Exit on T2.]

Page 16: MAMA: Mostly Automatic Management of Atomicity

When should we release the lock?

• Other deadlocks cannot be safely broken

• Need help from the programmer
  • Trusted annotations to sanction breaking a deadlock
  • MAMA_release(object)
  • Also used to improve performance when threads are over-serialized

[Figure: timeline for threads T1 and T2 with Write A and Write B events on both threads.]

Page 17: MAMA: Mostly Automatic Management of Atomicity

Lock-Based Atomic Sections

• What lock do we acquire?
  • One reader-writer lock per variable (fine-grained)

• When do we acquire the lock?
  • Acquire before the first dynamic access

• When should we release the lock?
  • At thread exit
  • When waiting for another thread to make progress
  • Or, at programmer-sanctioned program points
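The acquire and release policy summarized above can be sketched in a few lines of Java. Everything in this block is an assumption made for illustration: the `VariableLocks` class and its method names are hypothetical (they are not the prototype's API or the signature of `MAMA_release`), plain mutexes stand in for reader-writer locks to keep the sketch short, and an object reference stands in for a shared variable.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of the policy; not the prototype's implementation.
final class VariableLocks {
    // One lock per shared variable, keyed by the object standing for it.
    private final Map<Object, ReentrantLock> locks = new IdentityHashMap<>();

    private synchronized ReentrantLock lockFor(Object variable) {
        return locks.computeIfAbsent(variable, v -> new ReentrantLock());
    }

    // Called before every access; only a thread's first access acquires.
    void beforeAccess(Object variable) {
        ReentrantLock lock = lockFor(variable);
        if (!lock.isHeldByCurrentThread()) {
            lock.lock(); // blocks while another thread owns the variable
        }
    }

    // Programmer-sanctioned release point for a single variable.
    void release(Object variable) {
        ReentrantLock lock = lockFor(variable);
        if (lock.isHeldByCurrentThread()) {
            lock.unlock();
        }
    }

    // Thread exit: give up every variable the current thread still owns.
    synchronized void releaseAll() {
        for (ReentrantLock lock : locks.values()) {
            if (lock.isHeldByCurrentThread()) {
                lock.unlock();
            }
        }
    }

    // Waiting on another thread: release first so that thread can make progress.
    void joinReleasingLocks(Thread other) {
        releaseAll();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    boolean isOwnedByCurrentThread(Object variable) {
        return lockFor(variable).isHeldByCurrentThread();
    }
}
```

Because a lock is held from the first access until one of the three release points, everything a thread does to a variable in between appears atomic to other threads. The cost is that a thread which never reaches a release point serializes everyone waiting on its variables.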

Page 18: MAMA: Mostly Automatic Management of Atomicity

What can deadlocks tell us?

• When a thread cannot acquire a lock:
  • Perform distributed deadlock detection [Bracha and Toueg, Distributed Computing 1987]

void f(){ A = 1; B = 2; }

void g(){ B = 1; A = 2; }

[Figure: threads T1 and T2 connected by edges labeled Write B and Write A.]
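The deadlock in the `f`/`g` example can be recognized as a cycle in a wait-for graph. The sketch below is a deliberately simplified, centralized stand-in and not the distributed algorithm cited on the slide. It assumes each blocked thread waits on exactly one lock with exactly one owner, and the `WaitForGraph` class and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, centralized stand-in for deadlock detection.
final class WaitForGraph {
    // waitsFor.get(t) is the thread that owns the lock t is blocked on.
    private final Map<String, String> waitsFor = new HashMap<>();

    void blockedOn(String waiter, String owner) {
        waitsFor.put(waiter, owner);
    }

    void unblocked(String waiter) {
        waitsFor.remove(waiter);
    }

    // Follows wait-for edges from start; reports a deadlock if the walk
    // comes back to start. The step bound stops walks that enter a cycle
    // not containing start.
    boolean isDeadlocked(String start) {
        String current = waitsFor.get(start);
        int steps = 0;
        while (current != null && steps <= waitsFor.size()) {
            if (current.equals(start)) {
                return true;
            }
            current = waitsFor.get(current);
            steps++;
        }
        return false;
    }
}
```

In the example, T1 runs `f` and owns A while T2 runs `g` and owns B. Once T1 blocks on B and T2 blocks on A, the graph contains the cycle T1 → T2 → T1, so `isDeadlocked("T1")` returns true.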

Page 19: MAMA: Mostly Automatic Management of Atomicity

MAMA Prototype

• Implemented as a RoadRunner tool [Flanagan and Freund, PASTE 2010]

• Dynamic instrumentation for Java bytecode

• Evaluated on the Java Grande benchmarks and selected DaCapo benchmarks

• Running on one socket (8 cores) of a 4-socket Nehalem system with 128 GB RAM

• Removed all synchronized blocks and java.util.concurrent constructs from benchmarks
  • Ensures that MAMA is providing all of the atomicity

Page 20: MAMA: Mostly Automatic Management of Atomicity

Evaluating MAMA

• Can we execute parallel programs correctly?

• How many annotations need to be added for progress and performance?

• How is the performance of the program affected?
  • Does MAMA permit threads to execute in parallel?

Page 21: MAMA: Mostly Automatic Management of Atomicity

Annotation Burden

Benchmark     Lines of Code   Progress Annotations   Performance Annotations
crypt         314             0                      0
lufact        461             1                      4
lusearch      124105          0                      4
matmult       187             0                      0
moldyn        487             3                      0
montecarlo    1165            0                      28
pmd           60062           0                      4
series        180             0                      0
sor           186             1                      0
sunflow       21970           1                      3
xalan         172300          0                      0

Page 22: MAMA: Mostly Automatic Management of Atomicity

Performance

• MAMA incurs overhead due to locking and serial execution
  • But, MAMA still allows some parallel execution as compared to serialization

[Figure: performance chart; one data label reads 23x.]

Page 23: MAMA: Mostly Automatic Management of Atomicity

Performance Breakdown

• Many benchmarks have significant portions that run in parallel
  • Checking whether or not a lock is already owned incurs significant overhead on some benchmarks

Page 24: MAMA: Mostly Automatic Management of Atomicity

Memory Usage

• Fine-grained locking incurs significant memory overheads
  • Could be optimized to save space via chunking arrays or decreasing the size of the lock

Page 25: MAMA: Mostly Automatic Management of Atomicity

Future Directions

• Does this approach apply to other languages?

• How do we test programs running with MAMA?
  • Find uncommon deadlocks
  • Gain more confidence in trusted annotations

• How can we reduce the performance overheads?

• How can we infer ordering constraints?

Page 26: MAMA: Mostly Automatic Management of Atomicity

MAMA

• Provides atomicity for parallel programs
  • Some help via annotations from programmer

• A step toward programming without worrying about atomicity

• Programmer expresses parallelism

• Machine provides atomicity automatically

Page 27: MAMA: Mostly Automatic Management of Atomicity

Thank you for listening!