
Page 1: Pay-to-use strong atomicity on conventional hardware

Pay-to-use strong atomicity on conventional hardware

Martín Abadi, Tim Harris, Mojtaba Mehrara

Microsoft Research

Page 2: Pay-to-use strong atomicity on conventional hardware

Our approach

Strong semantics
atomic, retry, ... What, ideally, should these constructs do?

Programming discipline(s)
What does it mean for a program to use the constructs correctly?

Low-level semantics & actual implementations
Transactions, optimistic concurrency, program transformations, weak memory models, ...

Page 3: Pay-to-use strong atomicity on conventional hardware

Programming disciplines

All programs

Violation-free programs

Obeying dynamic separation

Obeying static separation

More implementation flexibility

More programs correctly synchronized

• Which programs are correctly synchronized?

Page 4: Pay-to-use strong atomicity on conventional hardware

Strong atomicity

• Direct accesses work like single-access transactions
• We would like:
– Implementation flexibility; ongoing innovation in STM/hybrid techniques, optimizations, ...
    • Invisible / visible readers
    • In-place / deferred updates
    • Eager / lazy conflict detection
– No overhead on direct accesses
– Robust performance, not dependent on success of static analyses

Page 5: Pay-to-use strong atomicity on conventional hardware

Strong atomicity: implementation

[Figure: address-space diagram. Labels: physical address space; virtual address space, containing a Tx-heap and a Direct-heap; direct memory accesses; memory accesses from atomic blocks]
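The labels in this diagram suggest one physical heap that is visible through two virtual views. The following is a minimal sketch of that idea, assuming (the slide does not state it) that the Tx-heap and the Direct-heap alias the same physical pages. Two shared mappings of a temporary file stand in for the runtime's heap, and the names `make_two_views`, `direct_view` and `tx_view` are hypothetical.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE


def make_two_views(num_pages=1):
    """Map the same backing storage twice.

    Returns (backing_file, direct_view, tx_view). The backing file must stay
    open for as long as the views are in use.
    """
    backing = tempfile.TemporaryFile()
    backing.truncate(num_pages * PAGE)
    # Two independent virtual mappings of the same underlying pages.
    direct_view = mmap.mmap(backing.fileno(), num_pages * PAGE)
    tx_view = mmap.mmap(backing.fileno(), num_pages * PAGE)
    return backing, direct_view, tx_view


backing, direct_view, tx_view = make_two_views()

# A store through the "Tx-heap" view is visible through the "Direct-heap" view.
tx_view[0:4] = (42).to_bytes(4, "little")
value_seen_directly = int.from_bytes(direct_view[0:4], "little")

# The reverse direction behaves the same way.
direct_view[8:9] = b"\x07"
value_seen_by_tx = tx_view[8]
```

With two views of this kind, protection can be changed on one view without affecting accesses made through the other, which is what the following slides rely on.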

Page 6: Pay-to-use strong atomicity on conventional hardware

Writes from atomic blocks

[Figure: the address-space diagram from Page 5, with the same labels]

1. Atomic block attempts to write to a field of an object

Page 7: Pay-to-use strong atomicity on conventional hardware

Writes from atomic blocks

[Figure: the address-space diagram from Page 5, with the same labels]

2. Revoke direct access to the page holding the direct view of the object

Page 8: Pay-to-use strong atomicity on conventional hardware

Writes from atomic blocks

[Figure: the address-space diagram from Page 5, with the same labels]

3. Use underlying STM write primitives

Page 9: Pay-to-use strong atomicity on conventional hardware

Writes from atomic blocks

[Figure: the address-space diagram from Page 5, with the same labels]

4. Restore direct access once the underlying transaction has finished and an access violation (AV) occurs
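The four steps on Pages 6 to 9 can be sketched as a small single-threaded simulation. This is an illustration under stated assumptions, not the authors' runtime: pages are modeled as dictionary entries rather than real page-table protections, the STM is assumed to use deferred updates (one of the options listed on Page 4), and a direct access that faults while the transaction is still running simply raises, because the slides do not say what happens in that case. All class and method names are hypothetical.

```python
PAGE_SIZE = 4096


class AccessViolation(Exception):
    """A direct access hit a page still in use by a running transaction."""


class Transaction:
    def __init__(self):
        self.write_log = {}    # deferred updates: address -> value (assumption)
        self.finished = False

    def commit(self, heap):
        heap.memory.update(self.write_log)
        self.finished = True


class Heap:
    def __init__(self):
        self.memory = {}       # address -> value
        self.revoked = {}      # page number -> transactions that wrote to it
        self.av_count = 0      # number of simulated access violations

    def tx_write(self, tx, addr, value):
        page = addr // PAGE_SIZE                      # step 1: write from an atomic block
        self.revoked.setdefault(page, []).append(tx)  # step 2: revoke direct access
        tx.write_log[addr] = value                    # step 3: stand-in for the STM write

    def direct_read(self, addr):
        page = addr // PAGE_SIZE
        if page in self.revoked:                      # simulated AV on the direct view
            self.av_count += 1
            if any(not tx.finished for tx in self.revoked[page]):
                raise AccessViolation(hex(addr))
            del self.revoked[page]                    # step 4: restore direct access
        return self.memory.get(addr, 0)


heap = Heap()
tx = Transaction()
heap.tx_write(tx, 0x2010, 7)

try:
    heap.direct_read(0x2010)
    blocked = False
except AccessViolation:
    blocked = True             # transaction still running, access stays revoked

tx.commit(heap)
value = heap.direct_read(0x2010)   # faults once more, then access is restored
again = heap.direct_read(0x2010)   # no fault: the page is directly accessible again
```

Note that access is restored lazily: committing the transaction does not touch the page, and the protection is only lifted when a later direct access actually faults.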

Page 10: Pay-to-use strong atomicity on conventional hardware

Avoiding Access Violations

1. Safe accesses in runtime system code
– Virtual method tables and array length
– Memory allocation structures (e.g. free list)
– STM implementation structures
– GC implementation

Forward all these to TX-heap at compile time

Page 11: Pay-to-use strong atomicity on conventional hardware

Avoiding Access Violations

2. Safe accesses in normal code
– Normal writes to locations that haven’t been read or written in a TX
– Normal reads from locations that haven’t been written in a TX

Forward to TX-heap

3. Safe accesses in TX code
– TX writes to locations that haven’t been read or written outside TXs
– TX reads from locations that haven’t been written outside TXs

Avoid page-level tracking
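The two rules for normal code and for TX code are symmetric, so they can be written as one predicate. The sketch below only encodes the conditions listed on this slide; how the sets of locations are obtained (for example, by a static analysis) is outside its scope, and the function name `is_safe` and the string-named locations are hypothetical.

```python
def is_safe(kind, location, other_reads, other_writes):
    """Return True if an access cannot conflict with the other side.

    kind         -- "read" or "write"
    location     -- the location being accessed
    other_reads  -- locations read on the other side (in TXs when classifying
                    normal code, outside TXs when classifying TX code)
    other_writes -- locations written on the other side
    """
    if kind == "write":
        # A write is safe only if the other side neither reads nor writes it.
        return location not in other_reads and location not in other_writes
    if kind == "read":
        # A read is safe as long as the other side never writes it.
        return location not in other_writes
    raise ValueError(f"unknown access kind: {kind!r}")


# Hypothetical example: transactions read "a" and write "b".
tx_reads, tx_writes = {"a"}, {"b"}

normal_read_a = is_safe("read", "a", tx_reads, tx_writes)    # read/read: safe
normal_write_a = is_safe("write", "a", tx_reads, tx_writes)  # write vs. TX read: unsafe
normal_read_b = is_safe("read", "b", tx_reads, tx_writes)    # read vs. TX write: unsafe
normal_write_c = is_safe("write", "c", tx_reads, tx_writes)  # untouched by TXs: safe
```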

Page 12: Pay-to-use strong atomicity on conventional hardware

Sample Code

    private int ComputeUniqueSegments (int nthreads) {
        int numUniqueSegment = 0;
        for (int i = 0; i < nthreads; i++)
            numUniqueSegment += this.uniqueSegments[i].Count;
        return numUniqueSegment;
    }

    Genome_Sequencer_ComputeUniqueSegments::loop:
        mov  eax,dword ptr [edi+0x20]         // Load uniqueSegments array reference
        cmp  ebx,dword ptr [eax+0x4]          // Check reference with array bounds
        jae  outOfRange
        mov  ecx,dword ptr [eax+ebx*4+0x08]   // load array element
        mov  eax,dword ptr [ecx]              // load Count function pointer
        call dword ptr [eax+0x88]             // call Count (get) function
        add  ebp,eax                          // add it to numUniqueSegments
        add  ebx,1
        cmp  ebx,esi
        jl   loop

Access immutable runtime-system data:

        cmp  ebx,dword ptr [eax+0x40000004]   // Check reference with array bounds
        mov  eax,dword ptr [ecx+0x40000000]   // load Count function pointer
        call dword ptr [eax+0x40000088]       // call Count (get) function

Safe normal access:

        mov  ecx,dword ptr [eax+ebx*4+0x40000008]   // load array element
        mov  eax,dword ptr [edi+0x40000020]         // Load uniqueSegments array reference
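Comparing the original and rewritten instructions above, every rewritten operand differs from its original by the same constant, 0x40000000. The short sketch below checks that arithmetic. It assumes (the slide does not say so explicitly) that this constant is the distance between the direct view and the Tx-heap view of the same object; `TX_OFFSET` and `forward` are hypothetical names.

```python
TX_OFFSET = 0x40000000  # assumed distance between the two views, read off the listing


def forward(displacement):
    """Displacement of the same field when accessed through the Tx-heap view."""
    return displacement + TX_OFFSET


# Displacements used by the original loop, paired with the rewritten ones.
listing = {
    0x20: 0x40000020,  # uniqueSegments array reference
    0x04: 0x40000004,  # array length used by the bounds check
    0x08: 0x40000008,  # base of the array elements
    0x00: 0x40000000,  # first word of the object (loaded before the call)
    0x88: 0x40000088,  # slot used by the indirect call
}

all_match = all(forward(orig) == rewritten for orig, rewritten in listing.items())
```

Because the offset is folded into the instruction's displacement, a forwarded access costs no extra instructions compared with the original one.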

Page 13: Pay-to-use strong atomicity on conventional hardware

Exploiting Safe Accesses

• Implemented by extending Steensgaard’s points-to analysis
• Only safe accesses from normal code were beneficial
• Little benefit from identifying safe accesses from inside atomic blocks.

#page-table changes:

            Genome   Delaunay   Labyrinth   Vacation
    Before  31 K     43         147         41 K
    After   31 K     39         36          38 K
    Ratio   99%      90%        36%         92%
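For readers unfamiliar with the points-to analysis named on this slide, the following is a minimal textbook-style sketch of unification-based points-to analysis. It is not the authors' extension: it handles only the statements `p = &x` and `p = q`, ignores loads, stores and types, and all names are hypothetical. Each equivalence class of variables has at most one class it points to, and assignments merge classes with union-find.

```python
class PointsTo:
    """Unification-based points-to classes over variable names."""

    def __init__(self):
        self.parent = {}    # union-find parent links
        self.pointee = {}   # class representative -> a node of the pointed-to class
        self.fresh = 0      # counter for placeholder locations

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def target(self, x):
        """Representative of the class x points to (created on demand)."""
        rep = self.find(x)
        if rep not in self.pointee:
            self.fresh += 1
            self.pointee[rep] = f"$loc{self.fresh}"
        return self.find(self.pointee[rep])

    def join(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[b] = a
        pa, pb = self.pointee.get(a), self.pointee.pop(b, None)
        if pa is None and pb is not None:
            self.pointee[a] = pb
        elif pa is not None and pb is not None:
            self.join(pa, pb)   # keep a single target class per pointer class

    def address_of(self, p, x):
        """Model the statement p = &x."""
        self.join(self.target(p), x)

    def copy(self, p, q):
        """Model the statement p = q."""
        self.join(self.target(p), self.target(q))

    def may_alias(self, p, q):
        return self.target(p) == self.target(q)


pt = PointsTo()
pt.address_of("p", "x")   # p = &x
pt.address_of("q", "y")   # q = &y
pt.copy("r", "p")         # r = p

p_q_before = pt.may_alias("p", "q")   # separate classes so far
r_p = pt.may_alias("r", "p")          # r and p share a target class

pt.copy("p", "q")                     # p = q merges the classes of x and y
p_q_after = pt.may_alias("p", "q")
```

In the setting of this talk, such classes could be used to argue that an access through a pointer in normal code can never reach data touched inside an atomic block, which is the kind of safe access described on Page 11.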

Page 14: Pay-to-use strong atomicity on conventional hardware

Patching access violations

• Patch sites of AVs
• Our heuristic:
– Patch on first AV
– Also change page protection as normal
• Future work:
– Remove patches if they become unnecessary
– Make multiple patches to bound worst-case perf
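The heuristic above can be sketched as a small simulation. It assumes, since the slide does not spell it out, that a patched site is rewritten so that it no longer goes through the protected direct view and therefore cannot fault again, and that "change page protection as normal" means the AV handler also restores direct access to the page. `PatchingRuntime` and its members are hypothetical names.

```python
class PatchingRuntime:
    def __init__(self):
        self.protected_pages = set()  # pages whose direct view is revoked
        self.patched_sites = set()    # code sites that have been patched
        self.av_count = 0             # access violations taken so far

    def access(self, site, page):
        """Perform a direct access from code site `site` to `page`."""
        if site in self.patched_sites:
            return "patched"                     # assumed not to fault any more
        if page in self.protected_pages:
            self.av_count += 1
            self.patched_sites.add(site)         # patch on first AV
            self.protected_pages.discard(page)   # also restore protection as normal
            return "faulted"
        return "direct"


rt = PatchingRuntime()

rt.protected_pages.add(3)
first = rt.access("site_1", 3)    # faults once and gets patched

rt.protected_pages.add(3)         # a later transaction revokes the page again
second = rt.access("site_1", 3)   # patched site: no second AV
other = rt.access("site_2", 3)    # a different, unpatched site still faults
after = rt.access("site_2", 7)    # but it is now patched as well
```

Each site pays for at most one AV under this heuristic, while sites that never fault keep running unmodified.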

Page 15: Pay-to-use strong atomicity on conventional hardware

Results - Vacation

[Figure: bar chart of execution time (s), y-axis from 0 to 10. Bars: WA; SA, conservative; + analysis; SA, handle AVs; + analysis; SA, patch AVs; + analysis]

Page 16: Pay-to-use strong atomicity on conventional hardware

Results - Delaunay

[Figure: bar chart of execution time (s), y-axis from 0 to 7. Bars: WA; SA, conservative; + analysis; SA, handle AVs; + analysis; SA, patch AVs; + analysis]

Page 17: Pay-to-use strong atomicity on conventional hardware

Results - Genome

[Figure: bar chart of execution time (s), y-axis from 0 to 3. Bars: WA; SA, conservative; + analysis; SA, handle AVs; + analysis; SA, patch AVs; + analysis]

Page 18: Pay-to-use strong atomicity on conventional hardware

Results - Labyrinth

[Figure: bar chart of execution time (s), y-axis from 7.8 to 9.2. Bars: WA; SA, conservative; + analysis; SA, handle AVs; + analysis; SA, patch AVs; + analysis]

Page 19: Pay-to-use strong atomicity on conventional hardware

Scaling

[Figure: four panels (Labyrinth, Vacation, Delaunay, Genome), each plotting normalized execution time (y-axis from 0 to 1.2) against #Threads (1 to 8). Series: SA - patch AV + analysis; WA]

Page 20: Pay-to-use strong atomicity on conventional hardware

Conclusion

• Weak atomicity is an obstacle in providing clear semantics for TM models
• We use conventional memory protection hardware to provide strong atomicity
• This comes at a low performance cost… high runtime complexity cost
• Performance hit can be lowered by compile time analysis