transactionalmemory - university of virginia school of...

41
Transactional Memory 1

Upload: others

Post on 05-Aug-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Transactional Memory

1

Page 2: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

To read more…

This day’s papers:Herlihy and Moss, “Transactional Memory: Architectural Support forLock-Free Data Structures”McKenney et al, “Why The Grass May Not Be Greener On The Other Side:A Comparison of Locking vs. Transactional Memory”

Supplementary readings:extended tech report version of Herlihy and Moss: http://www.hpl.hp.com/techreports/Compaq-DEC/CRL-92-7.pdf(includes more details generally, including extension to directory-basedprotocols)

1

Page 3: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Homework 2 questions?

2

Page 4: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

From the paper reviews

Herlihy: benchmarks seemed very biased againstlocks

McKenney: where is quantitative data?

Can/How can locks and TM coexist?

Real-world implementations?

I/O, etc.

3

Page 5: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Herlihy benchmarks

very short critical sections

lots of contention

comparing against coarse-grained locking

didn’t test priority inversion, etc. (motivations?)

4

Page 6: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Locks versus Transactions

McKenney, Table 1 5

Page 7: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Locks versus Transactions [top]

McKenney, Table 1 (top) 6

Page 8: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Locks versus Transactions [bottom]

McKenney, Table 1 (bottom) 7

Page 9: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Transaction properties

serializable — apparently one at a time

atomic — commits or aborts, nothing in between

8

Page 10: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Basic Herlihey and Moss interface

LT — load value as part of transaction

ST — store value as part of transaction

COMMIT — try to make changes

Commit semantics:

caller must retry transaction if it fails

aborts instead if conflicting changes happened toread or written values

9

Page 11: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Weird Herlihey and Moss operation

VALIDATE — is transaction likely to commit?

Is this necessary?

10

Page 12: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Extra Herlihey and Moss operations

I think these all just optimizations…

LTX — load with hint that we will write

ABORT — give up on transaction

11

Page 13: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

the transaction cache

CPU

normal cache

address transaction tag MESI state value1234 discard on commit Modified 1001234 discard on abort Exclusive 1015678 discard on commit Shared 1505678 discard on abort Shared 150… … … …

transaction cache bus

12

Page 14: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

the transcation cache

Extra cache — why?additional logic for transaction commit/abortfully-associativive — conflicts are worse than usual

Also acts as normal cache — analogy to Jouppi’svictim cache

… but only stores things that were part of transactions

13

Page 15: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

transcation cache tagsNormal not part of pending transaction

Discard on Commit pre-transaction version

Discard on Abort transaction modified verison

Invalid

14

Page 16: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

transcation cache

has transaction tags and MESI states!

during transaction — two copies of valuesbefore and after transaction versionmight have the only copy of both!

after transaction — acts like normal cache“normal” tag represents normally cached valuesalso “discard on commit” if transcation cannot commit

15

Page 17: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

TSTATUS

flag: Can we commit?

If true, COMMIT will commit transaction

If false:

LT/LTX (reads) return “arbitrary value”

ST (writes) are discarded

transaction can never commit

16

Page 18: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

aborting a transaction

CPU1 CPU2 MEM1address tag state0x100 Discard on Abort Modified0x100 Discard on Commit Exclusive0x101 Discard on Abort Shared0x101 Discard on Commit Shared

CPU2: read for transaction 0x100CPU1: it’s busy!

BUSY — CPU2 aborts transaction

CPU2: read-to-own for transaction 0x101CPU1: it’s busy!

BUSY — CPU2 aborts transaction

17

Page 19: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

aborting a transaction

CPU1 CPU2 MEM1address tag state0x100 Discard on Abort Modified0x100 Discard on Commit Exclusive0x101 Discard on Abort Shared0x101 Discard on Commit Shared

CPU2: read for transaction 0x100CPU1: it’s busy!

BUSY — CPU2 aborts transaction

CPU2: read-to-own for transaction 0x101CPU1: it’s busy!

BUSY — CPU2 aborts transaction

17

Page 20: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

aborting a transaction

CPU1 CPU2 MEM1address tag state0x100 Discard on Abort Modified0x100 Discard on Commit Exclusive0x101 Discard on Abort Shared0x101 Discard on Commit Shared

CPU2: read for transaction 0x100CPU1: it’s busy!

BUSY — CPU2 aborts transaction

CPU2: read-to-own for transaction 0x101CPU1: it’s busy!

BUSY — CPU2 aborts transaction

17

Page 21: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

aborting a transaction (text)

bus read-for-ownership returns BUSYother transaction LT/LTX/ST same valueother transaction might not commit

bus read (non-exclusive) returns BUSYother transaction LTX/ST same valueother transactoin might not commit

18

Page 22: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

VALIDATE

weird things happen during aborted transaction

VALIDATE tells us if this happened

needed to, e.g., not access invalid pointer:

19

Page 23: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

COMMIT and ABORT

local operations

cache checks “can I commit” flag

changes tags of transaction cache entries only

20

Page 24: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

no gaurentee of progressThread 1 Thread 2 Thread 3t1 = LTX(a) t2 = LTX(b) t3 = LTX(c)ST(b, t1)aborts, restartst1 = LTX(a)

ST(c, t2)aborts, restarts

ST(a, t3)aborts, restarts

t2 = LTX(b) t3 = LTX(c)

21

Page 25: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

transaction and non-transaction

“For brevity, we have chosen not to specify howtranscational and non-transactional operationsinteract when applied concurrently to the samelocation”

22

Page 26: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

costs of transaction support

extra fully associative cachealternative: extra state bits on existing cache… but what about conflicts?… how much extra state??

larger transcations: bigger extra cache/state

23

Page 27: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

transaction overflow: one idea

0419480x 27

1 1 1 1 0 1 0 1 …global mask

if 0: exception!Exception handler:Acquire lock for index 0x04 (or ABORT)Record new/old value in local memoryUpdate value, release lock on COMMIT/ABORTReturn from exception

24

Page 28: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

costs of transaction conflict

25

Page 29: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

costs of transaction conflict

extra work — bus traffic reading/invalidating

extra work — time to abort

locks would delay instead

26

Page 30: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

transaction/lock iteraction option

non-transaction reads/writes abort transaction

… if transcation is also writing/reading it

… including to locks

27

Page 31: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

real transcations

Intel TSX (recent Intel x86 chips):Restricted Transactional Memory (RTM)Hardware Lock Ellision (HLE)

IBM POWER8+

IBM System z (successor to S/370 — mainframes)

28

Page 32: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Restricted Transactional Memory

Intel real transactional memory suppport:

XBEGIN abortDest, XEND — mark transaction

XABORT — explicit abort

jump to abortDest if aborted (no validate)

abort discards all memory and register changes

size limits, I/O? transaction may always abort

29

Page 33: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Intel Hardware Lock Ellision

transactions for spin-locks only

XACQUIRE, XRELEASE — mark critical section

starts transaction reading lock only

ensure conflict with anything using lock normally

if aborted — run without transaction (modify lock)

backwards compatible!

30

Page 34: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Intel TSX Oops

31

Page 35: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Other HTM implementations

generally require software fallback code using locks

common case — lock ellision

IBM POWER8 — transaction suspend/resumeallow system calls/page faults/debugging duringtransactioncontext switch/etc.? transaction aborts on resumealso assists software speculation

32

Page 36: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

HTM limits

Intel Haswell4 MB read set22 KB write set

IBM POWER88 KB read set8 KB write set

Nakaike et al, “Quantitative Comparison of Hardware Transactional Memoryfor Blue Gene/Q, zEnterprise EC12, Intel Core, and POWER8”, ISCA’15 33

Page 37: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Next time: Cray-1 and GPUs

Cray-1 — vector processor

very wide registers

designed to optimize loops

programmable GPUs

prereq. to CUDA/etc. (next week)

designed to produce graphics

34

Page 38: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Graphics pipeline

part 1: list of triangles (vertices)figure out color/lightingadjust screen coordinatescompute depth (to hide if object is in front)

part 2: fill triangles (fragment)compute pixels of triangletrack depth of each pixel, replace only if closerbased on settings of vertices (corners)

35

Page 39: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

A User-Programmable VertexEngine

Programmable vertex manipulation only

Seperate, very limited functionality fills in pixelscalled fragment operations

… but based on colors, coordinates, etc. set by code

36

Page 40: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

On Cray-1

paper spends a time on exchange registers, etc.

old alternative to virtual memory

not important for us

37

Page 41: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234

Logistics: Homework 3 Accounts?

38