transactionalmemory - university of virginia school of...
TRANSCRIPT
![Page 1: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/1.jpg)
Transactional Memory
1
![Page 2: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/2.jpg)
To read more…
This day’s papers:Herlihy and Moss, “Transactional Memory: Architectural Support forLock-Free Data Structures”McKenney et al, “Why The Grass May Not Be Greener On The Other Side:A Comparison of Locking vs. Transactional Memory”
Supplementary readings:extended tech report version of Herlihy and Moss: http://www.hpl.hp.com/techreports/Compaq-DEC/CRL-92-7.pdf(includes more details generally, including extension to directory-basedprotocols)
1
![Page 3: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/3.jpg)
Homework 2 questions?
2
![Page 4: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/4.jpg)
From the paper reviews
Herlihy: benchmarks seemed very biased againstlocks
McKenney: where is quantitative data?
Can/How can locks and TM coexist?
Real-world implementations?
I/O, etc.
3
![Page 5: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/5.jpg)
Herlihy benchmarks
very short critical sections
lots of contention
comparing against coarse-grained locking
didn’t test priority inversion, etc. (motivations?)
4
![Page 6: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/6.jpg)
Locks versus Transactions
McKenney, Table 1 5
![Page 7: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/7.jpg)
Locks versus Transactions [top]
McKenney, Table 1 (top) 6
![Page 8: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/8.jpg)
Locks versus Transactions [bottom]
McKenney, Table 1 (bottom) 7
![Page 9: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/9.jpg)
Transaction properties
serializable — apparently one at a time
atomic — commits or aborts, nothing in between
8
![Page 10: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/10.jpg)
Basic Herlihey and Moss interface
LT — load value as part of transaction
ST — store value as part of transaction
COMMIT — try to make changes
Commit semantics:
caller must retry transaction if it fails
aborts instead if conflicting changes happened toread or written values
9
![Page 11: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/11.jpg)
Weird Herlihey and Moss operation
VALIDATE — is transaction likely to commit?
Is this necessary?
10
![Page 12: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/12.jpg)
Extra Herlihey and Moss operations
I think these all just optimizations…
LTX — load with hint that we will write
ABORT — give up on transaction
11
![Page 13: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/13.jpg)
the transaction cache
CPU
normal cache
address transaction tag MESI state value1234 discard on commit Modified 1001234 discard on abort Exclusive 1015678 discard on commit Shared 1505678 discard on abort Shared 150… … … …
transaction cache bus
12
![Page 14: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/14.jpg)
the transcation cache
Extra cache — why?additional logic for transaction commit/abortfully-associativive — conflicts are worse than usual
Also acts as normal cache — analogy to Jouppi’svictim cache
… but only stores things that were part of transactions
13
![Page 15: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/15.jpg)
transcation cache tagsNormal not part of pending transaction
Discard on Commit pre-transaction version
Discard on Abort transaction modified verison
Invalid
14
![Page 16: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/16.jpg)
transcation cache
has transaction tags and MESI states!
during transaction — two copies of valuesbefore and after transaction versionmight have the only copy of both!
after transaction — acts like normal cache“normal” tag represents normally cached valuesalso “discard on commit” if transcation cannot commit
15
![Page 17: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/17.jpg)
TSTATUS
flag: Can we commit?
If true, COMMIT will commit transaction
If false:
LT/LTX (reads) return “arbitrary value”
ST (writes) are discarded
transaction can never commit
16
![Page 18: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/18.jpg)
aborting a transaction
CPU1 CPU2 MEM1address tag state0x100 Discard on Abort Modified0x100 Discard on Commit Exclusive0x101 Discard on Abort Shared0x101 Discard on Commit Shared
CPU2: read for transaction 0x100CPU1: it’s busy!
BUSY — CPU2 aborts transaction
CPU2: read-to-own for transaction 0x101CPU1: it’s busy!
BUSY — CPU2 aborts transaction
17
![Page 19: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/19.jpg)
aborting a transaction
CPU1 CPU2 MEM1address tag state0x100 Discard on Abort Modified0x100 Discard on Commit Exclusive0x101 Discard on Abort Shared0x101 Discard on Commit Shared
CPU2: read for transaction 0x100CPU1: it’s busy!
BUSY — CPU2 aborts transaction
CPU2: read-to-own for transaction 0x101CPU1: it’s busy!
BUSY — CPU2 aborts transaction
17
![Page 20: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/20.jpg)
aborting a transaction
CPU1 CPU2 MEM1address tag state0x100 Discard on Abort Modified0x100 Discard on Commit Exclusive0x101 Discard on Abort Shared0x101 Discard on Commit Shared
CPU2: read for transaction 0x100CPU1: it’s busy!
BUSY — CPU2 aborts transaction
CPU2: read-to-own for transaction 0x101CPU1: it’s busy!
BUSY — CPU2 aborts transaction
17
![Page 21: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/21.jpg)
aborting a transaction (text)
bus read-for-ownership returns BUSYother transaction LT/LTX/ST same valueother transaction might not commit
bus read (non-exclusive) returns BUSYother transaction LTX/ST same valueother transactoin might not commit
18
![Page 22: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/22.jpg)
VALIDATE
weird things happen during aborted transaction
VALIDATE tells us if this happened
needed to, e.g., not access invalid pointer:
19
![Page 23: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/23.jpg)
COMMIT and ABORT
local operations
cache checks “can I commit” flag
changes tags of transaction cache entries only
20
![Page 24: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/24.jpg)
no gaurentee of progressThread 1 Thread 2 Thread 3t1 = LTX(a) t2 = LTX(b) t3 = LTX(c)ST(b, t1)aborts, restartst1 = LTX(a)
ST(c, t2)aborts, restarts
ST(a, t3)aborts, restarts
t2 = LTX(b) t3 = LTX(c)
21
![Page 25: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/25.jpg)
transaction and non-transaction
“For brevity, we have chosen not to specify howtranscational and non-transactional operationsinteract when applied concurrently to the samelocation”
22
![Page 26: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/26.jpg)
costs of transaction support
extra fully associative cachealternative: extra state bits on existing cache… but what about conflicts?… how much extra state??
larger transcations: bigger extra cache/state
23
![Page 27: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/27.jpg)
transaction overflow: one idea
0419480x 27
1 1 1 1 0 1 0 1 …global mask
if 0: exception!Exception handler:Acquire lock for index 0x04 (or ABORT)Record new/old value in local memoryUpdate value, release lock on COMMIT/ABORTReturn from exception
24
![Page 28: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/28.jpg)
costs of transaction conflict
25
![Page 29: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/29.jpg)
costs of transaction conflict
extra work — bus traffic reading/invalidating
extra work — time to abort
locks would delay instead
26
![Page 30: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/30.jpg)
transaction/lock iteraction option
non-transaction reads/writes abort transaction
… if transcation is also writing/reading it
… including to locks
27
![Page 31: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/31.jpg)
real transcations
Intel TSX (recent Intel x86 chips):Restricted Transactional Memory (RTM)Hardware Lock Ellision (HLE)
IBM POWER8+
IBM System z (successor to S/370 — mainframes)
28
![Page 32: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/32.jpg)
Restricted Transactional Memory
Intel real transactional memory suppport:
XBEGIN abortDest, XEND — mark transaction
XABORT — explicit abort
jump to abortDest if aborted (no validate)
abort discards all memory and register changes
size limits, I/O? transaction may always abort
29
![Page 33: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/33.jpg)
Intel Hardware Lock Ellision
transactions for spin-locks only
XACQUIRE, XRELEASE — mark critical section
starts transaction reading lock only
ensure conflict with anything using lock normally
if aborted — run without transaction (modify lock)
backwards compatible!
30
![Page 34: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/34.jpg)
Intel TSX Oops
31
![Page 35: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/35.jpg)
Other HTM implementations
generally require software fallback code using locks
common case — lock ellision
IBM POWER8 — transaction suspend/resumeallow system calls/page faults/debugging duringtransactioncontext switch/etc.? transaction aborts on resumealso assists software speculation
32
![Page 36: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/36.jpg)
HTM limits
Intel Haswell4 MB read set22 KB write set
IBM POWER88 KB read set8 KB write set
Nakaike et al, “Quantitative Comparison of Hardware Transactional Memoryfor Blue Gene/Q, zEnterprise EC12, Intel Core, and POWER8”, ISCA’15 33
![Page 37: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/37.jpg)
Next time: Cray-1 and GPUs
Cray-1 — vector processor
very wide registers
designed to optimize loops
programmable GPUs
prereq. to CUDA/etc. (next week)
designed to produce graphics
34
![Page 38: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/38.jpg)
Graphics pipeline
part 1: list of triangles (vertices)figure out color/lightingadjust screen coordinatescompute depth (to hide if object is in front)
part 2: fill triangles (fragment)compute pixels of triangletrack depth of each pixel, replace only if closerbased on settings of vertices (corners)
35
![Page 39: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/39.jpg)
A User-Programmable VertexEngine
Programmable vertex manipulation only
Seperate, very limited functionality fills in pixelscalled fragment operations
… but based on colors, coordinates, etc. set by code
36
![Page 40: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/40.jpg)
On Cray-1
paper spends a time on exchange registers, etc.
old alternative to virtual memory
not important for us
37
![Page 41: TransactionalMemory - University of Virginia School of ...cr4bd/6354/F2016/slides/lec17-slides-1up.pdfthetransactioncache CPU normalcache addresstransactiontag MESIstatevalue 1234](https://reader034.vdocument.in/reader034/viewer/2022050423/5f9297d69bbc8476d409f050/html5/thumbnails/41.jpg)
Logistics: Homework 3 Accounts?
38