Post on 19-Dec-2015

Transactional Memory

Yujia Jin

Lock and Problems

• Locks are commonly used to protect shared data
• Priority inversion
– A lower-priority process holds a lock needed by a higher-priority process
• Convoy effect
– When the lock holder is interrupted, the other processes are forced to wait
• Deadlock
– Circular dependence between processes acquiring locks, so every process waits for a lock forever

Lock-free

• A shared data structure is lock-free if its operations do not require mutual exclusion
– Multiple processes are not prevented from operating on the same object
+ Avoids the lock problems
- Existing lock-free techniques are implemented in software and do not perform well against their lock-based counterparts

Transactional Memory

• Uses transaction-style operations to operate on lock-free data
• Allows the user to define customized read-modify-write operations on multiple, independent words
• Easy to support in hardware: straightforward extensions to conventional multiprocessor caches

Transaction Style

• A finite sequence of machine instructions with
– A sequence of reads,
– Computation,
– A sequence of writes, and
– Commit
• Formal properties
– Atomicity, serializability (~ACID)

Access Instructions

• Load-transactional (LT)
– Reads from shared memory into a private register
• Load-transactional-exclusive (LTX)
– Same as LT, plus a hint that a write to the location is coming up
• Store-transactional (ST)
– Tentatively writes from a private register to shared memory; the new value is not visible to other processors until commit

State Instructions

• Commit
– Tries to make the tentative writes permanent
– Succeeds if no other processor has updated its read set or write set, or read its write set
– On failure, discards all updates to the write set
– Returns whether it succeeded
• Abort
– Discards all updates to the write set
• Validate
– Returns the current transaction status
– If the status is false (the transaction has aborted), discards all updates to the write set

Typical Transaction

/* keep trying */
while ( true ) {
    /* read variables */
    v1 = LT ( V1 ); …; vn = LT ( Vn );
    /* check consistency */
    if ( ! VALIDATE () ) continue;
    /* compute new values */
    compute ( v1, … , vn );
    /* write tentative values */
    ST ( v1, V1 ); …; ST ( vn, Vn );
    /* try to commit */
    if ( COMMIT () ) return result;
    else backoff;
}
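The retry loop can be tried out in software. The sketch below is a single-threaded Python model, not the hardware mechanism: `Memory`, `Transaction`, and `transfer` are hypothetical names, and conflicts are detected with per-word version counters (an assumption made for illustration) rather than by the cache protocol. Backoff is omitted because nothing runs concurrently here.

```python
class Memory:
    """Shared words, each with a version that is bumped on every committed write."""

    def __init__(self, **words):
        self.values = dict(words)
        self.versions = {name: 0 for name in words}


class Transaction:
    """Software stand-in for LT / ST / VALIDATE / COMMIT (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory
        self.seen = {}    # word -> version observed on first access (data set)
        self.writes = {}  # word -> tentative value (write set)

    def lt(self, name):
        """LT: read a word, preferring this transaction's own tentative value."""
        self.seen.setdefault(name, self.memory.versions[name])
        return self.writes.get(name, self.memory.values[name])

    def st(self, value, name):
        """ST: record a tentative write; shared memory is not touched yet."""
        self.seen.setdefault(name, self.memory.versions[name])
        self.writes[name] = value

    def validate(self):
        """VALIDATE: true while no accessed word has been overwritten by a commit."""
        return all(self.memory.versions[n] == v for n, v in self.seen.items())

    def commit(self):
        """COMMIT: publish the write set if still valid, otherwise discard it."""
        if not self.validate():
            self.writes.clear()
            return False
        for name, value in self.writes.items():
            self.memory.values[name] = value
            self.memory.versions[name] += 1
        return True


def transfer(memory, amount):
    """Move `amount` from word "a" to word "b" using the slide's retry loop."""
    while True:
        tx = Transaction(memory)
        a, b = tx.lt("a"), tx.lt("b")   # read variables
        if not tx.validate():           # check consistency
            continue
        tx.st(a - amount, "a")          # write tentative values
        tx.st(b + amount, "b")
        if tx.commit():                 # try to commit
            return a - amount, b + amount
```

Two `Transaction` objects created on the same `Memory` show the conflict case: whichever commits first succeeds, and the other then fails `validate()` and `commit()` and would go around the loop again.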

Warning…

• Not intended for database use

• Transactions are short in time

• Transactions are small in dataset

Idea Behind Implementation

• Existing cache protocols already detect accessibility conflicts
• Accessibility conflicts ~ transaction conflicts
• Can be built as an extension to cache-coherence protocols
– Includes snoopy-bus and directory protocols

Bus Snoopy Example

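[Figure: a processor connected to a regular cache and a transaction cache, both attached to the bus]

• Regular cache: 2048 8-byte lines, direct mapped
• Transaction cache: 64 8-byte lines, fully associative
• The caches are exclusive: a line resides in one or the other, not both
• The transaction cache holds tentative writes without propagating them to other processors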

Transaction Cache

• Each cache line carries a separate transactional tag in addition to its coherence-protocol tag
– Transactional tag states: EMPTY, NORMAL, XCOMMIT, XABORT
• Two entries per line accessed by a transaction
– Modifications are written to the XABORT entry, which is set to EMPTY on abort
– The XCOMMIT entry holds the original value and is set to EMPTY on commit
• Allocation policy, in decreasing order of preference
– EMPTY entries, then NORMAL entries, then XCOMMIT entries
• Must guarantee a minimum transaction size
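The allocation order can be written as a small lookup. This is a sketch only: `Tag`, `PREFERENCE`, and `pick_entry` are hypothetical names, and returning `None` when only XABORT entries remain is an assumption about what happens once a transaction outgrows the cache (the slide only lists the three preferred kinds).

```python
from enum import Enum


class Tag(Enum):
    EMPTY = "empty"
    NORMAL = "normal"
    XCOMMIT = "xcommit"
    XABORT = "xabort"


# Preference order from the slide: EMPTY first, then NORMAL, then XCOMMIT.
PREFERENCE = (Tag.EMPTY, Tag.NORMAL, Tag.XCOMMIT)


def pick_entry(tags):
    """Return the index of the entry to allocate, or None if none is eligible.

    `tags` is the list of transactional tags currently in the cache.
    XABORT entries are never chosen here (assumption, see lead-in).
    """
    for wanted in PREFERENCE:
        for index, tag in enumerate(tags):
            if tag is wanted:
                return index
    return None
```

For example, with tags `[XABORT, NORMAL, EMPTY]` the EMPTY slot at index 2 is chosen before the NORMAL one.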

Bus Actions

• T_READ and T_RFO (read for ownership) bus cycles are added for transactional requests
• A transactional request can be refused with a BUSY response
• When a BUSY response is received, the requesting transaction is aborted
– This prevents deadlock and continual mutual aborts
– Can be subject to starvation

Processor Actions

• The transaction active (TACTIVE) flag indicates whether a transaction is in progress; it is set on the first transactional operation
• The transaction status (TSTATUS) flag indicates whether the transaction is still active (true) or has aborted (false)

LT Actions

• Check for an XABORT entry; on a hit, return its value
• If there is none, check for a NORMAL entry
– Switch NORMAL to XABORT and allocate an XCOMMIT entry with the same data
• If there is none, issue T_READ on the bus, then allocate XABORT and XCOMMIT entries
• If T_READ receives BUSY, abort
– Set TSTATUS to false
– Drop all XABORT entries
– Set all XCOMMIT entries to NORMAL
– Return arbitrary data
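The LT steps map onto a short lookup routine. The following is a sketch under stated assumptions: the cache is modeled as an unbounded Python list of dictionaries (capacity and replacement are ignored), `t_read` is a hypothetical callback standing in for the T_READ bus cycle, `BUSY` is a sentinel object, and the returned boolean plays the role of TSTATUS.

```python
BUSY = object()  # sentinel standing in for a BUSY bus response


def load_transactional(cache, address, t_read):
    """Model of LT. Returns (value, ok); ok is False if the transaction aborted.

    cache  : list of {"addr", "tag", "data"} entries (hypothetical layout)
    t_read : callable(address) -> value or BUSY, modeling the T_READ cycle
    """
    def find(tag):
        for entry in cache:
            if entry["addr"] == address and entry["tag"] == tag:
                return entry
        return None

    # 1. An XABORT entry already holds this transaction's view of the line.
    hit = find("XABORT")
    if hit is not None:
        return hit["data"], True

    # 2. A NORMAL entry becomes XABORT, and an XCOMMIT copy keeps the original.
    normal = find("NORMAL")
    if normal is not None:
        normal["tag"] = "XABORT"
        cache.append({"addr": address, "tag": "XCOMMIT", "data": normal["data"]})
        return normal["data"], True

    # 3. Miss: go to the bus.
    reply = t_read(address)
    if reply is BUSY:
        # Abort: drop XABORT entries, turn XCOMMIT entries back into NORMAL.
        cache[:] = [e for e in cache if e["tag"] != "XABORT"]
        for entry in cache:
            if entry["tag"] == "XCOMMIT":
                entry["tag"] = "NORMAL"
        return None, False  # the value is meaningless after an abort

    for tag in ("XABORT", "XCOMMIT"):
        cache.append({"addr": address, "tag": tag, "data": reply})
    return reply, True
```

A caller would check the second element of the result (or issue VALIDATE) before trusting the value, since an aborted load returns no usable data.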

LTX and ST Actions

• Same as LT, except
– Use T_RFO on a miss rather than T_READ
– For ST, the XABORT entry is updated

More Exciting Actions

• VALIDATE
– Returns the TSTATUS flag
– If it is false, sets TSTATUS to true and TACTIVE to false
• ABORT
– Discards the tentative cache entries, sets TSTATUS to true and TACTIVE to false
• COMMIT
– Returns TSTATUS, sets TSTATUS to true and TACTIVE to false
– Drops all XCOMMIT entries and changes all XABORT entries to NORMAL
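The flag handling for these three instructions can be sketched as a tiny state machine. `Processor` and its `conflict` method are hypothetical: `conflict` models the abort triggered by a BUSY response, and the cache is reduced to a list of tag strings, which is an assumption made to keep the example short.

```python
class Processor:
    """Tracks TACTIVE, TSTATUS and the transactional tags of cached entries."""

    def __init__(self, tags=()):
        self.tactive = False
        self.tstatus = True
        self.tags = list(tags)

    def _finish(self, keep, drop):
        """Drop entries tagged `drop`, retag `keep` as NORMAL, reset both flags."""
        self.tags = ["NORMAL" if t == keep else t for t in self.tags if t != drop]
        self.tstatus, self.tactive = True, False

    def conflict(self):
        """Hypothetical hook: a BUSY response aborts the running transaction."""
        self.tags = ["NORMAL" if t == "XCOMMIT" else t
                     for t in self.tags if t != "XABORT"]
        self.tstatus = False

    def validate(self):
        status = self.tstatus
        if not status:
            # The transaction has aborted: reset the flags for the next attempt.
            self.tstatus, self.tactive = True, False
        return status

    def abort(self):
        # Keep the original values (XCOMMIT), throw away tentative ones (XABORT).
        self._finish(keep="XCOMMIT", drop="XABORT")

    def commit(self):
        status = self.tstatus
        # After an earlier abort no XABORT/XCOMMIT entries remain, so this
        # retagging only has an effect when the commit succeeds.
        self._finish(keep="XABORT", drop="XCOMMIT")
        return status
```

Note that a COMMIT issued after a conflict returns false and leaves the restored NORMAL entries untouched, which matches the slide's unconditional wording.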

Snoopy Cache Actions

• The regular cache acts like a MESI invalidate-based cache; it treats T_READ the same as READ and T_RFO the same as RFO
• Transactional cache
– Non-transactional cycles: acts like the regular cache, using NORMAL entries only
– T_READ: if the entry is valid (shared), returns the value
– All other transactional cycles: responds BUSY
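The transactional cache's snoop response can be summarized as a decision function. This is a sketch, assuming the cache holds an entry for the snooped address; the function name, the cycle strings, the state string `"VALID"`, and the returned labels are all hypothetical shorthand rather than protocol signals.

```python
def transactional_cache_snoop(cycle, tag, state):
    """Classify the transactional cache's reaction to a snooped bus cycle.

    cycle : "READ", "RFO", "T_READ" or "T_RFO"
    tag   : transactional tag of the matching entry
    state : coherence state of the matching entry, e.g. "VALID" (shared)
    """
    if cycle in ("READ", "RFO"):
        # Non-transactional cycle: only NORMAL entries take part.
        return "regular" if tag == "NORMAL" else "ignore"
    if cycle == "T_READ" and state == "VALID":
        return "data"   # supply the value
    return "BUSY"       # refuse; the requester will abort
```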

Simulation

• Proteus simulator
• 32 processors
• Regular cache
– Direct mapped, 2048 8-byte lines
• Transactional cache
– Fully associative, 64 8-byte lines
• Single-cycle cache access
• 4-cycle memory access
• Both snoopy-bus and directory protocols are simulated
• 2-stage network with a switch delay of 1 cycle per stage
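For a sense of scale, the cache sizes work out as follows. The capacities follow directly from the line counts; the bound on lines per transaction is an inference that assumes every transactionally accessed line occupies two entries (XCOMMIT and XABORT), and it is not a figure quoted in the slides.

```python
LINE_BYTES = 8

regular_cache_bytes = 2048 * LINE_BYTES       # 16384 bytes = 16 KiB
transactional_cache_bytes = 64 * LINE_BYTES   # 512 bytes

# Inferred upper bound, assuming two entries per transactionally accessed line.
max_lines_per_transaction = 64 // 2           # 32 lines
```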

Benchmarks

• Counter
– n processors each increment a shared counter (2^16)/n times
• Producer/consumer buffer
– n/2 processors produce and n/2 processors consume through a shared FIFO
– Ends when 2^16 items have been consumed
• Doubly-linked list
– n processors try to rotate the contents from tail to head
– Ends when 2^16 items have been moved
– The set of shared variables is conditional
– Traditional locking methods can introduce deadlock
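The counter benchmark's work split is easy to reproduce in software. The sketch below is a lock-based baseline using Python threads, not the transactional version and not the simulator setup; `run_counter_benchmark` is a hypothetical name, and it assumes n divides 2^16 evenly, as the powers of two up to 32 do.

```python
import threading


def run_counter_benchmark(n, total=2**16):
    """n workers each increment a shared counter total // n times under a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(total // n):
            with lock:          # mutual exclusion around the read-modify-write
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter
```

Whatever the worker count, the final value is the same 2^16 total; only the contention on the shared counter changes.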

Comparisons

• Competitors
– Transactional memory
– Load-locked/store-conditional (Alpha)
– Spin lock with backoff
– Software queue
– Hardware queue

Counter Result

Producer/Consumer Result

Doubly Linked List Result

Conclusion

• Avoids extra lock variables and lock problems
• Trades deadlock for possible livelock/starvation
• Performance comparable to lock-based techniques when the shared data structure is small

• Relatively easy to implement