Date post: 19-Dec-2015
Transactional Memory Yujia Jin

Transactional Memory

Yujia Jin

Lock and Problems

• Locks are commonly used to protect shared data
• Priority inversion
– A lower-priority process holds a lock needed by a higher-priority process
• Convoy effect
– When the lock holder is interrupted, the other processes are forced to wait
• Deadlock
– Circular dependence between processes acquiring locks, so every process waits for a lock indefinitely

Lock-free

• A shared data structure is lock-free if its operations do not require mutual exclusion
– Multiple processes are not prevented from operating on the same object
+ Avoids the lock problems
- Existing lock-free techniques are implemented in software and do not perform well against their lock-based counterparts

Transactional Memory

• Uses transaction-style operations to operate on lock-free data
• Allows the user to define customized read-modify-write operations on multiple, independent words
• Easy to support in hardware: straightforward extensions to a conventional multiprocessor cache

Transaction Style

• A finite sequence of machine instructions with
– A sequence of reads,
– Computation,
– A sequence of writes, and
– Commit
• Formal properties
– Atomicity, serializability (~ACID)

Access Instructions

• Load-transactional (LT)
– Reads a value from shared memory into a private register
• Load-transactional-exclusive (LTX)
– Same as LT, plus a hint that a write to the location is coming up
• Store-transactional (ST)
– Tentatively writes a value from a private register to shared memory; the new value is not visible to other processors until commit

State Instructions

• Commit
– Tries to make the tentative writes permanent
– Succeeds if no other processor has written to its data set (read set and write set) or read its write set
– On failure, discards all updates to the write set
– Returns whether it succeeded
• Abort
– Discards all updates to the write set
• Validate
– Returns the current transaction status
– If the status is false (the transaction has aborted), discards all updates to the write set
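
The access and state instructions can be illustrated with a small software model. This is only a sketch: the `TxMemory` and `Transaction` classes, their method names, and the per-word version counters are assumptions made for illustration. The slides describe a hardware mechanism built on the cache, not this code, and the model detects only writes by others to a transaction's data set (it does not model another processor reading the write set).

```python
class TxMemory:
    """Toy shared memory: each word has a version bumped on every committed write."""

    def __init__(self, words):
        self.values = dict(words)
        self.versions = {addr: 0 for addr in words}


class Transaction:
    """Software stand-in for LT / ST / VALIDATE / COMMIT / ABORT (hypothetical API)."""

    def __init__(self, mem):
        self.mem = mem
        self.seen = {}        # data set: address -> version at first access
        self.write_set = {}   # tentative writes, invisible to others until commit

    def _track(self, addr):
        self.seen.setdefault(addr, self.mem.versions[addr])

    def _reset(self):
        self.seen.clear()
        self.write_set.clear()

    def lt(self, addr):
        """Load-transactional: read a word, preferring our own tentative value."""
        self._track(addr)
        return self.write_set.get(addr, self.mem.values[addr])

    def st(self, addr, value):
        """Store-transactional: buffer the write; shared memory is untouched."""
        self._track(addr)
        self.write_set[addr] = value

    def abort(self):
        """Discard all tentative updates."""
        self._reset()

    def validate(self):
        """Return the status; on failure also discard the tentative updates."""
        ok = all(self.mem.versions[a] == v for a, v in self.seen.items())
        if not ok:
            self._reset()
        return ok

    def commit(self):
        """Publish the tentative writes if still consistent; report the outcome."""
        if not self.validate():
            return False
        for addr, value in self.write_set.items():
            self.mem.values[addr] = value
            self.mem.versions[addr] += 1
        self._reset()
        return True
```

In this model, two transactions that both update the same word conflict: whichever commits second sees a changed version, fails, and loses its tentative write.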

Typical Transaction

/* keep trying */
while ( true ) {
    /* read variables */
    v1 = LT ( V1 ); …; vn = LT ( Vn );
    /* check consistency */
    if ( ! VALIDATE () ) continue;
    /* compute new values */
    compute ( v1, … , vn );
    /* write tentative values */
    ST ( v1, V1 ); …; ST ( vn, Vn );
    /* try to commit */
    if ( COMMIT () ) return result;
    else backoff;
}
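
The retry pattern above can be tried out in software. The following is a rough sketch, not the hardware scheme: `VersionedCell` is a hypothetical stand-in for one shared word, its internal lock only models the atomicity that the hardware would provide for COMMIT, and the randomized exponential backoff is an assumption (the slide says only "backoff"). With a single variable the read is always consistent, so the VALIDATE step is omitted.

```python
import random
import threading
import time


class VersionedCell:
    """Toy shared word; a guarded compare-and-publish stands in for COMMIT."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._guard = threading.Lock()  # models hardware atomicity only

    def load(self):
        with self._guard:
            return self.value, self.version

    def commit(self, new_value, seen_version):
        with self._guard:
            if self.version != seen_version:
                return False            # another worker committed first
            self.value = new_value
            self.version += 1
            return True


def increment(cell, max_delay=1e-3):
    """Typical transaction: read, compute, try to commit, back off on failure."""
    delay = 1e-6
    while True:                                 # keep trying
        old, version = cell.load()              # read variable
        new = old + 1                           # compute new value
        if cell.commit(new, version):           # try to commit
            return new
        time.sleep(random.uniform(0, delay))    # randomized backoff
        delay = min(2 * delay, max_delay)


def run(workers=4, per_worker=200):
    """Have several threads increment one shared counter; return its final value."""
    cell = VersionedCell(0)

    def work():
        for _ in range(per_worker):
            increment(cell)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.value
```

Because a failed commit never changes the cell, every increment is applied exactly once, so `run(4, 200)` ends with the counter at 800 however the threads interleave.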

Warning…

• Not intended for database use

• Transactions are short-lived

• Transactions touch a small data set

Idea Behind Implementation

• Existing cache coherence protocols already detect accessibility conflicts
• Accessibility conflicts ~ transaction conflicts
• Can be built as an extension to a cache coherence protocol
– Includes snoopy bus and directory protocols

Bus Snoopy Example

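[Diagram: processor, regular cache, transaction cache, bus]
– Regular cache: 2048 8-byte lines, direct mapped
– Transaction cache: 64 8-byte lines, fully associative

• The two caches are exclusive: a line resides in one or the other, not both
• The transaction cache holds tentative writes without propagating them to other processors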

Transaction Cache

• Each cache line carries a separate transactional tag in addition to the coherence protocol tag
– Transactional tag states: EMPTY, NORMAL, XCOMMIT, XABORT
• Two entries per transactionally accessed line
– Modifications are written to the XABORT entry, which is set to EMPTY on abort
– The XCOMMIT entry holds the original value and is set to EMPTY on commit
• Allocation policy, in decreasing order of preference
– EMPTY entries, NORMAL entries, XCOMMIT entries
• Must guarantee a minimum transaction size
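
The allocation order can be written down as a short victim-selection routine. This is a minimal sketch under stated assumptions: the `Tag` enum and `pick_victim` function are illustrative names, and returning `None` when only XABORT entries remain is a choice made here, since the slides do not say what the hardware does in that case.

```python
from enum import Enum


class Tag(Enum):
    EMPTY = 0
    NORMAL = 1
    XCOMMIT = 2
    XABORT = 3


# Replacement preference from the slide: EMPTY first, then NORMAL, then XCOMMIT.
VICTIM_ORDER = (Tag.EMPTY, Tag.NORMAL, Tag.XCOMMIT)


def pick_victim(tags):
    """Return the index of the entry to reuse, or None if every entry is XABORT."""
    for wanted in VICTIM_ORDER:
        for index, tag in enumerate(tags):
            if tag is wanted:
                return index
    return None
```

XABORT entries are never chosen because they hold the only copy of the transaction's tentative values.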

Bus Actions

• T_READ and T_RFO (read for ownership) are added for transactional requests
• A transactional request can be refused by responding BUSY
• When a BUSY response is received, the transaction is aborted
– This prevents deadlock and continual mutual aborts
– Can be subject to starvation

Processor Actions

• The transaction active (TACTIVE) flag indicates whether a transaction is in progress; it is set on the first transactional operation

• The transaction status (TSTATUS) flag indicates whether that transaction is still active (true) or has been aborted (false)

LT Actions

• Check the transactional cache for an XABORT entry; if found, return its value
• If not found, check for a NORMAL entry
– Switch NORMAL to XABORT and allocate an XCOMMIT entry
• If not found, issue T_READ on the bus, then allocate XABORT and XCOMMIT entries
• If T_READ receives BUSY, abort
– Set TSTATUS to false
– Drop all XABORT entries
– Set all XCOMMIT entries to NORMAL
– Return arbitrary data
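
The LT steps can be traced in a toy model. The sketch below makes several assumptions: the cache is a plain list of dictionaries with unlimited capacity, `t_read` is a hypothetical callable standing in for the T_READ bus cycle, `BUSY` is a sentinel object, and `None` plays the role of the arbitrary data returned after an abort.

```python
BUSY = object()  # sentinel for a refused bus request


def find(cache, addr, tag):
    """Return the first entry for `addr` carrying `tag`, or None."""
    for entry in cache:
        if entry["addr"] == addr and entry["tag"] == tag:
            return entry
    return None


def abort(cache, flags):
    """TSTATUS false, drop XABORT entries, turn XCOMMIT entries into NORMAL."""
    flags["TSTATUS"] = False
    cache[:] = [e for e in cache if e["tag"] != "XABORT"]
    for entry in cache:
        if entry["tag"] == "XCOMMIT":
            entry["tag"] = "NORMAL"


def lt(cache, flags, addr, t_read):
    """Load-transactional; `t_read(addr)` returns a value or BUSY."""
    flags["TACTIVE"] = True                    # set on any transactional operation
    hit = find(cache, addr, "XABORT")
    if hit is not None:                        # 1. XABORT entry present
        return hit["value"]
    normal = find(cache, addr, "NORMAL")
    if normal is not None:                     # 2. NORMAL entry present
        normal["tag"] = "XABORT"
        cache.append({"addr": addr, "tag": "XCOMMIT", "value": normal["value"]})
        return normal["value"]
    value = t_read(addr)                       # 3. miss: go to the bus
    if value is BUSY:                          # 4. refused: abort
        abort(cache, flags)
        return None                            # stands in for arbitrary data
    cache.append({"addr": addr, "tag": "XABORT", "value": value})
    cache.append({"addr": addr, "tag": "XCOMMIT", "value": value})
    return value
```

After an abort the cache holds only NORMAL entries with the pre-transaction values, so the transaction can simply be retried.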

LTX and ST Actions

• Same as LT, except
– Use T_RFO on a miss rather than T_READ
– For ST, the XABORT entry is updated

More Exciting Actions

• VALIDATE
– Returns the TSTATUS flag
– If it is false, sets TSTATUS to true and TACTIVE to false
• ABORT
– Discards the tentative cache entries, sets TSTATUS to true and TACTIVE to false
• COMMIT
– Returns TSTATUS, sets TSTATUS to true and TACTIVE to false
– Drops all XCOMMIT entries and changes all XABORT entries to NORMAL
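
The flag handling of these three instructions fits in a few lines. The class below is a sketch under stated assumptions: `TxProcessor` and its attribute names are hypothetical, cache entries are reduced to their transactional tags, and ABORT's cache update is taken to be the same clean-up listed for an aborted LT (drop XABORT, turn XCOMMIT into NORMAL).

```python
class TxProcessor:
    """Toy model of the TACTIVE / TSTATUS flags and the tagged cache entries."""

    def __init__(self):
        self.tactive = False
        self.tstatus = True
        self.tags = []  # transactional tags only: "NORMAL", "XCOMMIT", "XABORT"

    def _reset_flags(self):
        self.tstatus = True
        self.tactive = False

    def validate(self):
        """Return TSTATUS; if it was false, get ready for the next transaction."""
        status = self.tstatus
        if not status:
            self._reset_flags()
        return status

    def abort(self):
        """Tentative (XABORT) entries vanish; originals (XCOMMIT) become NORMAL."""
        self.tags = ["NORMAL" if t == "XCOMMIT" else t
                     for t in self.tags if t != "XABORT"]
        self._reset_flags()

    def commit(self):
        """Originals (XCOMMIT) are dropped; tentative (XABORT) become NORMAL."""
        status = self.tstatus
        self._reset_flags()
        self.tags = ["NORMAL" if t == "XABORT" else t
                     for t in self.tags if t != "XCOMMIT"]
        return status
```

If a BUSY response has already aborted the transaction, the cache was cleaned at that point, so a later COMMIT finds no XCOMMIT or XABORT entries and only reports the failure.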

Snoopy Cache Actions

• The regular cache acts like a MESI invalidate cache; it treats T_READ the same as READ and T_RFO the same as RFO
• Transactional cache
– Non-transactional cycle: acts like the regular cache, considering NORMAL entries only
– T_READ: if the entry is valid (shared), returns the value
– All other cycles: responds BUSY
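
The transactional cache's snoop behaviour amounts to a small decision table. The function below is an illustrative sketch only: its name, the result labels ("IGNORE", "REGULAR", "DATA", "BUSY"), the entry layout, and the "DIRTY" state name used in the example are assumptions, as is the choice to stay silent on a miss.

```python
def transactional_cache_snoop(cycle, entry):
    """Decide how the transactional cache answers a snooped bus cycle.

    cycle: "READ", "RFO", "T_READ" or "T_RFO"
    entry: None on a miss, else a dict with "tag", "state" and "value"
    """
    if entry is None:
        return ("IGNORE", None)            # assumption: nothing cached, no response
    if cycle in ("READ", "RFO"):
        if entry["tag"] == "NORMAL":
            return ("REGULAR", entry)      # handled by the ordinary protocol
        return ("IGNORE", None)            # transactional tags are not considered
    if cycle == "T_READ" and entry["state"] == "VALID":
        return ("DATA", entry["value"])    # shared copy: supply the value
    return ("BUSY", None)                  # every other transactional cycle


shared = {"tag": "XCOMMIT", "state": "VALID", "value": 9}
tentative = {"tag": "XABORT", "state": "DIRTY", "value": 3}
print(transactional_cache_snoop("T_READ", shared))
print(transactional_cache_snoop("T_RFO", shared))
print(transactional_cache_snoop("T_READ", tentative))
```

The BUSY answer is what causes the requesting processor to abort its transaction.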

Simulation

• Proteus simulator
• 32 processors
• Regular cache
– Direct mapped, 2048 8-byte lines
• Transactional cache
– Fully associative, 64 8-byte lines
• Single-cycle cache access
• 4-cycle memory access
• Both snoopy bus and directory protocols are simulated
• 2-stage network with a switch delay of 1 cycle each

Benchmarks

• Counter
– n processors each increment a shared counter (2^16)/n times
• Producer/consumer buffer
– n/2 processors produce and n/2 processors consume through a shared FIFO
– Ends when 2^16 items have been consumed
• Doubly-linked list
– n processors try to rotate the contents from tail to head
– Ends when 2^16 items have been moved
– Which variables are shared is conditional
– Traditional locking methods can introduce deadlock

Comparisons

• Competitors
– Transactional memory
– Load-locked/store-conditional (Alpha)
– Spin lock with backoff
– Software queue
– Hardware queue

Counter Result

Producer/Consumer Result

Doubly Linked List Result

Conclusion

• Avoids extra lock variables and the problems of locking

• Trades deadlock for possible livelock/starvation

• Performance is comparable to lock-based techniques when the shared data structure is small

• Relatively easy to implement

