Hybrid Transactional Memory Sanjeev Kumar, Michael Chu, Christopher Hughes, Partha Kundu, Anthony Nguyen, Intel Labs University of Michigan Intel Labs Intel Labs Intel Labs
Page 1: Hybrid Transactional Memory

Sanjeev Kumar (Intel Labs), Michael Chu (University of Michigan), Christopher Hughes (Intel Labs), Partha Kundu (Intel Labs), Anthony Nguyen (Intel Labs)

Page 2: Promise of Transactional Memory (TM)

Simplify Parallel Programming

1. Easier to program
    - Compose naturally
2. Easier to get parallel performance
3. No deadlocks
4. Maintain consistency in the presence of errors
5. Avoid priority inversion and convoying
6. Supports fault tolerance

With a transaction:

    transaction {
        A = A - 10;
        B = B + 10;
    }

With locks:

    lock(l1); lock(l2);
    A = A - 10;
    B = B + 10;
    unlock(l1); unlock(l2);

Error handling with a transaction:

    ... if ( error ) abort_transaction;

Error handling with locks:

    ... if ( error ) recovery_code();
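The contrast between the two error-handling styles above can be sketched in Python. This is an illustration only, not the scheme from the talk: the "transaction" here is a toy that works on a private copy and publishes it on commit, with no conflict detection. The names `AbortTransaction`, `transfer_with_locks`, and `transfer_with_transaction`, and the choice of a negative balance as the error condition, are assumptions made for the example.

```python
import threading

class AbortTransaction(Exception):
    """Hypothetical stand-in for the slide's abort_transaction."""

def transfer_with_locks(accounts, l1, l2, amount):
    # Lock version: updates are visible immediately, so an error
    # needs hand-written recovery code to undo them.
    with l1, l2:
        accounts["A"] -= amount
        accounts["B"] += amount
        if accounts["A"] < 0:            # assumed error condition
            accounts["A"] += amount      # recovery_code(): manual undo
            accounts["B"] -= amount
            return False
    return True

def transfer_with_transaction(accounts, amount):
    # Toy transaction: work on a private copy, publish only on commit.
    working = dict(accounts)
    try:
        working["A"] -= amount
        working["B"] += amount
        if working["A"] < 0:             # same assumed error condition
            raise AbortTransaction()     # abort_transaction
    except AbortTransaction:
        return False                     # nothing was published
    accounts.update(working)             # commit
    return True
```

In the lock version the undo logic has to mirror every update by hand; in the transactional version an abort simply discards the private copy.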

Page 3: Flavors of Transactional Memory

1. Easier to program
    - Compose naturally
2. Easier to get parallel performance
3. No deadlocks
4. Maintain consistency in the presence of errors
5. Avoid priority inversion and convoying
6. Supports fault tolerance

Flavors labeled on the slide:
- Basic
- Support programmer abort
- Support nonblocking

Our Work: Efficient support for a TM that supports all these features

Page 4: TM Implementations

Requires versioning support and conflict detection

- Hardware approach [ Herlihy’93 ]
    - Bounded number of locations
    - Maintain versions in cache → Low overhead
- Pure-software approach [ Herlihy’03, Harris’03 ]
    - Unbounded number of locations can be accessed within a transaction
    - Slow due to overhead of maintaining multiple copies
        - Potentially orders of magnitude
- Unbounded hardware approach [ Hammond’04, Ananian’05, Rajwar’05, Moore’06 ]
    - Require significant hardware support
    - Discussed in more detail in the paper

Page 5: Hardware vs. Software TM

Hardware Approach
- Low overhead
    - Buffers transactional state in cache
- More concurrency
    - Cache-line granularity
- Bounded resource
- Assembly
- Within a module
- Useful BUT limited to library writers

Software Approach
- High overhead
    - Uses object copying to keep transactional state
- Less concurrency
    - Object granularity
- No resource limits
- High-level languages
- Across modules
- Useful BUT limited to special data structures

Neither is satisfactory for broader use

Page 6: This Work

A Hybrid Transactional Memory Scheme

- Requires modest hardware support
    - Changes are localized
- Supports unbounded number of locations
    - Performance of hardware when within hardware resource limits (low overhead of pure hardware TM)
    - Gracefully fall back to software if the hardware resource limits are exceeded (unbounded resources of pure software TM)
- Experimentally demonstrate effectiveness of our approach
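The fallback idea on this slide can be sketched as plain control flow. The sketch below is a simplification under stated assumptions, not the hybrid scheme itself: it models the hardware resource limit as a cap on the number of locations a transaction updates, it ignores conflict detection and the coexistence of the two modes, and every name in it (`HardwareLimitExceeded`, `run_hardware`, `run_software`, `run_hybrid`, `capacity`) is hypothetical.

```python
class HardwareLimitExceeded(Exception):
    """Hypothetical signal that a transaction does not fit the hardware bounds."""

def run_hardware(memory, updates, capacity):
    # Hardware mode: modify in place, but only if the transaction
    # fits within the (assumed) bounded number of locations.
    if len(updates) > capacity:
        raise HardwareLimitExceeded(len(updates))
    for addr, delta in updates.items():
        memory[addr] += delta

def run_software(memory, updates):
    # Software mode: copy and modify, then publish on commit.
    # No limit on the number of locations, but extra copying.
    shadow = dict(memory)
    for addr, delta in updates.items():
        shadow[addr] += delta
    memory.update(shadow)

def run_hybrid(memory, updates, capacity):
    # Common case first; fall back only when the bound is exceeded.
    try:
        run_hardware(memory, updates, capacity)
        return "hardware"
    except HardwareLimitExceeded:
        run_software(memory, updates)
        return "software"
```

A small transaction reports "hardware", and one that exceeds the cap still completes but reports "software", which is the graceful-fallback behavior the slide describes.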

Page 7: Outline

- Motivation
- Proposed Architectural Support
- Hybrid Transactional Memory
- Performance Evaluation
- Conclusions

Page 8: ISA Extensions

- Start of a Transaction
    - Begin Transaction All ( XBA ) or Select ( XBS )
    - Save Register State ( SSTATE )
    - Specify handler on abort due to conflict ( XHAND )
- During a Transaction
    - Perform memory loads and stores
    - Override defaults ( LDX, STX, LDR, STR )
- On Transaction Abort
    - Explicit Abort Transaction ( XA )
    - Restore Register State ( RSTATE )
- On Transaction Commit
    - Commit Transaction ( XC )

Page 9: Baseline CMP Architecture

Our proposed changes: Modest and Localized

- Modifications to
    - Core
    - L1 $
- No changes to
    - Interconnect
    - Coherence Protocol
    - L2 $
    - Memory

[Figure: baseline CMP with three cores, each with its own L1 $, connected through the interconnect to the L2 $]

Page 10: Hardware Support for TM

Three requirements:
- Maintain two versions
- Detect conflict
    - Same core: Tag
    - Another core: Cache coherence
- Atomic commit and abort

Bounded
- Capacity of TM $
- Associativity of TM $ and L2

[Figure: the core sends regular accesses to the L1 $ (fields: Tag, Data) and transactional accesses to the Transactional $ (fields: Tag, Addl. Tag, Old Data, New Data), with a path to the interconnect]
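The "maintain two versions" and "bounded" points can be illustrated with a small software model of a transactional buffer. This is only a sketch: it keeps an old and a new value per address, as the figure's fields suggest, and enforces a fixed capacity, but it does not model tags, associativity, cache coherence, or conflict detection. The class and method names (`TransactionalBuffer`, `CapacityExceeded`, `load`, `store`, `commit`, `abort`) are assumptions for the example, not the proposed hardware interface.

```python
class CapacityExceeded(Exception):
    """Hypothetical signal: the transaction touched more entries than fit."""

class TransactionalBuffer:
    def __init__(self, memory, capacity):
        self.memory = memory      # dict address -> value, stands in for L1/L2
        self.capacity = capacity  # bounded number of entries
        self.entries = {}         # address -> [old_data, new_data]

    def _entry(self, addr):
        if addr not in self.entries:
            if len(self.entries) >= self.capacity:
                raise CapacityExceeded(addr)
            value = self.memory[addr]
            self.entries[addr] = [value, value]
        return self.entries[addr]

    def load(self, addr):
        # Transactional load: sees this transaction's own new data.
        return self._entry(addr)[1]

    def store(self, addr, value):
        # Transactional store: only the new-data field changes.
        self._entry(addr)[1] = value

    def commit(self):
        # Publish every new value (real hardware would do this atomically).
        for addr, (_old, new) in self.entries.items():
            self.memory[addr] = new
        self.entries.clear()

    def abort(self):
        # Drop the new values; memory still holds the old data.
        self.entries.clear()
```

Until `commit` is called, `memory` is untouched, so an abort costs nothing more than discarding the buffer; touching more addresses than `capacity` raises `CapacityExceeded`, which is the bounded-resource limitation listed above.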

Page 11: Outline

- Motivation
- Proposed Architectural Support
- Hybrid Transactional Memory
    - Existing pure software scheme
    - Our hybrid scheme
- Performance Evaluation
- Conclusions

Page 12: Pure Software TM [ Herlihy’03 ]

- We use this Pure Software TM as a starting point
- Implemented without any special architectural support using two techniques
    - Use copies of objects to keep transactional state
        - Make modifications on the copy during a transaction
    - Add a level of indirection
        - Switch the versions when a transaction is committed

[Figure: the Object Pointer refers to a record holding a State Pointer, Old, and New; the State Pointer refers to the State, and Old and New refer to copies of the Object Contents]

State        Valid Copy
Active       Old
Aborted      Old
Committed    New

Page 13: Pure Software TM Scheme Cont’d

Before accessing an object within a transaction

[Figure: a new (State Pointer, Old, New) record with its own State is installed behind the Object Pointer in place of the previous record (marked X); labels on the figure: Valid Copy, Modify]
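The indirection scheme on the last two slides can be sketched as follows. This is a single-threaded illustration in the spirit of the cited pure software TM, not its actual code: a real implementation installs the new record and changes the state with atomic compare-and-swap, and uses a contention manager instead of always aborting the other transaction. The names `Transaction`, `Locator`, `TMObject`, `open_for_write`, and `commit` are assumptions for the example.

```python
import copy

ACTIVE, ABORTED, COMMITTED = "active", "aborted", "committed"

class Transaction:
    def __init__(self):
        self.state = ACTIVE

class Locator:
    """The (State Pointer, Old, New) record from the figure."""
    def __init__(self, owner, old, new):
        self.owner = owner  # State Pointer: the transaction that made the record
        self.old = old
        self.new = new

    def valid_copy(self):
        # State table: Active -> Old, Aborted -> Old, Committed -> New.
        return self.new if self.owner.state == COMMITTED else self.old

class TMObject:
    def __init__(self, contents):
        creator = Transaction()
        creator.state = COMMITTED
        self.locator = Locator(creator, None, contents)  # the Object Pointer

    def open_for_write(self, txn):
        current = self.locator
        if current.owner is txn:
            return current.new                # already opened by this transaction
        if current.owner.state == ACTIVE:
            current.owner.state = ABORTED     # simplistic policy: abort the other
        valid = current.valid_copy()
        # Copy the valid version; the transaction modifies only the copy.
        fresh = Locator(txn, valid, copy.deepcopy(valid))
        self.locator = fresh                  # real code: compare-and-swap
        return fresh.new

    def read_committed(self):
        return self.locator.valid_copy()

def commit(txn):
    # Switching the state switches which copy is valid.
    if txn.state == ACTIVE:
        txn.state = COMMITTED
        return True
    return False
```

Committing never copies data back: flipping the transaction's state from active to committed is what makes the New copy the valid one for every object the transaction opened.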

Page 14: Our Hybrid Transactional Memory

- Two modes: Hardware and Software mode
    - The two modes need to coexist
    - Non-solution: Make all threads transition modes in lockstep
- Avoid versioning overheads (allocation and copying) in the hardware mode
    - Still incur the indirection overheads
- Tricky because it needs to bridge the hardware and software schemes
    - Hardware mode needs to modify data in-place
        - Pure Software TM assumes data is never modified in-place
    - Different sharing granularity
        - Cache-line (Hardware) vs. Object (Software)
    - Different conflict detection scheme
        - Data (Hardware) vs. State (Software)

Page 15: Hybrid Scheme Example

- In the Software Mode: Copy and Modify
- In the Hardware Mode: Modify in place
- Conflict detected by the threads in the hardware mode

[Figure: the Object Pointer, (State Pointer, Old, New) records, State, and Object Contents, accessed by Thread 1 (HW mode), Thread 2 (HW mode), and Thread 3 (SW mode); one record is marked X]

Page 16: Hybrid Scheme Summary

[Figure: the Object Pointer refers to a (State Pointer, Old, New) record, which refers to the State and to the Object Contents]

Conflict Detection

                             Active Thread Mode
Conflicting Thread Mode      Hardware          Software
Hardware                     Contents          State
Software                     Object Pointer    State

Sharing Granularity

                             Active Thread Mode
Conflicting Thread Mode      Hardware          Software
Hardware                     Cache line        Object
Software                     Object            Object

Page 17: Outline

- Motivation
- Proposed Architectural Support
- Hybrid Transactional Memory
- Performance Evaluation
- Conclusions

Page 18: Experimental Framework

- Infrastructure
    - Cycle-accurate execution-driven multi-core simulator
    - Modified GCC
- Three microbenchmarks
- Two scenarios: Low and High Contention
- Compare four synchronization implementations
    - Lock
    - Pure Hardware Transactional Memory
    - Pure Software Transactional Memory
    - Hybrid Transactional Memory

Page 19: Performance

[Figure: normalized execution time (y-axis, 0 to 6) versus number of cores (x-axis: 1, 2, 4, 8, 16, 32, 64) for Lock, TM Pure Hardware, TM Pure Software, and TM Hybrid. Benchmark: Vector-Reduce. Contention: Low.]

Page 20: Outline

- Motivation
- Proposed Architectural Support
- Hybrid Transactional Memory
- Performance Evaluation
- Conclusions

Page 21: Conclusions

- Transactional Memory is a promising approach
    - Makes parallel programming an easier task
    - Easier to achieve parallel speedup
- Hybrid Transactional Memory approach works
    - Requires only modest hardware support
    - Common case: Good performance for most transactions
    - Uncommon case: Graceful fallback to software mode when a transaction cannot complete within the hardware bounds

Page 22: Questions?

Page 23: Transactions

- A synchronization mechanism to coordinate accesses to shared data by concurrent threads (an alternative to locks)
- Transaction: A group of operations on shared data

    transaction {
        A = A - 10;
        B = B + 10;
        ...
        if (error) abort_transaction;
    }

- An API Enhancement:
    1. Abort in middle of a transaction
        - On encountering an error

Page 24: Transactional Memory (TM)

A transaction satisfies the following properties

1) Atomicity: All-or-nothing
    - On Commit: all operations become visible
    - On Abort: none of the operations are performed
2) Isolation (Serializable)
    - The transactions committed appear to have been performed in some serial order

Additional Properties

3) Optimistic concurrency control
    - Necessary for achieving good parallel speedup
4) Non-blocking (Optional)
    - Avoid Priority Inversion
    - Avoid Convoying

Page 25: Advantage 1: Performance

Locks
- Serialized on locks
- Finer granularity locks help
- Burden on programmer

Transactions
- Optimistically execute concurrently
- Abort and restart on data conflict
- Automatically done by runtime

[Figure: timelines of operations A, B, C, and D. With locks, the operations are bracketed by acquire and release of lock L1. With transactions, the operations run concurrently, and A appears a second time after a data conflict.]
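The "execute optimistically, abort and restart on data conflict" behavior can be sketched with a version check. This is a generic optimistic-concurrency illustration, not the hardware or hybrid mechanism from the talk: it still uses a short lock around the validate-and-publish step, so it is not non-blocking. The names `OptimisticCell`, `atomically`, and `run_demo` are assumptions for the example.

```python
import threading

class OptimisticCell:
    def __init__(self, value):
        self._state = (0, value)               # (version, value), swapped as one unit
        self._commit_lock = threading.Lock()   # guards only validate-and-publish

    def read(self):
        return self._state[1]

    def atomically(self, update):
        """Apply update(value) optimistically; return the number of attempts."""
        attempts = 0
        while True:
            attempts += 1
            version, value = self._state       # snapshot without blocking others
            new_value = update(value)          # speculative work on private data
            with self._commit_lock:
                if self._state[0] == version:  # validate: no commit since snapshot
                    self._state = (version + 1, new_value)
                    return attempts
            # Data conflict: discard new_value and restart.

def run_demo(threads=4, increments=500):
    cell = OptimisticCell(0)

    def worker():
        for _ in range(increments):
            cell.atomically(lambda v: v + 1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return cell.read()
```

The retry loop lives inside `atomically`, so callers never see a conflict; that corresponds to the slide's point that abort and restart are done automatically by the runtime.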

Page 26: Advantage 2: Reduces Bugs

- With locks, programmers need to
    - Remember mapping between shared data and locks that guard them
        - Make sure the appropriate locks are held while accessing shared data
    - Make lock granularity as small as possible
    - Avoid deadlocks due to locks
    - All of these can cause subtle bugs
- With TM, programmer does not have to deal with these problems

Page 27: Other Advantages

- Allows new programming paradigms
    - Simplifies error handling
    - A new style of programming: Speculate and Verify
    - Programmer can abort offending transactions
- Avoids other problems that locks suffer from
    - Priority Inversion: A low-priority thread can grab a lock and block a higher-priority thread
    - Convoying: If a thread holding a lock blocks on a high-latency event (like context-switch or I/O), it can cause other threads to wait for long periods
    - Fault Tolerant: If a process holding a lock dies, other processes will hang forever
    - Runtime system can abort offending transactions

Page 28:

[Figure: normalized execution time (y-axis, 0 to 6) versus number of cores (x-axis: 1, 2, 4, 8, 16, 32, 64) for Lock, TM Pure Hardware, TM Pure Software, and TM Hybrid. Benchmark: Vector-Reduce. Contention: Low.]
