MetaTM & TxLinux

Post on 30-Dec-2015



Hany Ramadan, Christopher Rossbach, Donald Porter, Owen Hofmann, Aditya Bhandari, Emmett Witchel

University of Texas at Austin

TM Background

Transactional programming is an emerging alternative to locks
  Avoids problems such as deadlock
  Avoids performance-complexity tradeoffs

HTM holds the promise of simpler programming and good performance

TM: “What’s the OS got to do with it?”

Lack of realistic workloads (counter, SPLASH-2)
  Will current results hold on real programs?
  Unclear design tradeoffs; feature set unsettled

OS is a real-life, parallel workload

OS will benefit from transactions
  Reduces synchronization complexity
  System-call and interrupt control paths will benefit

Architectural support is needed for OS

[Figure: Average Transaction Count. Bar chart of average transactions per benchmark (y-axis 0 to 3,500,000) comparing Other TMs, Nearest TM, and MetaTM]

Outline

TxLinux
MetaTM
  Goals
  Features
  Interrupt handling
Issue: Stack memory
Experimental results

TxLinux 2.6.16.1

Memory management: slab allocator, zone allocator, various MM structures
Networking: IP routing, socket locking
File system: directory cache, pathname translation

Synchronization primitives: spin-locks, sequence locks, RCU (read-copy-update)

Converted ~30% of dynamic synchronization to transactions

MetaTM: Design goals

HTM model co-designed with TxLinux
  Extensions to x86 ISA
  Architectural support for OS
  Execution-driven simulation

A platform for TM research
  Multiple HTM design points
  Eager & lazy version management
  Eager conflict detection

MetaTM: Model features

Tx demarcation: xbegin, xend
Multiple Tx: xpush, xpop
Version management: commit cost (lazy), abort cost (eager)
Contention management (eager): polite, karma, eruption, timestamp, polka, sizematters
Backoff policy: exponential, linear, random
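The three backoff policies named on this slide (exponential, linear, random) decide how long a restarted transaction waits before it retries. The slide gives no formulas, so the following is only a sketch under stated assumptions: the name `backoff_delay`, the constants `BASE_DELAY` and `MAX_DELAY`, and the choice to draw the random delay uniformly up to the linear bound are all hypothetical and are not taken from MetaTM.

```python
import random

BASE_DELAY = 16    # hypothetical base delay, in cycles (not from the slides)
MAX_DELAY = 4096   # hypothetical upper bound on any delay


def backoff_delay(policy, restarts, rng=random):
    """Return the wait before retrying a transaction restarted `restarts` times."""
    if restarts < 1:
        raise ValueError("restarts must be >= 1")
    if policy == "linear":
        # Delay grows by a fixed step per restart.
        delay = BASE_DELAY * restarts
    elif policy == "exponential":
        # Delay doubles with every restart.
        delay = BASE_DELAY * 2 ** (restarts - 1)
    elif policy == "random":
        # Assumption: uniform draw between 1 and the linear bound.
        delay = rng.randint(1, BASE_DELAY * restarts)
    else:
        raise ValueError(f"unknown policy: {policy}")
    return min(delay, MAX_DELAY)
```

With these assumed constants, three restarts give a wait of 48 under the linear policy and 64 under the exponential one; a randomized delay makes it less likely that two conflicting transactions retry in lockstep.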

TxLinux: Interrupt handling

Question: What happens to active tx on an interrupt?

Interrupt handlers allowed to use transactions

Factors weighing against abort
  Transaction length growing
  Interrupt frequency

Answer: Active transactions are suspended on interrupt

MetaTM: Multiple Tx support

Multiple active transactions on a processor
  At most one running, all others are suspended

Interface
  xpush suspends current transaction
  xpop resumes suspended transaction
  Suspended transactions maintained in LIFO order

New execution context is unrelated to old one
  Same conflict semantics with all other transactions
  May start new transactions
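The xpush/xpop interface described above can be pictured as a per-processor stack of suspended transactions. The toy model below is a sketch, not MetaTM's hardware: the `Processor` class and its fields are hypothetical, transactions are plain integer ids, and letting xpush record "no transaction" as `None` so that every xpush has a matching xpop is an assumption made for this example.

```python
class Processor:
    """Toy model of one CPU's transaction state (hypothetical, for illustration)."""

    def __init__(self):
        self.running = None    # id of the running transaction, or None
        self.suspended = []    # suspended transactions, kept in LIFO order
        self._next_id = 0

    def xbegin(self):
        if self.running is not None:
            raise RuntimeError("nested xbegin is not modeled in this sketch")
        self._next_id += 1
        self.running = self._next_id
        return self.running

    def xend(self):
        if self.running is None:
            raise RuntimeError("xend without a running transaction")
        committed, self.running = self.running, None
        return committed

    def xpush(self):
        # Suspend whatever is running (possibly nothing) and start a fresh,
        # unrelated execution context, e.g. on entry to an interrupt handler.
        self.suspended.append(self.running)
        self.running = None

    def xpop(self):
        # Resume the most recently suspended transaction.
        if self.running is not None:
            raise RuntimeError("xpop while a transaction is still running")
        self.running = self.suspended.pop()
        return self.running
```

A handler that interrupts transaction 1 would call `xpush()`, run its own `xbegin()`/`xend()` pair, and then `xpop()` to put transaction 1 back in the running slot. A nested interrupt simply pushes one level deeper.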


Issue: Stack memory

Transactions can span stack frames
  Why: Retain same flexibility as locks
  Problem: Live stack overwrite (correctness)

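Solution: Stack Pointer Checkpoint

A lexically scoped transaction begins and ends in one stack frame:

    foo()
    {
        atomic
        {
        }
    }

A transaction demarcated with xbegin and xend can begin in one frame and end in another:

    foo()
    {
        bar()
        baz()
    }

    bar() { xbegin }

    baz() { xend }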

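Live stack overwrite

    foo+4:  call bar
    foo+8:  <work>
    foo+12: xend

    bar+0:  xbegin
    bar+4:  ret

    do_IRQ: iret

Tx Reg. Checkpoint
  PC: bar+4
  StkPtr: 0x40
  (other regs..)

[Figure: kernel stack drawn from 0xC0 down to 0x00 with the StkPtr marker, holding the locals of foo, bar's frame with the return address foo+8, and the interrupt state pushed for do_IRQ]

The slide's sequence appears to be: bar executes xbegin with the stack pointer at 0x40, then returns to foo, which releases bar's frame. An interrupt arriving during <work> pushes its state over that dead frame. On a conflict, the transaction restarts at bar+4 with StkPtr restored to 0x40, and ret reads the overwritten slot.

Error: invalid return address

Only interrupts that arrive in kernel mode have this problem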

Live stack overwrite, fixed

Same code and Tx Reg. Checkpoint (PC: bar+4, StkPtr: 0x40) as on the previous slide.

[Figure: the same stack, with the interrupt state for do_IRQ now placed below the checkpointed stack pointer, so bar's frame and its return address foo+8 are intact when the conflict restarts the transaction]

Fixed by setting ESP to Checkpointed ESP on interrupt
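The overwrite and its fix can be replayed in a few lines. This is a simplified model, not the simulator's code: the stack is a dictionary keyed by address, each push takes one 0x40-sized slot to match the addresses drawn on the slides, and the function name `run` and its `fix_enabled` flag are hypothetical.

```python
SLOT = 0x40  # illustrative slot size, matching the 0x40-spaced slide addresses


def run(fix_enabled):
    """Replay foo -> bar (xbegin) -> ret -> interrupt -> conflict -> restart.

    Returns the value that `ret` at bar+4 pops after the restart.
    """
    mem = {0x80: "locals foo"}
    sp = 0x80

    # foo+4: call bar pushes the return address foo+8.
    sp -= SLOT
    mem[sp] = "foo+8"

    # bar+0: xbegin checkpoints the registers, including the stack pointer.
    checkpoint_sp = sp          # 0x40, as on the slide

    # bar+4: ret pops the return address; bar's frame is now dead.
    sp += SLOT

    # An interrupt arrives during <work> in foo.
    saved_sp = sp
    if fix_enabled:
        # The fix: start the interrupt frame at the checkpointed stack pointer.
        sp = min(sp, checkpoint_sp)
    sp -= SLOT
    mem[sp] = "intr state"

    # do_IRQ: iret returns to the interrupted code.
    sp = saved_sp

    # Conflict: the transaction restarts at bar+4 with the checkpointed
    # stack pointer, and ret pops whatever that slot now holds.
    sp = checkpoint_sp
    return mem[sp]
```

Without the fix, the restarted `ret` pops the interrupt state instead of `foo+8`; with it, the interrupt frame lands at 0x00 and the slot at 0x40 still holds the return address. The slot is never restored by the abort because it was written before xbegin, outside the transaction.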


Experiments

Setup
Workloads
System characteristics
  Execution time
  Transaction rates
  Transaction origins
Studies
  Contention management
  Commit & Abort penalties

Setup

Simics 3.0.17
8-processor, x86 system (1 GHz)
Memory hierarchy
  L1: sep D/I, 16KB, 4-way, 1-cycle hit
  L2: 4MB, 8-way, 16-cycle hit, MESI protocol
  Main memory: 1GB, 200-cycle hit
Other devices
  Disk device (DMA, 5.5ms latency)
  Tigon3 gigabit NIC (DMA, 0.1ms latency)

Workloads to exercise TxLinux

counter:    shared-counter micro-benchmark (8 threads)
pmake:      runs make -j 8 to compile files from libFLAC 1.1.2
netcat:     streams data over a TCP network connection
MAB:        simulates software development file system workloads
configure:  8 instances of configure for tetex
find:       8 instances of find on a 78MB directory, searching for text

Note: Only TxLinux creates transactions

Kernel Execution Time

[Figure: bar chart of kernel execution time in seconds (y-axis 0 to 14) for Linux and TxLinux on counter, pmake, netcat, MAB, config, and find]

Workload      counter  pmake  netcat  MAB  config  find
% Kern. time  91%      13%    54%     57%  43%     50%

High kernel time justifies transactions in the OS

Transaction Rates

[Figure: bar chart of transactions per second (log scale, 1 to 1,000,000) for pmake, netcat, MAB, config, and find. Data labels on the chart: 32,486; 16,635; 449,322; 182,072; 121,808]

Workload      pmake  netcat  MAB   config  find
Restart rate  2.6%   3.1%    1.7%  2.1%    10.2%

Find workload has highest contention in TxLinux

Transaction Origins

[Figure: percentage of transactions (y-axis 0 to 100) for pmake, netcat, MAB, config, and find, split by origin: system calls versus interrupts and kthreads]

Kernel locks accessed from both system call and interrupt handling contexts

Contention Management Study

[Figure: normalized restart rate (y-axis 0.00 to 4.00) by contention management policy (eruption, karma, kindergarten, polka, size matters, timestamp), one bar per workload: counter, pmake, netcat, MAB, configure, find]

Polka best performer, but complex to implement; SizeMatters viable

Stall-on-conflict reduces conflicts, but does not always improve performance
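A contention manager decides which of two conflicting transactions proceeds. The slides name the policies without defining them, so the sketch below rests on assumptions: a timestamp policy that favors the older transaction, and a simplified size-based policy that favors the larger working set and falls back to timestamps on a tie. The `Tx` record, its fields, and both function names are hypothetical, and the size-based rule is only loosely inspired by the SizeMatters name, not a statement of how MetaTM implements it.

```python
from dataclasses import dataclass


@dataclass
class Tx:
    name: str
    start_time: int    # smaller value means the transaction started earlier
    working_set: int   # hypothetical size measure, e.g. bytes read or written


def timestamp_winner(a, b):
    """Favor the older transaction."""
    return a if a.start_time <= b.start_time else b


def size_winner(a, b):
    """Favor the transaction with more work at stake; break ties by age."""
    if a.working_set != b.working_set:
        return a if a.working_set > b.working_set else b
    return timestamp_winner(a, b)
```

For example, with `old = Tx("old", 10, 64)` and `big = Tx("big", 20, 512)`, `timestamp_winner(old, big)` picks `old` while `size_winner(old, big)` picks `big`. The losing transaction would then restart, typically after a backoff delay.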

Commit & Abort Study

[Figure: two panels, Commit Cost and Abort Cost, plotting normalized kernel time per workload (counter, pmake, netcat, MAB, configure, find) for penalty values of 0, 100, 1,000, and 10,000. The first panel's y-axis spans 0 to 3, the second's 0.75 to 1.25]

Performance sensitive to commit penalty, not abort

Confirms benefit of eager version management (fast commits)
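Why eager version management gives fast commits while lazy version management gives fast aborts can be seen in a small software model. This is a sketch of the two general techniques, not of MetaTM's hardware: the classes `EagerTx` and `LazyTx` are hypothetical, memory is a plain dictionary, and the number returned by `commit()` and `abort()` is just a count of memory updates performed, used as a stand-in for cost.

```python
_MISSING = object()  # marks an address that had no value before the write


class EagerTx:
    """Eager versioning: write in place, keep old values in an undo log."""

    def __init__(self, memory):
        self.memory = memory
        self.undo_log = []

    def write(self, addr, value):
        self.undo_log.append((addr, self.memory.get(addr, _MISSING)))
        self.memory[addr] = value

    def commit(self):
        self.undo_log.clear()   # new values are already in place
        return 0

    def abort(self):
        work = len(self.undo_log)
        for addr, old in reversed(self.undo_log):
            if old is _MISSING:
                del self.memory[addr]
            else:
                self.memory[addr] = old
        self.undo_log.clear()
        return work


class LazyTx:
    """Lazy versioning: buffer writes, publish them at commit."""

    def __init__(self, memory):
        self.memory = memory
        self.write_buffer = {}

    def write(self, addr, value):
        self.write_buffer[addr] = value

    def commit(self):
        work = len(self.write_buffer)
        self.memory.update(self.write_buffer)
        self.write_buffer.clear()
        return work

    def abort(self):
        self.write_buffer.clear()   # memory was never touched
        return 0
```

In this model an eager commit does no copying and an eager abort replays the undo log, while a lazy commit copies every buffered write and a lazy abort is free. When commits far outnumber restarts, as the restart rates reported earlier suggest, the commit path is the one worth making cheap.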

Related Work

TM models: TCC [Hammond04], UTM [Ananian05], LogTM [Moore06], VTM [Rajwar05]

Suspension techniques: escape actions [Zilles06], which can't start transactions

Interrupt handling: XTM [Chung06], which also tries to avoid aborts

Contention management: Scherer & Scott [PODC’05], in an STM context

Conclusions

TM needs realistic workloads
  TxLinux the largest TM benchmark

OS needs TM
  Complex synchronization; large % of runtime

Building & running TxLinux reveals much
  Architectural support needed (Tx suspension)
  Contention management is important
  Cost studies confirm fast commits

… more in the paper