Post on 21-Mar-2016
MetaTM & TxLinux
Hany Ramadan, Christopher Rossbach, Donald Porter, Owen Hofmann, Aditya Bhandari, Emmett Witchel
University of Texas at Austin
TM Background
Transactional programming is an emerging alternative to locks
  Avoids problems such as deadlock
  Avoids performance-complexity tradeoffs
Hardware TM (HTM) holds the promise of simpler programming and good performance
TM: “What’s the OS got to do with it?”
Lack of realistic workloads (counter, SPLASH-2)
  Will current results hold on real programs?
  Unclear design tradeoffs; feature set unsettled
OS is a real-life, parallel workload
OS will benefit from transactions
  Reduces synchronization complexity
  System-call and interrupt control paths will benefit
Architectural support is needed for the OS
[Figure: Average Transaction Count. Bars compare Other TMs, Nearest TM, and MetaTM; y-axis: average Tx/benchmark, 0 to 3,500,000.]
Outline
  TxLinux
  MetaTM
  Goals
  Features
  Interrupt handling
  Issue: Stack memory
  Experimental results
TxLinux 2.6.16.1
Kernel subsystems
  Memory management: slab allocator, zone allocator, various MM structures
  Networking: IP routing, socket locking
  File system: directory cache, pathname translation
Synchronization primitives: spin-locks, sequence locks, RCU (read-copy-update)
Converted ~30% of dynamic synchronization to transactions
MetaTM: Design goals
HTM model co-designed with TxLinux
  Extensions to x86 ISA
  Architectural support for OS
  Execution-driven simulation
A platform for TM research
  Multiple HTM design points
  Eager & lazy version management
  Eager conflict detection
MetaTM: Model features
  Tx demarcation: xbegin, xend
  Multiple Tx: xpush, xpop
  Version management: commit cost (lazy), abort cost (eager)
  Contention management (eager): polite, karma, eruption, timestamp, polka, sizematters
  Backoff policy: exponential, linear, random
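The slide names three backoff policies without giving formulas. The following is a minimal sketch of what such policies commonly look like, not MetaTM's definition: the function name `backoff_cycles`, the `base` and `cap` parameters, and the formulas themselves are all assumptions made for illustration.

```python
import random


def backoff_cycles(policy, attempt, base=16, cap=4096, rng=random):
    """Return a delay, in cycles, before retry number `attempt` (1-based).

    All constants and formulas are illustrative assumptions; the slides
    only name the policies "exponential", "linear", and "random".
    """
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    if policy == "linear":
        # Delay grows by a fixed step with each restart.
        delay = base * attempt
    elif policy == "exponential":
        # Delay doubles with each restart.
        delay = base * 2 ** (attempt - 1)
    elif policy == "random":
        # Uniform delay drawn from a window that widens linearly.
        delay = rng.randint(1, base * attempt)
    else:
        raise ValueError(f"unknown policy: {policy}")
    # Clamp so that a long restart chain cannot stall indefinitely.
    return min(delay, cap)
```

With these assumed defaults, a third restart waits 48 cycles under the linear policy and 64 under the exponential one.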
TxLinux: Interrupt handling
Question: What happens to an active tx on an interrupt?
Interrupt handlers are allowed to use transactions
Factors weighing against abort
  Transaction length is growing
  Interrupt frequency
Answer: Active transactions are suspended on interrupt
MetaTM: Multiple Tx support
Multiple active transactions on a processor
  At most one running, all others are suspended
Interface
  xpush suspends the current transaction
  xpop resumes a suspended transaction
  Suspended transactions are maintained in LIFO order
New execution context is unrelated to the old one
  Same conflict semantics with all other transactions
  May start new transactions
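The xpush/xpop interface can be pictured as a per-processor stack of suspended transactions. The sketch below is a toy software model of that bookkeeping only; the class `Cpu`, its fields, and the choice to push an empty entry when no transaction is active are assumptions for illustration, and nesting and conflict detection are not modelled.

```python
class Cpu:
    """Toy model of one processor's transaction state (hypothetical names)."""

    def __init__(self):
        self.running = None   # at most one running transaction
        self.suspended = []   # suspended transactions, kept in LIFO order

    def xbegin(self, name):
        # Nesting is outside the scope of this sketch.
        if self.running is not None:
            raise RuntimeError("a transaction is already running")
        self.running = name

    def xend(self):
        if self.running is None:
            raise RuntimeError("no running transaction to commit")
        committed = self.running
        self.running = None
        return committed

    def xpush(self):
        # Suspend whatever is running (possibly nothing) and start a
        # fresh, unrelated execution context.
        self.suspended.append(self.running)
        self.running = None

    def xpop(self):
        # Resume the most recently suspended transaction.
        if self.running is not None:
            raise RuntimeError("finish the current transaction before xpop")
        self.running = self.suspended.pop()
```

In this model an interrupt handler would call `xpush()` on entry, optionally run its own `xbegin`/`xend` pair, and call `xpop()` before returning, which restores the interrupted transaction even when interrupts nest.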
Issue: Stack memory
Transactions can span stack frames
  Why: Retain the same flexibility as locks
  Problem: Live stack overwrite (correctness)
  Solution: Stack pointer checkpoint

  foo() { atomic { } }

  foo() { bar(); baz(); }
  bar() { xbegin }
  baz() { xend }
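Live stack overwrite
Code
  foo+4:  call bar
  foo+8:  <work>
  foo+12: xend
  bar+0:  xbegin
  bar+4:  ret
  do_IRQ: iret
Tx register checkpoint: PC: bar+4, StkPtr: 0x40, (other regs..)
[Figure: Stack from 0xC0 down to 0x00, holding foo's locals and bar's return address (foo+8). After bar returns, the interrupt state for do_IRQ is written over the slot that held foo+8. A conflict then restarts the transaction from the checkpoint.]
Error: invalid return address
Only interrupts that arrive in kernel mode have this problem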
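Live stack overwrite, fixed
Code
  foo+4:  call bar
  foo+8:  <work>
  foo+12: xend
  bar+0:  xbegin
  bar+4:  ret
  do_IRQ: iret
Tx register checkpoint: PC: bar+4, StkPtr: 0x40, (other regs..)
[Figure: Stack from 0xC0 down to 0x00. The interrupt state for do_IRQ is placed below the checkpointed StkPtr (0x40), so foo's locals and bar's return address (foo+8) are intact when a conflict restarts the transaction.]
Fixed by setting ESP to the checkpointed ESP on interrupt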
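The overwrite and its fix can be replayed in a few lines. This is a toy simulation under stated assumptions, not MetaTM's mechanism: the stack is a dictionary with one slot per 0x40 bytes to match the slide's addresses, the function name `value_popped_after_restart` is hypothetical, and the interrupt handler's write is treated as non-transactional because the interrupted transaction is suspended.

```python
SLOT = 0x40  # one stack slot per 0x40 bytes, matching the slide's addresses


def value_popped_after_restart(fix_enabled):
    """Replay the slide's scenario and return what `ret` at bar+4 pops
    after the transaction restarts from its register checkpoint."""
    stack = {}
    sp = 0xC0

    # foo's frame.
    sp -= SLOT
    stack[sp] = "locals of foo"          # slot at 0x80

    # foo+4: call bar pushes the return address.
    sp -= SLOT
    stack[sp] = "foo+8"                  # slot at 0x40

    # bar+0: xbegin checkpoints the registers, including the stack pointer.
    checkpoint = {"pc": "bar+4", "sp": sp}

    # bar+4: ret pops the return address; the slot at 0x40 is now dead
    # for the running code but still live for the checkpoint.
    sp += SLOT                           # back to 0x80

    # An interrupt arrives during foo+8. With the fix, the handler starts
    # from the checkpointed stack pointer instead of the current one.
    irq_sp = checkpoint["sp"] if fix_enabled else sp
    irq_sp -= SLOT
    stack[irq_sp] = "intr state"         # written outside the transaction

    # A conflict aborts the transaction: registers are restored and
    # execution resumes at bar+4, whose `ret` reads the slot at sp.
    sp = checkpoint["sp"]
    return stack[sp]
```

Without the fix the function returns the interrupt state, which is the invalid return address on the slide; with the fix it returns `foo+8`. The sketch covers only the case shown, where the running stack pointer is above the checkpointed one.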
Experiments
Setup
Workloads
System characteristics
  Execution time
  Transaction rates
  Transaction origins
Studies
  Contention management
  Commit & abort penalties
Setup
Simics 3.0.17
8-processor x86 system (1 GHz)
Memory hierarchy
  L1: separate D/I, 16KB, 4-way, 1-cycle hit
  L2: 4MB, 8-way, 16-cycle hit, MESI protocol
  Main memory: 1GB, 200-cycle hit
Other devices
  Disk device (DMA, 5.5ms latency)
  Tigon3 gigabit NIC (DMA, 0.1ms latency)
Workloads to exercise TxLinux
  counter    shared counter micro-benchmark (8 threads)
  pmake      runs make -j 8 to compile files from libFLAC 1.1.2
  netcat     streams data over a TCP network connection
  MAB        simulates software development file system workloads
  configure  8 instances of configure for teTeX
  find       8 instances of find on a 78MB directory, searching for text
Note: Only TxLinux creates transactions
Kernel Execution Time
[Figure: Kernel execution time in seconds (y-axis 0 to 14) for Linux and TxLinux on counter, pmake, netcat, MAB, config, find.]
  Benchmark     counter  pmake  netcat  MAB  config  find
  % Kern. time  91%      13%    54%     57%  43%     50%
High kernel time justifies transactions in the OS
Transaction Rates
[Figure: Transactions per second (log scale, 1 to 1,000,000) for pmake, netcat, MAB, config, find. Bar labels: 32,486; 16,635; 449,322; 182,072; 121,808.]
  Benchmark     pmake  netcat  MAB   config  find
  Restart rate  2.6%   3.1%    1.7%  2.1%    10.2%
Find workload has the highest contention in TxLinux
Transaction Origins
[Figure: Percentage of transactions (0 to 100) originating in system calls versus interrupts and kthreads, for pmake, netcat, MAB, config, find.]
Kernel locks are accessed from both system call and interrupt handling contexts
Contention Management Study
[Figure: Normalized restart rate (y-axis 0.00 to 4.00) by policy (eruption, karma, kindergarten, polka, size matters, timestamp) for counter, pmake, netcat, MAB, configure, find.]
Polka is the best performer, but complex to implement; SizeMatters is viable
Stall-on-conflict reduces conflicts, but does not always improve performance
Commit & Abort Study
[Figure, two panels: Commit Cost and Abort Cost. Each plots normalized kernel time (relative system time) for counter, pmake, netcat, MAB, configure, find at penalty values of 0, 100, 1,000, and 10,000. Y-axis ranges: 0 to 3 and 0.75 to 1.25.]
Performance is sensitive to commit penalty, not abort
Confirms the benefit of eager version management (fast commits)
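One way to see why the commit penalty dominates is a back-of-the-envelope model: every committed transaction pays the commit cost, while the abort cost is paid only on restarts. The sketch below is an assumed toy model, not the methodology of the study; the function name `penalty_per_commit` is hypothetical, the penalties are assumed to be in cycles, and the restart rate is assumed to mean restarts per committed transaction.

```python
def penalty_per_commit(commit_cost, abort_cost, restart_rate):
    """Expected penalty per committed transaction under a toy model.

    Assumptions (not from the slides): each commit pays `commit_cost`
    once, and each transaction restarts `restart_rate` times on average,
    paying `abort_cost` per restart.
    """
    return commit_cost + restart_rate * abort_cost


# find has the highest restart rate reported in the talk (10.2%).
find_commit_heavy = penalty_per_commit(1000, 0, 0.102)
find_abort_heavy = penalty_per_commit(0, 1000, 0.102)
```

Under these assumptions a 1,000-cycle commit penalty costs find the full 1,000 cycles per transaction, while a 1,000-cycle abort penalty costs about 102; the gap is wider for workloads with lower restart rates.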
Related Work
TM models
  TCC [Hammond04], UTM [Ananian05], LogTM [Moore06], VTM [Rajwar05]
Suspension techniques
  Escape actions [Zilles06]: can’t start tx
Interrupt handling
  XTM [Chung06]: also tries to avoid aborts
Contention management
  Scherer & Scott [PODC’05]: in STM context
Conclusions
TM needs realistic workloads
  TxLinux is the largest TM benchmark
OS needs TM
  Complex synchronization; large % of runtime
Building & running TxLinux reveals much
  Architectural support needed (Tx suspension)
  Contention management is important
  Cost studies confirm the benefit of fast commits
… more in the paper