11/2/2011 1
The Ordering Requirements of Relativistic and
Reader-Writer Locking Approaches to Shared Data Access
Philip W. Howard
with
Jonathan Walpole, Josh Triplett, Paul E. McKenney
11/2/2011 2
Outline
• The Problem
• The RWL Solution
• The RP Solution
• Other problems and their solutions
• Multiple Writers
• Performance
11/2/2011 3
The Problem
A B C
A C
11/2/2011 4
The Problem
Reader 1: obtain ref, drop ref
Writer: unlink, reclaim
Reader 2: obtain ref, drop ref
11/2/2011 5
The Problem
void init(stuff *node)
{
node->a = COMPUTE_A;
node->b = COMPUTE_B;
node->initd = TRUE;
}
while (!node->initd)
{}
if (node->a) drop_nuke();
if (node->b) drop_napalm();
11/2/2011 6
The Problem
void init(stuff *node)
{
int value;
value = some_computation;
node->a = COMPUTE_A;
node->b = COMPUTE_B;
node->initd = value;
}
11/2/2011 7
The Problem
void init(stuff *node)
{
node->a = COMPUTE_A;
node->b = COMPUTE_B;
node->initd = TRUE;
}
while (!node->initd)
{}
if (node->a) drop_nuke();
if (node->b) drop_napalm();
11/2/2011 8
The Problem
• Compiler reordering
• CPU reordering
• Memory System reordering
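One way to rule out all three kinds of reordering for the init example is a release store paired with an acquire load. The following is a minimal sketch in C11, not the code from the talk: it assumes `initd` can be declared `atomic_bool`, and the names `init_node` and `read_sum_when_ready` are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int a;
    int b;
    atomic_bool initd;   /* assumption: the flag is made atomic */
} stuff;

/* Writer: the release store keeps the stores to a and b from being
 * reordered after the store to initd (by compiler or CPU). */
void init_node(stuff *node, int a, int b)
{
    assert(node != NULL);
    node->a = a;
    node->b = b;
    atomic_store_explicit(&node->initd, true, memory_order_release);
}

/* Reader: the acquire load keeps the reads of a and b from being
 * reordered before the read of initd. */
int read_sum_when_ready(stuff *node)
{
    assert(node != NULL);
    while (!atomic_load_explicit(&node->initd, memory_order_acquire)) {
        /* spin until the writer publishes */
    }
    return node->a + node->b;
}
```

A reader that sees `initd` set is then guaranteed to see the values of `a` and `b` written before it.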
11/2/2011 9
How are these dependencies maintained?
Reader 1: obtain ref, drop ref
Writer: unlink, reclaim
Reader 2: obtain ref, drop ref
11/2/2011 10
How does → work?
Thread 1
A: a=1;
Thread 2
B: if (a)
A → B
else
B → A
11/2/2011 11
How does → work?
Thread 1
A: a=1;
mb();
C: c=1;
Thread 2
while (!a)
B: A → B
Thread 3
if (c)
D: A → D
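The three-thread picture above can be approximated with C11 fences. This is a sketch under assumptions: `mb()` is modelled as `atomic_thread_fence(memory_order_seq_cst)`, the reader side adds an acquire fence (the slide does not show one), and the function names `thread1` and `thread3` are illustrative only.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int a;   /* zero-initialized */
static atomic_int c;

/* Thread 1 from the slide: A, then a full barrier, then C. */
void thread1(void)
{
    atomic_store_explicit(&a, 1, memory_order_relaxed);   /* A: a=1; */
    atomic_thread_fence(memory_order_seq_cst);            /* mb();   */
    atomic_store_explicit(&c, 1, memory_order_relaxed);   /* C: c=1; */
}

/* Thread 3 from the slide: returns 1 if c was seen, 0 otherwise.
 * Seeing c implies, through the two fences, that a is also visible. */
int thread3(void)
{
    if (atomic_load_explicit(&c, memory_order_relaxed)) {
        atomic_thread_fence(memory_order_acquire);
        int seen_a = atomic_load_explicit(&a, memory_order_relaxed);
        assert(seen_a == 1);   /* D: A → D */
        return 1;
    }
    return 0;
}
```

Without the fence in `thread1`, the store to `c` could become visible before the store to `a`, and the assertion in `thread3` would no longer be justified.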
11/2/2011 12
With Reader-Writer Locks
obtain ref drop ref
unlink reclaim
read-lock read-unlock
write-lock write-unlock
11/2/2011 13
With Reader-Writer Locks
obtain ref drop ref
unlink reclaim
read-lock read-unlock
write-lock
write-unlock
11/2/2011 14
With Reader-Writer Locks
obtain ref drop ref
unlink reclaim
read-lock
read-unlock
write-lock write-unlock
11/2/2011 15
Locking primitives must
• Impose the semantics of the lock
• Prevent compiler, CPU, or Memory system from reordering operations across them
11/2/2011 16
With Relativistic Programming
obtain ref drop ref
start-wait end-wait
start-read end-read
unlink reclaim
11/2/2011 17
With Relativistic Programming
obtain ref drop ref
start-wait end-wait
start-read end-read
unlink reclaim
11/2/2011 18
RP primitives must
• Impose the semantics of wait-for-readers
• Prevent compiler, CPU, or Memory system from reordering operations across them
11/2/2011 19
Outline
• The Problem
• The RWL Solution
• The RP Solution
• Other problems and their solutions
  • Insert
  • Move Down
  • Move Up
  • General Case
• Multiple Writers
• Performance
11/2/2011 20
Insert
C D E
C E
11/2/2011 21
Insert
Reader 1: obtain ref, deref
Writer: init, link
Reader 2: obtain ref, deref
11/2/2011 22
How does → work?
Thread 1
A: a=1;
mb();
C: c=1;
Thread 2
while (!a)
B: A → B
Thread 3
if (c)
D: A → D
11/2/2011 23
Insert
Use rp-publish to perform link operation
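A possible shape for rp-publish, sketched in C11: the link is a release store, so a reader that sees the new node also sees its initialized fields. This is an assumption about how the primitive could be built, not the implementation from the talk. The names `rp_publish`, `rp_read`, and `insert_after` are illustrative.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct node {
    int key;
    int value;
    struct node *_Atomic next;
} node;

/* Sketch of rp-publish: a store that earlier writes cannot pass. */
static inline void rp_publish(node *_Atomic *slot, node *n)
{
    atomic_store_explicit(slot, n, memory_order_release);
}

/* Sketch of rp-read: a load that later reads cannot pass. */
static inline node *rp_read(node *_Atomic *slot)
{
    return atomic_load_explicit(slot, memory_order_acquire);
}

/* Insert a new node after prev: init first, then link.
 * Returns the new node, or NULL on allocation failure. */
node *insert_after(node *prev, int key, int value)
{
    assert(prev != NULL);
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->key = key;                                  /* init */
    n->value = value;
    atomic_init(&n->next, rp_read(&prev->next));
    rp_publish(&prev->next, n);                    /* link */
    return n;
}
```

In the C, D, E picture, calling `insert_after` on C makes D reachable only after D's key, value, and next pointer are in place.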
11/2/2011 24
Move Down
[Figure: two trees over nodes A through H, before and after the move]
11/2/2011 25
Move Down
[Figure: three trees over nodes A through H: the original tree, the tree with copy F’ linked, and the tree after F is unlinked]
1. init F’
2. link F’
3. unlink F
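The three steps can be sketched on a generic binary tree in C11. This is a simplified illustration under several assumptions: the copy is linked as a leaf, the caller supplies the subtree that takes the old node's place, and `wait_for_readers` is passed in as a callback because its implementation is not shown here. The names `tslot`, `move_down`, and `no_readers` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct tnode {
    int key;
    struct tnode *_Atomic left;
    struct tnode *_Atomic right;
} tnode;

typedef tnode *_Atomic tslot;   /* a child pointer that readers may follow */

/* Placeholder for single-threaded use only: there are no readers to wait for. */
static void no_readers(void) { }

/* Move the node in *old_slot down to *new_slot.
 * replacement is the subtree that takes the old node's place.
 * Returns the copy, or NULL on allocation failure. */
tnode *move_down(tslot *old_slot, tnode *replacement, tslot *new_slot,
                 void (*wait_for_readers)(void))
{
    tnode *f = atomic_load_explicit(old_slot, memory_order_relaxed);
    assert(f != NULL && replacement != NULL && wait_for_readers != NULL);

    tnode *copy = malloc(sizeof *copy);
    if (copy == NULL)
        return NULL;

    copy->key = f->key;                                    /* 1. init F' */
    atomic_init(&copy->left, (tnode *)NULL);
    atomic_init(&copy->right, (tnode *)NULL);

    atomic_store_explicit(new_slot, copy,
                          memory_order_release);           /* 2. link F' */
    atomic_store_explicit(old_slot, replacement,
                          memory_order_release);           /* 3. unlink F */

    wait_for_readers();   /* readers that still hold F must finish first */
    free(f);              /* reclaim F */
    return copy;
}
```

Because the link is stored before the unlink, a reader that observes the unlink with an acquire load also observes the copy lower in the tree.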
11/2/2011 26
Move Down
init(F’) link(F’) unlink(F) reclaim(F)
deref(H) deref(F) deref(D) deref(E)
[Figure: three trees over nodes A through H: the original tree, the tree with copy F’ linked, and the tree after F is unlinked]
11/2/2011 27
Move Down
init(F’) link(F’) unlink(F) reclaim(F)
deref(H) deref(F) deref(D) deref(F’)
[Figure: three trees over nodes A through H: the original tree, the tree with copy F’ linked, and the tree after F is unlinked]
deref(E)
11/2/2011 28
Move Down
init(F’) link(F’) unlink(F) reclaim(F)
deref(H) deref(D) deref(F’)
[Figure: three trees over nodes A through H: the original tree, the tree with copy F’ linked, and the tree after F is unlinked]
deref(E)
11/2/2011 29
RBTree Delete with Swap (Move Up)
[Figure: three red-black trees over nodes A through F: the original tree, the tree with copy C’ linked, and the tree after the swap]
11/2/2011 30
Move Up
init(C’) link(C’) unlink(C) reclaim(C)
deref(F) deref(C’)
[Figure: three red-black trees over nodes A through F: the original tree, the tree with copy C’ linked, and the tree after the swap]
11/2/2011 31
Move Up
init(C’) link(C’) unlink(C) reclaim(C)
deref(F) deref(B) deref(E)
deref(C)
[Figure: three red-black trees over nodes A through F: the original tree, the tree with copy C’ linked, and the tree after the swap]
11/2/2011 32
The General Case
• Mutable vs. Immutable data
11/2/2011 33
The General Case for Writers
• Copy nodes to update immutable data or to make a collection of updates appear atomic
• Use rp-publish to update mutable data
• Use wait-for-readers when updates are in traversal order
A B C
11/2/2011 34
The General Case for Readers
• Use rp-read for reading mutable data
• Only dereference mutable data once
if (node->next->key == key) {
return node->next->value;
}
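The snippet above reads `node->next` twice, so a concurrent writer could change the pointer between the key test and the value read. A minimal sketch of the single-dereference form follows, assuming rp-read can be modelled as an acquire load. The names `rp_read` and `lookup_next` are illustrative.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct node {
    int key;
    int value;
    struct node *_Atomic next;
} node;

/* Sketch of rp-read: one ordered load of a mutable pointer. */
static inline node *rp_read(node *_Atomic *slot)
{
    return atomic_load_explicit(slot, memory_order_acquire);
}

/* If the successor of n has the given key, store its value in *out
 * and return 1. Otherwise return 0. */
int lookup_next(node *n, int key, int *out)
{
    assert(n != NULL && out != NULL);
    node *next = rp_read(&n->next);   /* dereference the mutable pointer once */
    if (next != NULL && next->key == key) {
        *out = next->value;           /* same node as the one just tested */
        return 1;
    }
    return 0;
}
```

The key and the value are now taken from the same node, whatever a writer does to `n->next` in the meantime.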
11/2/2011 35
Outline
• The Problem
• The RWL Solution
• The RP Solution
• Other problems and their solutions
• Multiple Writers
• Performance
11/2/2011 36
Multiple Writers
W
R2
R1
11/2/2011 37
Multiple Writers
W1
Reader
W2
11/2/2011 38
The phone company
Can’t wait to get my new phone
Can’t wait to ditch this phone
Where’s my phone book?
11/2/2011 39
Multiple Writers
W1
Reader
W2
wait-for-readers
W1
Reader
W2
11/2/2011 40
RP vs RWL delays
W1
Reader
W2
RP
W1
Reader
W2
readerpref
W1
Reader
W2
writerpref
W1
Reader
W2
TORP
11/2/2011 41
Trigger Events
RP
RWLR
RWLW
11/2/2011 42
Outline
• The Problem
• The RWL Solution
• The RP Solution
• Other problems and their solutions
• Multiple Writers
• Performance
11/2/2011 43
How does → work?
Thread 1
A: a=1;
mb();
C: c=1;
Thread 2
while (!a)
B: A → B
Thread 3
if (c)
D: A → D
11/2/2011 44
Reader-Writer Locks
read-unlock
write-lock write-unlock
read-lock
read-unlock
read-lock
11/2/2011 45
Reader-Writer Locks
Reader
while (writing)
{}
reading=1;
Writer
while (reading)
{}
writing=1;
11/2/2011 46
Relativistic Programming
start-wait end-wait
start-read end-read
start-read end-read
11/2/2011 47
Relativistic Programming
start-read(i)
{
reader[i]=1;
}
end-read(i)
{
reader[i]=0;
}
start-wait()
{}
end-wait()
{
for (i) {
while (reader[i]) {}
}
}
11/2/2011 48
Relativistic Programming
start-read(i)
{
reader[i]=Epoch;
}
end-read(i)
{
reader[i]=0;
}
start-wait()
{
Epoch++;
my_epoch = Epoch;
}
end-wait()
{
for (i) {
while (reader[i] &&
reader[i] < my_epoch)
{}
}
}
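The epoch-based pseudocode above can be transliterated into C11 atomics roughly as follows. This is a sketch, not a production grace-period implementation: it assumes a fixed number of reader slots (`MAX_READERS`), starts the epoch at 1 so that 0 can mean "not reading", and adds a hypothetical non-blocking helper `readers_pending` that is not on the slide.

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_READERS 8   /* assumption: fixed number of reader slots */

static atomic_ulong epoch = 1;              /* assumption: starts at 1, 0 = idle */
static atomic_ulong reader[MAX_READERS];    /* zero-initialized: nobody reading */

/* start-read: record the epoch in which this reader began. */
void start_read(int i)
{
    assert(i >= 0 && i < MAX_READERS);
    atomic_store(&reader[i], atomic_load(&epoch));
}

/* end-read: mark the slot idle once all reads are done. */
void end_read(int i)
{
    assert(i >= 0 && i < MAX_READERS);
    atomic_store_explicit(&reader[i], 0, memory_order_release);
}

/* start-wait: open a new epoch and return it as my_epoch. */
unsigned long start_wait(void)
{
    return atomic_fetch_add(&epoch, 1) + 1;
}

/* Helper (not on the slide): count readers that began before my_epoch. */
int readers_pending(unsigned long my_epoch)
{
    int pending = 0;
    for (int i = 0; i < MAX_READERS; i++) {
        unsigned long e = atomic_load(&reader[i]);
        if (e != 0 && e < my_epoch)
            pending++;
    }
    return pending;
}

/* end-wait: spin until every reader from an earlier epoch has finished. */
void end_wait(unsigned long my_epoch)
{
    for (int i = 0; i < MAX_READERS; i++) {
        unsigned long e;
        while ((e = atomic_load(&reader[i])) != 0 && e < my_epoch) {
            /* spin */
        }
    }
}
```

Compared with the flag-only version, a writer here waits only for readers that started before its own `start_wait`, so a steady stream of new readers does not delay it.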
11/2/2011 49
Performance Benefits to RP
• Less expensive read-side primitives
• Less serialization / more concurrency
• Less waiting
11/2/2011 50
Benchmarked Methods
nolock No synchronization
rp Relativistic Programming
torp Totally Ordered RP
rwlr Reader-Writer Lock Read preference
rwlw Reader-Writer Lock Write preference
11/2/2011 51
Read Performance (size=1)
[Plot: read throughput in millions of operations/sec (y-axis, 0 to 250) versus number of threads (x-axis, 0 to 20). Series: NOLOCK, RP, RWLW, RWLR]
11/2/2011 52
Read Performance (size=1000)
[Plot: read throughput in millions of operations/sec (y-axis, 0 to 3.5) versus number of threads (x-axis, 0 to 20). Series: NOLOCK, RP, RWLW, RWLR]
11/2/2011 53
Update Scalability
[Plot: operations/sec (y-axis, 0 to 250000) versus list size (x-axis, 1 to 10000). Series: RP writes, TORP writes]
11/2/2011 54
Update Scalability part 2
[Plot: operations/sec (y-axis, 0 to 2000000) versus list size (x-axis, 1 to 10000). Series: RP, TORP]
11/2/2011 55
Update Scalability (part 3)
[Plot: operations/sec (y-axis, 1 to 100000000) versus list size (x-axis, 1 to 1000000). Series: TORP r, RWLW r, TORP, RWLW]
11/2/2011 56
Conclusions
• Correctness can be preserved by limiting allowable orderings
• RP read primitives are less expensive than RWL primitives
• RP allows more concurrency and less waiting
11/2/2011 57
Conclusions
• RP can preserve both correctness and scalability of reads