Princeton University
Performance Isolation and Fairness for Multi-Tenant Cloud Storage
David Shue*, Michael Freedman*, and Anees Shaikh✦
*Princeton ✦IBM Research
Setting: Shared Storage in the Cloud
[Figure: tenant applications (Z, Y, T, F) running in VMs issue requests to shared cloud services — S3, EBS, SQS — and, in particular, to a shared key-value storage tier]
Predictable Performance is Hard
[Figure: tenant requests (Z, Y, T, F) fan out across the shared key-value storage nodes; each tenant's keyspace is partitioned across nodes, and partitions differ in popularity]
- Multiple co-located tenants ⇒ resource contention (fair queuing works on a single "big iron" box, but not across a cluster)
- Distributed system ⇒ distributed resource allocation
- Skewed object popularity ⇒ variable per-node demand
- Disparate workloads (small and large reads and writes, e.g. 10 B vs. 1 kB GETs and SETs) ⇒ different bottleneck resources
Tenants Want System-wide Resource Guarantees
[Figure: tenants (Zynga, Yelp, TP, Foursquare) over the shared key-value store; per-node service rates vary (40–160 kreq/s), while each tenant wants a guarantee on its system-wide rate (e.g. demand_f = 120 kreq/s)]
Pisces Provides Weighted Fair Shares
[Figure: per-tenant weights (w_z = 20%, w_y = 30%, w_t = 10%, w_f = 40%) govern each tenant's system-wide share of the key-value store, even when tenants present equal demand (e.g. demand_z = demand_f = 30%)]
Pisces: Predictable Shared Cloud Storage
• Pisces
  - Per-tenant max-min fair shares of system-wide resources ~ min guarantees with high utilization (sketched below)
  - Arbitrary object popularity
  - Different resource bottlenecks
• Amazon DynamoDB
  - Per-tenant provisioned rates ~ rate limited, non-work-conserving
  - Uniform object popularity
  - Single resource (1 kB requests)
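The fairness goal above can be made concrete with a standard weighted max-min (water-filling) computation. The following is a minimal sketch for intuition only, not Pisces code; the tenant names, capacity, and demand numbers in the example are hypothetical.

```python
def weighted_max_min(capacity, demands, weights):
    """Water-filling computation of weighted max-min fair shares.

    Tenants still wanting more split the remaining capacity in proportion
    to their weights; a tenant whose demand is met returns its unused
    share to the pool (work conservation).
    """
    alloc = {t: 0.0 for t in demands}
    active = set(demands)
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[t] for t in active)
        share = {t: remaining * weights[t] / total_w for t in active}
        satisfied = {t for t in active if share[t] >= demands[t] - alloc[t]}
        if not satisfied:
            # No one saturates: everyone keeps their weighted slice.
            for t in active:
                alloc[t] += share[t]
            break
        for t in satisfied:
            remaining -= demands[t] - alloc[t]
            alloc[t] = demands[t]
            active.remove(t)
    return alloc

# Hypothetical example: 400 kreq/s shared by four tenants with 20/30/10/40% weights.
print(weighted_max_min(400, {"Z": 120, "Y": 200, "T": 50, "F": 120},
                       {"Z": 0.2, "Y": 0.3, "T": 0.1, "F": 0.4}))
```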
Predictable Multi-Tenant Key-Value Storage
[Figure: tenant A and tenant B VMs send GET/SET requests through request routers (RR) to the storage nodes; a controller assigns each tenant a global weight (Weight_A, Weight_B) and drives four mechanisms — partition placement (PP), weight allocation (WA, which splits the global weights into per-node local weights W_A1/W_B1, W_A2/W_B2), replica selection (RS) at the routers, and fair queuing (FQ) at each storage node]
Strawman: Place Partitions Randomly
[Figure: placing partitions at random ignores how popular each partition is, so one storage node can end up hosting too much demand and become overloaded]
Pisces: Place Partitions by Fairness Constraints
[Figure: the controller migrates partitions between storage nodes to satisfy fairness and capacity constraints]
- Collect per-partition tenant demand
- Bin-pack partitions onto nodes (see the sketch below)
- Results in a feasible partition placement
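As a rough illustration of the bin-packing step, here is a first-fit-decreasing sketch. It is a deliberate simplification: Pisces' actual placement also accounts for per-tenant fairness constraints and migration cost, and the function and parameter names below are hypothetical.

```python
def place_partitions(partition_demand, num_nodes, node_capacity):
    """First-fit-decreasing bin packing of partitions onto storage nodes.

    partition_demand: partition id -> measured demand (e.g. kreq/s).
    Each node may absorb at most node_capacity total demand.
    """
    nodes = [{"partitions": [], "load": 0.0} for _ in range(num_nodes)]
    # Pack the hottest partitions first so large items are not stranded.
    for pid, demand in sorted(partition_demand.items(), key=lambda kv: -kv[1]):
        target = next((n for n in nodes if n["load"] + demand <= node_capacity), None)
        if target is None:
            raise ValueError(f"no feasible placement for partition {pid}")
        target["partitions"].append(pid)
        target["load"] += demand
    return nodes
```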
Strawman: Allocate Local Weights Evenly
[Figure: splitting each tenant's global weight evenly across nodes (W_A1 = W_B1, W_A2 = W_B2) ignores where each tenant's demand actually lands, so the node serving the hotter tenant becomes overloaded]
Pisces: Allocate Local Weights by Tenant Demand
[Figure: the controller computes each tenant's per-node +/- demand mismatch and performs a reciprocal weight swap between the maximally mismatched node pair (A←B on one node, A→B on the other), moving from W_A1 = W_B1, W_A2 = W_B2 to W_A1 > W_B1, W_A2 < W_B2]
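To make the reciprocal swap concrete, here is a toy sketch of one exchange round between two tenants. It is illustrative only (the tenant names and fixed step size are assumptions); the policy described in the talk is a maximum-bottleneck-flow weight exchange rather than this greedy pairwise rule.

```python
def reciprocal_weight_swap(weights, demands, step=0.05):
    """One round of demand-driven weight exchange between tenants A and B.

    weights[node][tenant] and demands[node][tenant] are normalized per-node
    shares. Swap `step` weight on the node where A is most under-weighted
    and, reciprocally, on the node where B is most under-weighted, so each
    tenant's global weight is preserved.
    """
    def mismatch(node, tenant):             # > 0 means under-weighted there
        return demands[node][tenant] - weights[node][tenant]

    give_to_a = max(weights, key=lambda n: mismatch(n, "A"))
    give_to_b = max(weights, key=lambda n: mismatch(n, "B"))
    if give_to_a == give_to_b:
        return weights                      # no useful reciprocal swap this round
    weights[give_to_a]["A"] += step; weights[give_to_a]["B"] -= step
    weights[give_to_b]["B"] += step; weights[give_to_b]["A"] -= step
    return weights
```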
Strawman: Select Replicas Evenly
[Figure: the request router splits a tenant's GETs 50/50 across replicas even though the local weights differ (W_A1 > W_B1, W_A2 < W_B2), so requests queue up at the replica where that tenant's local weight is small]
Pisces: Select Replicas by Local Weight
[Figure: the request router detects local weight mismatches via request latency and skews replica selection accordingly, e.g. sending 60% of a tenant's GETs to the replica where its local weight is larger and 40% to the other]
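One way to picture the latency-driven policy (the talk names a FAST-TCP-based scheme) is a per-tenant weighted chooser whose per-replica weights drift with observed latency. The class below is an illustrative sketch under that assumption, not the Pisces implementation.

```python
import random

class ReplicaSelector:
    """Latency-driven weighted replica selection (illustrative sketch)."""

    def __init__(self, replicas, gain=0.1):
        self.weights = {r: 1.0 for r in replicas}   # relative send weights
        self.base = {r: None for r in replicas}     # best latency seen so far
        self.gain = gain

    def observe(self, replica, latency):
        """Shrink a replica's weight when its latency rises above its best."""
        b = self.base[replica]
        self.base[replica] = latency if b is None else min(b, latency)
        ratio = self.base[replica] / latency         # in (0, 1]
        self.weights[replica] *= (1 - self.gain) + self.gain * ratio

    def pick(self):
        """Weighted random choice among replicas."""
        total = sum(self.weights.values())
        x = random.uniform(0, total)
        for replica, w in self.weights.items():
            x -= w
            if x <= 0:
                return replica
        return replica   # numeric edge case: fall back to the last replica
```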
Strawman: Queue Tenants by a Single Resource
[Figure: fair-sharing only one resource (e.g. outbound bytes) between a bandwidth-limited tenant issuing large GETs and a request-limited tenant issuing small GETs lets one tenant crowd the other out on the unscheduled resource]
Pisces: Queue Tenants by Dominant Resource
[Figure: each storage node tracks a per-tenant resource vector (request rate, outbound bytes) and fair-shares dominant resources: the large-GET tenant is bottlenecked by outbound bytes, the small-GET tenant by request rate]
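The scheduling idea can be summarized as: serve the backlogged tenant with the smallest weighted dominant share. The helper below is a simplified sketch of that rule; the scheduler in the talk is a token-based DRR/DRFQ variant, and the names and data structures here are assumptions.

```python
def pick_next_tenant(backlogged, usage, capacity, weights):
    """Pick the backlogged tenant with the smallest weighted dominant share.

    usage[t][r]: tenant t's consumption of resource r so far,
    capacity[r]: the node's capacity for resource r,
    weights[t]:  tenant t's local weight on this node.
    """
    def dominant_share(t):
        return max(usage[t][r] / capacity[r] for r in capacity) / weights[t]
    return min(backlogged, key=dominant_share)
```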
Pisces Mechanisms Solve for Global Fairness

Mechanism                  | Timescale                   | Visibility | Policy
---------------------------|-----------------------------|------------|--------------------------------------------------------
Partition Placement (PP)   | minutes                     | global     | fairness and capacity constraints
Weight Allocation (WA)     | seconds                     | global     | demand-driven, maximum bottleneck flow weight exchange
Replica Selection (RS)     | microseconds (per request)  | local      | FAST-TCP based replica selection policies
Fair Queuing (FQ)          | microseconds (per request)  | local      | DRR token-based DRFQ scheduler
Evaluation
• Does Pisces achieve (even) system-wide fairness?
  - Is each Pisces mechanism necessary for fairness?
  - What is the overhead of using Pisces?
• Does Pisces handle mixed workloads?
• Does Pisces provide weighted system-wide fairness?
• Does Pisces provide local dominant resource fairness?
• Does Pisces handle dynamic demand?
• Does Pisces adapt to changes in object popularity?
Pisces Achieves System-wide Per-tenant Fairness
[Figure: per-tenant throughput for unmodified Membase (0.57 MMR) versus Pisces (0.98 MMR), against an ideal fair share of 110 kreq/s with 1 kB requests]
- Min-Max Ratio (MMR): min tenant rate / max tenant rate, in (0, 1]; 1.0 is perfectly fair
- Setup: 8 tenants, 8 clients, 8 storage nodes; Zipfian object popularity distribution
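For reference, the fairness metric used throughout these results is simple to state in code; this tiny helper is just a restatement of the MMR definition above.

```python
def min_max_ratio(tenant_rates):
    """Min-Max Ratio (MMR): min rate / max rate, in (0, 1]; 1.0 is perfectly fair."""
    return min(tenant_rates) / max(tenant_rates)
```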
Each Pisces Mechanism Contributes to System-wide Fairness and Isolation
[Figure: MMR as mechanisms are added (FQ only, PP+FQ, PP+FQ+WA, full PP+WA+RS+FQ) under equal demand and under 2x vs 1x demand; MMR rises from roughly 0.36–0.59 without the Pisces mechanisms to 0.97–0.98 with the full stack]
Pisces Imposes Low Overhead
[Figure: throughput versus unmodified Membase, annotated with < 5% and > 19% differences for the configurations shown; the headline result is that Pisces adds less than 5% overhead]
Pisces Achieves System-wide Weighted Fairness
[Figure: weighted-share experiment with three tenant classes (4 heavy hitters, 20 with moderate demand, 40 with low demand); the configurations shown achieve MMRs between 0.56 and 0.98]
Pisces Achieves Dominant Resource Fairness
[Figure: bandwidth (Mb/s) and GET request rate (kreq/s) over time for two workloads; the 1 kB, bandwidth-limited workload receives 76% of the bandwidth and the 10 B, request-limited workload receives 76% of the request rate (leaving 24% of the request rate to the 1 kB workload), i.e. each workload gets an equal 76% share of its dominant resource]
Pisces Adapts to Dynamic Demand
[Figure: tenant throughput under constant, bursty, and diurnal demand; tenants with even weights converge to even shares, while the diurnal tenant with 2x weight receives roughly a 2x share as its demand rises]
Conclusion
• Pisces contributions
  - Per-tenant weighted max-min fair shares of system-wide resources with high utilization
  - Arbitrary object distributions
  - Different resource bottlenecks
  - Novel decomposition into 4 complementary mechanisms: PP (Partition Placement), WA (Weight Allocation), RS (Replica Selection), FQ (Fair Queuing)