Predicting Replicated Database Scalability
Sameh Elnikety, Microsoft Research
Steven Dropsho, Google Inc.
Emmanuel Cecchet, Univ. of Mass.
Willy Zwaenepoel, EPFL
Motivation
• Environment
  – E-commerce website
  – DB throughput is 500 tps
• Is 5000 tps achievable?
  – Yes: use 10 replicas
  – Yes: use 16 replicas
  – No: faster machines needed
• How does a tx workload scale on a replicated DB?
[Figure: Single DBMS]
Read Tx
[Figure: Load Balancer in front of Replica 1, Replica 2, and Replica 3; transaction T]
Read tx does not change DB state
Update Tx
[Figure: Load Balancer and Cert with Replica 1, Replica 2, and Replica 3; transaction T and writesets (ws)]
Update tx changes DB state
Additional Replica
[Figure: Load Balancer and Cert with Replica 1, Replica 2, Replica 3, plus Replica 4; transaction T and writesets (ws)]
Coming Up …
• Standalone DBMS
  – Service demands
• Multi-master system
  – Service demands
  – Queuing model
• Experimental validation
Standalone DBMS
• Required
  – readonly tx: R
  – update tx: W
• Transaction load
  – readonly tx: R
  – update tx: W / (1 - A1)
[Figure: Single DBMS]
Abort probability is A1
Submit W / (1 - A1) update tx
Committed tx: W
Aborted tx: W ∙ A1 / (1 - A1)
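The W / (1 - A1) submission rate follows from requiring W committed update tx when each submitted one aborts with probability A1. A short derivation, where S is a symbol introduced here (not on the slide) for the submitted update rate:

```latex
\begin{align*}
S \cdot (1 - A_1) &= W \\
S &= \frac{W}{1 - A_1} \\
\text{aborted} &= S \cdot A_1 = \frac{W \cdot A_1}{1 - A_1}
\end{align*}
```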
Standalone DBMS
[Figure: Single DBMS]
$$\mathrm{Load}(1) = R \cdot \mathit{rc} + \frac{W}{1 - A_1} \cdot \mathit{wc}$$
Multi-Master with N Replicas
• Required (whole system of N replicas)
  – Readonly tx: N ∙ R
  – Update tx: N ∙ W
• Transaction load per replica
  – Readonly tx: R
  – Update tx: W / (1 - AN)
  – Writeset: W ∙ (N - 1)
$$\mathrm{Load}_{MM}(N) = R \cdot \mathit{rc} + \frac{W}{1 - A_N} \cdot \mathit{wc} + W \cdot (N - 1) \cdot \mathit{ws}$$
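The two load formulas can be evaluated side by side. The sketch below is only an illustration: the function names are made up for this example, and the abort probabilities A1 and AN are passed in as plain inputs because the slides do not say how they are obtained.

```python
def standalone_load(R, W, rc, wc, A1):
    """Load(1) = R*rc + W/(1 - A1)*wc for a standalone DBMS."""
    return R * rc + W / (1.0 - A1) * wc


def multimaster_load(R, W, rc, wc, ws, AN, N):
    """Per-replica Load_MM(N) = R*rc + W/(1 - AN)*wc + W*(N - 1)*ws.

    The last term is the cost of applying the writesets produced by
    the other N - 1 replicas.
    """
    return R * rc + W / (1.0 - AN) * wc + W * (N - 1) * ws


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the writeset term growing with N.
    for n in (1, 2, 4, 8):
        print(n, multimaster_load(R=80, W=20, rc=0.010, wc=0.020, ws=0.005, AN=0.01, N=n))
```

With N = 1 and AN = A1 the writeset term vanishes and the multi-master load reduces to the standalone load.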
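MM Service Demand
$$\mathrm{Load}_{MM}(N) = R \cdot \mathit{rc} + \frac{W}{1 - A_N} \cdot \mathit{wc} + W \cdot (N - 1) \cdot \mathit{ws}$$
$$D_{MM}(N) = \mathit{Pr} \cdot \mathit{rc} + \frac{\mathit{Pw}}{1 - A_N} \cdot \mathit{wc} + \mathit{Pw} \cdot (N - 1) \cdot \mathit{ws}$$
Explosive cost!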
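One way to read the step from load to service demand is as a per-transaction normalization. This assumes Pr and Pw are the fractions of readonly and update tx in the mix, which the slides suggest but do not state explicitly:

```latex
\begin{align*}
\mathit{Pr} &= \frac{R}{R + W}, \qquad \mathit{Pw} = \frac{W}{R + W} \\
D_{MM}(N) &= \frac{\mathrm{Load}_{MM}(N)}{R + W}
           = \mathit{Pr} \cdot \mathit{rc} + \frac{\mathit{Pw}}{1 - A_N} \cdot \mathit{wc} + \mathit{Pw} \cdot (N - 1) \cdot \mathit{ws}
\end{align*}
```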
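Compare: Standalone vs MM
• Standalone:
  $$D(1) = \mathit{Pr} \cdot \mathit{rc} + \frac{\mathit{Pw}}{1 - A_1} \cdot \mathit{wc}$$
• Multi-Master:
  $$D_{MM}(N) = \mathit{Pr} \cdot \mathit{rc} + \frac{\mathit{Pw}}{1 - A_N} \cdot \mathit{wc} + \mathit{Pw} \cdot (N - 1) \cdot \mathit{ws}$$
  Explosive cost!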
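A rough way to compare the two demands is a saturation bound: a replica cannot complete more than 1/D tx per second, so N replicas are capped at N / D_MM(N), against 1 / D(1) for the standalone system. The sketch below computes that ratio. It is a back-of-the-envelope estimate under the assumption that each replica behaves like a single service center, not the queuing model the deck goes on to use, and all function names are hypothetical.

```python
def standalone_demand(Pr, Pw, rc, wc, A1):
    """D(1) = Pr*rc + Pw/(1 - A1)*wc."""
    return Pr * rc + Pw / (1.0 - A1) * wc


def multimaster_demand(Pr, Pw, rc, wc, ws, AN, N):
    """D_MM(N) = Pr*rc + Pw/(1 - AN)*wc + Pw*(N - 1)*ws."""
    return Pr * rc + Pw / (1.0 - AN) * wc + Pw * (N - 1) * ws


def speedup_bound(Pr, Pw, rc, wc, ws, A1, AN, N):
    """Ratio of saturation throughputs: (N / D_MM(N)) / (1 / D(1))."""
    d1 = standalone_demand(Pr, Pw, rc, wc, A1)
    dn = multimaster_demand(Pr, Pw, rc, wc, ws, AN, N)
    return N * d1 / dn


if __name__ == "__main__":
    # Hypothetical service demands in milliseconds; abort rates set to zero.
    for n in (1, 2, 4, 8, 16):
        print(n, speedup_bound(Pr=0.8, Pw=0.2, rc=10.0, wc=20.0, ws=5.0, A1=0.0, AN=0.0, N=n))
```

The bound equals N only when the writeset term is zero; any update fraction makes it fall short of N, and the gap widens as N grows.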
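Readonly Workload
• Standalone:
  $$D(1) = \mathit{Pr} \cdot \mathit{rc} + \frac{\mathit{Pw}}{1 - A_1} \cdot \mathit{wc}$$
• Multi-Master:
  $$D_{MM}(N) = \mathit{Pr} \cdot \mathit{rc} + \frac{\mathit{Pw}}{1 - A_N} \cdot \mathit{wc} + \mathit{Pw} \cdot (N - 1) \cdot \mathit{ws}$$
  Explosive cost!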
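For a purely readonly mix the two expressions coincide. A worked specialization, assuming Pr and Pw are mix fractions so that a readonly workload means Pr = 1 and Pw = 0:

```latex
\begin{align*}
\mathit{Pr} = 1,\ \mathit{Pw} = 0: \qquad
D(1) &= \mathit{rc} \\
D_{MM}(N) &= \mathit{rc}
\end{align*}
```

Under that assumption the per-replica demand does not depend on N, since no writesets are produced.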
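Update Workload
• Standalone:
  $$D(1) = \mathit{Pr} \cdot \mathit{rc} + \frac{\mathit{Pw}}{1 - A_1} \cdot \mathit{wc}$$
• Multi-Master:
  $$D_{MM}(N) = \mathit{Pr} \cdot \mathit{rc} + \frac{\mathit{Pw}}{1 - A_N} \cdot \mathit{wc} + \mathit{Pw} \cdot (N - 1) \cdot \mathit{ws}$$
  Explosive cost!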
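At the other extreme, a mix made up only of update tx isolates the terms that grow with N. A worked specialization, again assuming Pr and Pw are mix fractions (here Pr = 0 and Pw = 1):

```latex
\begin{align*}
\mathit{Pr} = 0,\ \mathit{Pw} = 1: \qquad
D(1) &= \frac{\mathit{wc}}{1 - A_1} \\
D_{MM}(N) &= \frac{\mathit{wc}}{1 - A_N} + (N - 1) \cdot \mathit{ws}
\end{align*}
```

Each added replica contributes one more ws of writeset work to every other replica per committed update.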
Closed-Loop Queuing Model
[Figure: closed-loop queuing network with N replicas. Labels: Replica i (CPU, Disk); TT = think time; LB = load balancer & network delay; Cert = certifier delay; Pw]
Mean Value Analysis (MVA)
• Standard algorithm
• Iterates over the number of clients
• Inputs:
  – Number of clients
  – Service demand at service centers
  – Delay time at delay centers
• Outputs:
  – Response time
  – Throughput
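The inputs and outputs listed above map directly onto the textbook exact MVA recursion for a closed, single-class network. The sketch below is that generic algorithm, not the authors' implementation. The function name and argument layout are chosen for this example, and all delay centers (think time, load balancer, certifier) are assumed to be lumped into one total delay.

```python
def mva(num_clients, demands, delay):
    """Exact Mean Value Analysis for a closed, single-class queuing network.

    num_clients: number of clients circulating in the system
    demands:     service demand (seconds) at each queuing center, e.g. CPU, disk
    delay:       total time (seconds) spent at delay centers per cycle
    Returns (throughput, response_time), where response_time excludes the delay.
    """
    queue = [0.0] * len(demands)  # mean queue length at each center
    throughput = 0.0
    response = 0.0
    for n in range(1, num_clients + 1):
        # An arriving client sees the queue lengths of the network with n - 1 clients.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (delay + response)
        # Little's law applied to each center gives the new queue lengths.
        queue = [throughput * r for r in residence]
    return throughput, response


if __name__ == "__main__":
    # Hypothetical demands: 10 ms CPU, 20 ms disk, 1 s of total delay.
    for clients in (1, 10, 50, 100):
        print(clients, mva(clients, [0.010, 0.020], 1.0))
```

Throughput rises with the number of clients until the center with the largest demand saturates, after which extra clients only add response time.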
Using the Model
[Figure: closed-loop queuing network with N replicas. Labels: Replica i (CPU, Disk); TT = think time; LB = load balancer & network delay; Cert = certifier delay; Pw]
Standalone Profiling (Offline)
• Copy of database
• Log all txs, (Pr : Pw)
• Python script replays txs
  – Readonly (rc)
  – Updates (wc)
• Writesets
  – Instrument db with triggers
  – Play txs to log writesets
  – Play writesets (ws)
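The slides do not say how the replay script turns a run into rc, wc, and ws. One standard option is the Utilization Law (demand = resource busy time / completed tx). The sketch below uses it together with a mix count over the tx log. The log format (a list of "read" / "update" labels) and both function names are assumptions made for illustration.

```python
from collections import Counter


def tx_mix(log):
    """Return (Pr, Pw) from a log given as a list of 'read' / 'update' labels."""
    counts = Counter(log)
    total = counts["read"] + counts["update"]
    if total == 0:
        raise ValueError("log contains no transactions")
    return counts["read"] / total, counts["update"] / total


def service_demand(busy_seconds, completed):
    """Utilization Law: D = U / X = (busy / elapsed) / (completed / elapsed)."""
    if completed <= 0:
        raise ValueError("no completed transactions")
    return busy_seconds / completed


if __name__ == "__main__":
    # Hypothetical replay: 1000 readonly tx kept the CPU busy for 5 seconds.
    print(tx_mix(["read"] * 8 + ["update"] * 2))
    print(service_demand(5.0, 1000))
```

Replaying readonly tx, update tx, and writesets separately would give one demand per class (rc, wc, ws) at each resource.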
Using the Model
[Figure: closed-loop queuing network with N replicas. Labels: Replica i (CPU, Disk); TT = think time; LB = load balancer & network delay; Cert = certifier delay; Pw. Annotations: # clients, think time; 1.5 ∙ fsync(); 1 ms]
Experimental Validation
• Compare
  – Measured performance vs model predictions
• Environment
  – Linux cluster running PostgreSQL
• TPC-W workload
  – Browsing (5% update txs)
  – Shopping (20% update txs)
  – Ordering (50% update txs)
• RUBiS workload
  – Browsing (0% update txs)
  – Bidding (20% update txs)
Model Assumptions
• Database system
  – Snapshot isolation
  – No hotspots
  – Low abort rates
• Server system
  – Scalable server (no thrashing)
• Queuing model & MVA
  – Exponential distribution for service demands
Check Out the Paper
• Models
  – Single-Master
  – Multi-Master
• Experimental results
  – TPC-W
  – RUBiS
• Sensitivity analysis
  – Abort rates
  – Certifier delay
Related Work
Urgaonkar, Pacifici, Shenoy, Spreitzer, Tantawi. “An analytical model for multi-tier internet services and its applications.” SIGMETRICS 2005.
Conclusions
• Derived an analytical model
  – Predicts workload scalability
• Implemented replicated systems
  – Multi-master
  – Single-master
• Experimental validation
  – TPC-W
  – RUBiS
  – Throughput predictions match within 15%