ITCS6010/8010 1
Multi-Tenant Databases for SaaS (Software as a Service)
11/17/08
(Some of the Slides are from Alfons Kemper)

Outline
Motivations
Multi-tenancy database (MTD)
MTD database schema layouts
– Existing: basic, private, extension, universal, pivot
– New: chunk table & chunk folding
Querying chunk tables
Summary

SaaS & Multi-Tenancy
SaaS (Software-as-a-Service)
– Software managed by SaaS vendor, run on SaaS server
Tenancy = client organization
– ~10 users, small/medium business
Multi-instance
– separate instance for each tenant
Multi-tenancy
– single instance of software serving multiple tenants

Why Multi-Tenancy
Economy of scale
– large number of tenants → sharing the cost of a single software instance → cost saving for each individual user
Examples?

Multi-Tenancy Example

Multi-Tenancy in Practice
[Figure: # tenants per database (10000, 1000, 100, 10, 1) plotted by complexity of application (small to large) and size of machine (blade to big iron); applications shown: CRM, ERP, Proj Mgmt, Banking]
• Economy of scale decreases with application complexity
• At the sweet spot, compare TCO of 1 versus 100 databases

A Typical Blade Server
[Figure: IBM HS20 (Wikipedia)]
IBM BladeCenter HS12:
• CPU (single, dual, quad-core Xeon 2~3 GHz)
• Memory (max): 24 GB
• Internal storage (max): 293 GB
24 GB / 10,000 tenants = 2.4 MB per tenant

Alternative: Multi-Instance + Virtualization

Multi-Tenant Databases (MTD)
Consolidating multiple businesses onto the same operational system
– Consolidation factor depends on the size of the application and the host machine
Support for schema extensibility
– Essential for ERP applications
Support on top of the database layer
– Non-intrusive implementation
– Query transformation engine maps logical tenant-specific tables to physical tables
– Various problems, for example:
  – Varying table utilization (“hot spots”)
  – Metadata management when handling lots of tables

Classic Web Application (Basic Layout)
Pack multiple tenants into the same tables by adding a tenant id column
Great consolidation but no extensibility

Account
TenId  AcctId  Name  ...
17     1       Acme
17     2       Gump
35     1       Ball
42     1       Big
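
The basic layout can be tried out with a minimal sketch like the one below. It uses Python's built-in sqlite3 module purely for convenience; the helper name `accounts_for` is an assumption made for illustration, and the rows are the ones shown in the Account table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Basic layout: one shared Account table; tenants are told apart by TenId.
conn.execute("CREATE TABLE Account (TenId INTEGER, AcctId INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO Account VALUES (?, ?, ?)",
    [(17, 1, "Acme"), (17, 2, "Gump"), (35, 1, "Ball"), (42, 1, "Big")],
)


def accounts_for(tenant_id):
    """Return the (AcctId, Name) rows that belong to one tenant."""
    cur = conn.execute(
        "SELECT AcctId, Name FROM Account WHERE TenId = ? ORDER BY AcctId",
        (tenant_id,),
    )
    return cur.fetchall()


tenant17 = accounts_for(17)
```

Every tenant query only needs the extra `TenId` predicate, which is why consolidation is good; there is no place to put a tenant-specific column, which is why extensibility is missing.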

Private Table
Each tenant gets its own private schema
– No sharing
– SQL transformation: renaming only
High meta-data/data ratio
– Lots of tables: linear in # of tenants, each with own meta-data
– Low buffer utilization (~8k for index/data for each table)

Account
TenId  AcctId  Name  ...
17     1       Acme
17     2       Gump
35     1       Ball
42     1       Big
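
"Renaming only" could look roughly like the sketch below. The function name `to_private_table` and the naming pattern (logical name followed by the tenant id, as in Account17) are assumptions for illustration; the rewrite is a naive word-level substitution, not a real SQL parser.

```python
import re


def to_private_table(sql, tenant_id, logical_tables):
    """Rename each logical table in `sql` to the tenant's private table."""
    for name in logical_tables:
        # \b keeps longer identifiers (e.g. AccountHistory) untouched.
        sql = re.sub(rf"\b{re.escape(name)}\b", f"{name}{tenant_id}", sql)
    return sql


physical = to_private_table(
    "SELECT Name FROM Account WHERE AcctId = 1", 17, ["Account"]
)
```

Because the substitution is purely textual, a column that happens to share a table's name would be renamed as well; a production transformation layer would work on the parsed query instead.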

CRM Schema

Handling Lots of Tables
Simplifying assumption: No extensibility
Experiment setup:
– CRM schema with 10 tables
– 10,000 tenants are packed onto one DBMS (DB2, 1 GB memory)
– Data set size remains constant
Parameter: Schema Variability
– Number of tenants per schema instance
– 0 (least variable): all tenants share one instance (= 10 tables)
– 1 (most variable): each tenant has separate instance (= 100k tables)

Handling Lots of Tables – Results
[Figure: results ranging from 10 fully shared tables to 100,000 private tables]

Extension Table
Split off the extensions into separate tables
– Additional join at runtime
– Row column for reconstructing the row (discussion: consider Acct17)
Account17(A, N, H, B) =
select Ae.A, Ae.N, Ha.H, Ha.B
from Account_ext as Ae, Healthcare_account as Ha
where Ae.Tenant = Ha.Tenant and Ae.Row = Ha.Row
and Ae.Tenant = 17
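
The query above can be exercised with a small sqlite3 sketch. The table and column names follow the slide; the Hospital (H) and Beds (B) values are made up for illustration, and `Row` is double-quoted because it collides with an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account_ext (Tenant INTEGER, "Row" INTEGER, A INTEGER, N TEXT);
CREATE TABLE Healthcare_account (Tenant INTEGER, "Row" INTEGER, H TEXT, B INTEGER);
INSERT INTO Account_ext VALUES
  (17, 0, 1, 'Acme'), (17, 1, 2, 'Gump'), (35, 0, 1, 'Ball'), (42, 0, 1, 'Big');
-- Only tenant 17 uses the healthcare extension; H and B values are hypothetical.
INSERT INTO Healthcare_account VALUES
  (17, 0, 'St. Mary', 135), (17, 1, 'State', 1042);
""")

# Rebuild the logical table Account17(A, N, H, B) with one join on (Tenant, Row).
ACCOUNT17 = """
SELECT Ae.A, Ae.N, Ha.H, Ha.B
FROM Account_ext AS Ae
JOIN Healthcare_account AS Ha
  ON Ae.Tenant = Ha.Tenant AND Ae."Row" = Ha."Row"
WHERE Ae.Tenant = 17
ORDER BY Ae."Row"
"""
account17 = conn.execute(ACCOUNT17).fetchall()
```

Tenants 35 and 42 have no rows in Healthcare_account, so their logical tables are served from Account_ext alone, without the join.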

Extension Table (Cont’d)
Good: Better consolidation than Private Table layout
– common attributes go to same table
Bad: Number of tables still grows in proportion to number of tenants
– tenants in same domain may often have varied schema
– so large # of *extension* tables (such as Healthcare_account)

Universal Table
Generic structure with VARCHAR value columns
– n-th column of a logical table is mapped to ColN in a universal table
– Extensibility: # of columns may expand as needed
Disadvantages
– Very wide rows → many NULL values
– Not type-safe → casting necessary
– No index support (note: a column index is not very meaningful)
[Figure: logical table (private after renaming)]
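
A rough sqlite3 sketch of the universal layout follows. The table name `Universal`, the four-column width, and the Hospital/Beds values are assumptions for illustration; `Table` is double-quoted because it is an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Universal (
  Tenant INTEGER, "Table" INTEGER,
  Col1 TEXT, Col2 TEXT, Col3 TEXT, Col4 TEXT
)
""")
conn.executemany(
    "INSERT INTO Universal VALUES (?, ?, ?, ?, ?, ?)",
    [
        (17, 0, "1", "Acme", "St. Mary", "135"),
        (17, 0, "2", "Gump", "State", "1042"),
        (35, 1, "1", "Ball", None, None),  # unused columns stay NULL
    ],
)

# Every value is stored as text, so integer fields must be cast back.
account17 = conn.execute("""
SELECT CAST(Col1 AS INTEGER), Col2, Col3, CAST(Col4 AS INTEGER)
FROM Universal
WHERE Tenant = 17 AND "Table" = 0
ORDER BY CAST(Col1 AS INTEGER)
""").fetchall()

null_rows = conn.execute(
    "SELECT COUNT(*) FROM Universal WHERE Col3 IS NULL"
).fetchone()[0]
```

The casts and the NULL padding for tenant 35 are small-scale versions of the two disadvantages listed above.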

Pivot Table
1. Each field of a row in the logical table is given its own row.
2. Multiple pivot tables, one for each type (e.g., int, string)
[Figure: Int and String pivot tables; Row: 0]
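
As a minimal sketch of the two rules above, the function below splits one logical row into per-type pivot rows. The function name `pivot_row`, the tuple order (tenant, table, col, row, value), and the sample values are assumptions for illustration.

```python
def pivot_row(tenant, table, row, values):
    """Split one logical row into per-type pivot rows.

    Returns (int_rows, str_rows); each entry is
    (tenant, table, col, row, value).
    """
    int_rows, str_rows = [], []
    for col, value in enumerate(values):
        # Rule 2: route each field to the pivot table for its type.
        target = int_rows if isinstance(value, int) else str_rows
        # Rule 1: every field becomes its own row.
        target.append((tenant, table, col, row, value))
    return int_rows, str_rows


int_rows, str_rows = pivot_row(17, 0, 0, [1, "Acme", "St. Mary", 135])
```

One logical row with four fields turns into four physical rows, each carrying its own copy of the tenant, table, column, and row identifiers.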

Reconstruct Tenant Table from Pivot Table
[Figure: Int and String pivot tables; Row: 0]
Account17(H, B) =  (only Hospital & Beds)
select Ps.Str, Pi.Int
from Pivot_str as Ps, Pivot_int as Pi
where Ps.Tenant = Pi.Tenant and  -- align
Ps.Table = Pi.Table and  -- align
Ps.Row = Pi.Row and  -- align
Ps.Col = 2 and
Pi.Col = 3 and
Ps.Tenant = 17 and
Ps.Table = 0

Reconstruct Tenant Table (cont’d)
[Figure: Int and String pivot tables; Row: 0]
Account17(A, N, H, B) =
select Pi2.Int, Ps2.Str, Ps.Str, Pi.Int
from Pivot_int as Pi2, Pivot_str as Ps2,
Pivot_str as Ps, Pivot_int as Pi
where Pi2.Tenant = Ps2.Tenant and Ps2.Tenant = Ps.Tenant and
Ps.Tenant = Pi.Tenant and Pi2.Table = Ps2.Table and
Ps2.Table = Ps.Table and Ps.Table = Pi.Table and
Pi2.Row = Ps2.Row and Ps2.Row = Ps.Row and
Ps.Row = Pi.Row and
Pi2.Col = 0 and Ps2.Col = 1 and
Ps.Col = 2 and Pi.Col = 3 and
Ps.Tenant = 17 and Ps.Table = 0
align: same tenant, table, row
4-way join
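
Both reconstruction queries can be checked with a sqlite3 sketch along the following lines. The value column is called `Val` here (an assumption, standing in for the slides' Int and Str columns), the aliases Pa, Pn, Ph, Pb are illustrative, and the Hospital/Beds data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Pivot_int (Tenant INTEGER, "Table" INTEGER, Col INTEGER, "Row" INTEGER, Val INTEGER);
CREATE TABLE Pivot_str (Tenant INTEGER, "Table" INTEGER, Col INTEGER, "Row" INTEGER, Val TEXT);
INSERT INTO Pivot_int VALUES
  (17, 0, 0, 0, 1), (17, 0, 3, 0, 135),
  (17, 0, 0, 1, 2), (17, 0, 3, 1, 1042), (35, 1, 0, 0, 1);
INSERT INTO Pivot_str VALUES
  (17, 0, 1, 0, 'Acme'), (17, 0, 2, 0, 'St. Mary'),
  (17, 0, 1, 1, 'Gump'), (17, 0, 2, 1, 'State'), (35, 1, 1, 0, 'Ball');
""")

# Two columns (Hospital, Beds): a 2-way join aligned on tenant, table, row.
TWO_WAY = """
SELECT Ph.Val, Pb.Val
FROM Pivot_str AS Ph, Pivot_int AS Pb
WHERE Ph.Tenant = Pb.Tenant AND Ph."Table" = Pb."Table" AND Ph."Row" = Pb."Row"
  AND Ph.Col = 2 AND Pb.Col = 3
  AND Ph.Tenant = 17 AND Ph."Table" = 0
ORDER BY Ph."Row"
"""

# All four columns: one pivot-table reference per column, i.e. a 4-way join.
FOUR_WAY = """
SELECT Pa.Val, Pn.Val, Ph.Val, Pb.Val
FROM Pivot_int AS Pa, Pivot_str AS Pn, Pivot_str AS Ph, Pivot_int AS Pb
WHERE Pa.Tenant = 17 AND Pa."Table" = 0 AND Pa.Col = 0
  AND Pn.Tenant = Pa.Tenant AND Pn."Table" = Pa."Table" AND Pn."Row" = Pa."Row" AND Pn.Col = 1
  AND Ph.Tenant = Pa.Tenant AND Ph."Table" = Pa."Table" AND Ph."Row" = Pa."Row" AND Ph.Col = 2
  AND Pb.Tenant = Pa.Tenant AND Pb."Table" = Pa."Table" AND Pb."Row" = Pa."Row" AND Pb.Col = 3
ORDER BY Pa."Row"
"""

hospital_beds = conn.execute(TWO_WAY).fetchall()
account17 = conn.execute(FOUR_WAY).fetchall()
```

The number of pivot-table references grows with the number of columns the query touches, which is the cost the chunk table layout tries to reduce.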

Pivot Table: Performance
Generic type-safe structure
– Eliminates handling many NULL values
Performance
– Depends on the column selectivity of the query (number of reconstructing joins)
– E.g., query on A17(H,B) is more selective than A17(A,N,H,B)
– See previous example
[Figure: Int and String pivot tables; Row: 0]

Pivot Table vs. Chunk Table
Pivot table: one row for each field (Row: 0 → Field 0, Field 1, Field 2, Field 3)
Chunk table: one row for each chunk (Row: 0 → Chunk 0, Chunk 1)

How to Chunk or Fragment Original Rows
Many possible fragmentations
One idea: group fields by their “popularity”
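
One possible reading of the popularity idea is sketched below. The slides do not define popularity, so the access counts, the fixed chunk size, and the function name `chunk_by_popularity` are all assumptions made for illustration.

```python
def chunk_by_popularity(access_counts, chunk_size):
    """Group field names into chunks, most popular fields first."""
    # Sort by descending count; break ties by name for a stable result.
    ranked = sorted(access_counts, key=lambda f: (-access_counts[f], f))
    return [ranked[i:i + chunk_size] for i in range(0, len(ranked), chunk_size)]


# Hypothetical access counts for the fields of Account17.
counts = {"Aid": 90, "Name": 80, "Hospital": 15, "Beds": 10}
chunks = chunk_by_popularity(counts, 2)
```

With these counts the frequently used fields land in chunk 0 and the healthcare fields in chunk 1, so queries on popular fields touch a single chunk.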

Chunk Table
[Figure: Row: 0 stored as Chunk 0 and Chunk 1]
Account17(A, N, H, B) =
select Cis.Int1, Cis.Str1,
Cis2.Str1, Cis2.Int1
from Chunk as Cis, Chunk as Cis2
where Cis.Chunk = 0 and
Cis2.Chunk = 1 and
Cis.Row = Cis2.Row and
Cis.Tenant = Cis2.Tenant and
Cis.Table = Cis2.Table and
Cis.Tenant = 17 and Cis.Table = 0
1. align rows  2. merge chunks  3. reconstruct original row
2-way join
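
The chunk-table reconstruction can be run as a sqlite3 sketch. The single `Chunk` table with one Int1 and one Str1 column follows the slide; the aliases C0 and C1, the tenant and table filter, and the Hospital/Beds values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Chunk (
  Tenant INTEGER, "Table" INTEGER, Chunk INTEGER, "Row" INTEGER,
  Int1 INTEGER, Str1 TEXT
);
-- Chunk 0 holds (Aid, Name); chunk 1 holds (Beds, Hospital).
INSERT INTO Chunk VALUES
  (17, 0, 0, 0, 1, 'Acme'), (17, 0, 1, 0, 135, 'St. Mary'),
  (17, 0, 0, 1, 2, 'Gump'), (17, 0, 1, 1, 1042, 'State'),
  (35, 1, 0, 0, 1, 'Ball');
""")

# Four logical columns now need only a 2-way self-join.
ACCOUNT17 = """
SELECT C0.Int1, C0.Str1, C1.Str1, C1.Int1
FROM Chunk AS C0, Chunk AS C1
WHERE C0.Chunk = 0 AND C1.Chunk = 1
  AND C0."Row" = C1."Row"
  AND C0.Tenant = C1.Tenant AND C0."Table" = C1."Table"
  AND C0.Tenant = 17 AND C0."Table" = 0
ORDER BY C0."Row"
"""
account17 = conn.execute(ACCOUNT17).fetchall()
```

Compared with the pivot layout, the same four-column row is rebuilt from two physical rows instead of four.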

Chunk Table vs. Universal Table
Chunk table (Chunk 0, Chunk 1): # of rows (for each original row) = # of chunks
Universal table (Chunk 0): only one row for each original row
Universal as extreme chunking: only one chunk per row

Chunk Table Performance
Fewer joins for reconstruction
Indexable
Reduced meta-data/data ratio, dependent on chunk size
[Figure: Row: 0 stored as Chunk 0 and Chunk 1]

Chunk Folding: Alternative Chunking
Mixes Extension and Chunk Tables
Optimal row fragmentation depends on, e.g.
– Workload
– Data distribution
– Data popularity
[Figure: base table; chunk the extensions]

Querying Chunk Tables
Query Transformation
– Row reconstruction needs many self- and equi-joins
– Can be automatically translated
Compilation Scheme:
1. Collect all table names and their corresponding columns from the logical source query
2. Obtain table definitions: for each table
a. obtain the Chunk Tables and the meta-data identifiers representing the used columns
b. generate a query that filters the correct columns (based on the meta-data identifiers) and aligns the different chunk relations on their ROW column.
3. Extend each table reference in the logical source query by corresponding table definition (obtained in step 2)
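
The three steps of the compilation scheme can be sketched as follows. This is a naive illustration, not the scheme's actual implementation: the `META` dictionary, the function names, the word-based column detection, and the sample data are all assumptions, and the sketch expects each logical table to be referenced once, unqualified, with at least one named column.

```python
import re
import sqlite3

# Hypothetical meta-data: logical column -> (chunk id, physical column).
META = {
    "Account17": {
        "tenant": 17,
        "table": 0,
        "columns": {"Aid": (0, "Int1"), "Name": (0, "Str1"),
                    "Hospital": (1, "Str1"), "Beds": (1, "Int1")},
    }
}


def collect(sql):
    """Step 1 (naive): known tables and the columns the query mentions."""
    words = set(re.findall(r"\w+", sql))
    return {t: [c for c in META[t]["columns"] if c in words]
            for t in META if t in words}


def table_definition(table, used):
    """Step 2: subquery that filters the needed chunks and aligns them on Row."""
    info = META[table]
    cols = info["columns"]
    chunks = sorted({cols[c][0] for c in used})
    first = f"c{chunks[0]}"
    select = ", ".join(f"c{cols[c][0]}.{cols[c][1]} AS {c}" for c in used)
    sources = ", ".join(f"Chunk AS c{ch}" for ch in chunks)
    tenant, table_id = info["tenant"], info["table"]
    where = [f"{first}.Tenant = {tenant}", f'{first}."Table" = {table_id}']
    for ch in chunks:
        where.append(f"c{ch}.Chunk = {ch}")
        if ch != chunks[0]:
            where += [f"c{ch}.Tenant = {first}.Tenant",
                      f'c{ch}."Table" = {first}."Table"',
                      f'c{ch}."Row" = {first}."Row"']
    return f"(SELECT {select} FROM {sources} WHERE {' AND '.join(where)})"


def compile_query(sql):
    """Step 3: replace each logical table reference by its definition."""
    for table, used in collect(sql).items():
        definition = table_definition(table, used)
        sql = re.sub(rf"\b{table}\b", lambda m: f"{definition} AS {table}", sql)
    return sql


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Chunk (Tenant INTEGER, "Table" INTEGER, Chunk INTEGER, "Row" INTEGER,
                    Int1 INTEGER, Str1 TEXT);
INSERT INTO Chunk VALUES
  (17, 0, 0, 0, 1, 'Acme'), (17, 0, 1, 0, 135, 'St. Mary'),
  (17, 0, 0, 1, 2, 'Gump'), (17, 0, 1, 1, 1042, 'State');
""")

physical = compile_query("SELECT Beds FROM Account17 WHERE Hospital = 'State'")
beds = conn.execute(physical).fetchall()
```

Because Hospital and Beds live in the same chunk in this hypothetical meta-data, the generated definition reads a single chunk and needs no join; a query that also uses Name would pull in chunk 0 and add the alignment predicates on Tenant, Table, and Row.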

Query Example: Step 1
Step 1 result:
tables = {Account17}
columns = {Account17.Beds, Account17.Hospital}

Query Example: Step 2 (Pivot Table)
[Figure: Int and String pivot tables]

Query Example: Step 3 (Chunk Table)
[Figure: Chunk 0 and Chunk 1]

Summary
Multi-tenant databases are critical to scaling SaaS solutions
Varied schema layout schemes
– different degrees of consolidation & extensibility
– optimal layout depends on particular data set, workload, etc.
Novel Chunk Table layout
– wider chunks tend to achieve better performance
– improvement slows down at some point (e.g., |chunk| = 15)
– response time using chunking may approach conventional tables
More work is needed on finding a good layout under multiple factors