
ITCS6010/8010

Multi-Tenant Databases for SaaS (Software as a Service)

11/17/08

(Some of the Slides are from Alfons Kemper)


Outline

Motivations

Multi-tenancy database (MTD)

MTD database schema layouts
– Existing: basic, private, extension, universal, pivot
– New: chunk table & chunk folding

Querying chunk tables

Summary


SaaS & Multi-Tenancy

SaaS (Software-as-a-Service)
– Software managed by the SaaS vendor and run on the SaaS server

Tenant = client organization
– ~10 users, small/medium business

Multi-instance
– separate instance for each tenant

Multi-tenancy
– single instance of software serving multiple tenants


Why Multi-Tenancy

Economy of scale
– large number of tenants
– tenants share the cost of a single software instance, so each individual user saves cost

Examples?


Multi-Tenancy Example

[Figure: multi-tenancy example]


Multi-Tenancy in Practice

[Figure: # tenants per database, plotted against complexity of application (small to large; labels Email, Proj Mgmt, CRM, ERP, Banking) and size of machine (blade, big iron); tick values range from 10000 down to 1]

• Economy of scale decreases with application complexity
• At the sweet spot, compare TCO of 1 versus 100 databases


A Typical Blade Server

IBM HS20 (Wikipedia)

IBM BladeCenter HS12:
• CPU: single-, dual-, or quad-core Xeon, 2~3 GHz
• Memory (max): 24 GB
• Internal storage (max): 293 GB

24 GB / 10k tenants = 2.4 MB per tenant



Alternative: Multi-Instance + Virtualization

[Figure: multi-instance deployment with virtualization]


Multi-Tenant Databases (MTD)

Consolidating multiple businesses onto the same operational system
– Consolidation factor depends on the size of the application and the host machine

Support for schema extensibility
– Essential for ERP applications

Support on top of the database layer
– Non-intrusive implementation
– Query transformation engine maps logical tenant-specific tables to physical tables
– Various problems, for example:
  – Uneven table utilization ("hot spots")
  – Metadata management when handling lots of tables



Classic Web Application (Basic Layout)

Pack multiple tenants into the same tables by adding a tenant id column

Great consolidation but no extensibility

Account
TenId | AcctId | Name | ...
17    | 1      | Acme |
17    | 2      | Gump |
35    | 1      | Ball |
42    | 1      | Big  |
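The basic layout can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS (an assumption, the slides do not prescribe one), and the helper name `accounts_for` is hypothetical. The rows are the four shown in the Account table above.

```python
import sqlite3

# Shared Account table in the basic layout: one TenId column separates tenants.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account ("
    "TenId INTEGER, AcctId INTEGER, Name TEXT, "
    "PRIMARY KEY (TenId, AcctId))"
)
conn.executemany(
    "INSERT INTO Account VALUES (?, ?, ?)",
    [(17, 1, "Acme"), (17, 2, "Gump"), (35, 1, "Ball"), (42, 1, "Big")],
)


def accounts_for(tenant_id):
    """Return one tenant's accounts; every query must filter on TenId."""
    cursor = conn.execute(
        "SELECT AcctId, Name FROM Account WHERE TenId = ? ORDER BY AcctId",
        (tenant_id,),
    )
    return cursor.fetchall()


rows_17 = accounts_for(17)
print(rows_17)
```

All tenants share one schema here, which is why a tenant cannot add its own columns in this layout.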


Private Table

Each tenant gets his/her own private schema
– No sharing
– SQL transformation: renaming only

High meta-data/data ratio
– Lots of tables: linear in # of tenants, each with its own meta-data
– Low buffer utilization (~8k for index/data for each table)

Account
TenId | AcctId | Name | ...
17    | 1      | Acme |
17    | 2      | Gump |
35    | 1      | Ball |
42    | 1      | Big  |
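A minimal sketch of the "renaming only" transformation follows. It assumes that a tenant's private table is named by appending the tenant id to the logical name (as in Account17 on later slides). The function `to_private_table` is hypothetical, and the regular-expression rewrite is a simplification: a real engine would rewrite the parsed query rather than its text.

```python
import re


def to_private_table(sql, tenant_id, logical_tables):
    """Rewrite logical table names to tenant-private names (renaming only)."""
    for name in logical_tables:
        # \b keeps column names such as AcctId from being touched.
        sql = re.sub(rf"\b{re.escape(name)}\b", f"{name}{tenant_id}", sql)
    return sql


rewritten = to_private_table(
    "SELECT Name FROM Account WHERE AcctId = 1", 17, ["Account"]
)
print(rewritten)
```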


CRM Schema

[Figure: CRM schema]


Handling Lots of Tables

Simplifying assumption: No extensibility

Experiment setup:
– CRM schema with 10 tables
– 10,000 tenants are packed onto one DBMS (DB2, 1 GB memory)
– Data set size remains constant

Parameter: Schema Variability
– Number of tenants per schema instance
– 0 (least variable): all tenants share one instance (= 10 tables)
– 1 (most variable): each tenant has a separate instance (= 100k tables)
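The two table counts follow from multiplying the number of schema instances by the 10 tables of the CRM schema. A short sketch of that arithmetic, with hypothetical names:

```python
TABLES_PER_INSTANCE = 10  # CRM schema size from the experiment setup
TENANTS = 10_000          # tenants packed onto one DBMS


def total_tables(schema_instances):
    """Each schema instance brings its own copy of the 10 CRM tables."""
    return schema_instances * TABLES_PER_INSTANCE


shared = total_tables(1)         # variability 0: one instance for all tenants
private = total_tables(TENANTS)  # variability 1: one instance per tenant
print(shared, private)
```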


Handling Lots of Tables – Results

[Figure: experiment results, ranging from 10 fully shared tables to 100,000 private tables]


Extension Table

Split off the extensions into separate tables
– Additional join at runtime
– Row column for reconstructing the row (discussion: consider Acct17)

Account17(A, N, H, B) =
    select Ae.A, Ae.N, Ha.H, Ha.B
    from Account_ext as Ae, Healthcare_account as Ha
    where Ae.Tenant = Ha.Tenant and Ae.Row = Ha.Row
      and Ae.Tenant = 17
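The view definition above can be run as a sketch with Python's sqlite3 module (a stand-in DBMS, not the one used in the slides). The Hospital and Beds values are illustrative sample data, not taken from the slides, and "Row" is quoted only because it is an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Common attributes (A, N) live in the shared table; tenant 17's extension
# attributes (H, B) live in a separate extension table.
conn.execute(
    'CREATE TABLE Account_ext (Tenant INTEGER, "Row" INTEGER, A INTEGER, N TEXT)'
)
conn.execute(
    'CREATE TABLE Healthcare_account '
    '(Tenant INTEGER, "Row" INTEGER, H TEXT, B INTEGER)'
)
conn.executemany(
    "INSERT INTO Account_ext VALUES (?, ?, ?, ?)",
    [(17, 0, 1, "Acme"), (17, 1, 2, "Gump"), (35, 0, 1, "Ball"), (42, 0, 1, "Big")],
)
conn.executemany(
    "INSERT INTO Healthcare_account VALUES (?, ?, ?, ?)",
    [(17, 0, "St. Mary", 135), (17, 1, "State", 1042)],  # illustrative values
)

ACCOUNT17 = """
    SELECT Ae.A, Ae.N, Ha.H, Ha.B
    FROM Account_ext AS Ae, Healthcare_account AS Ha
    WHERE Ae.Tenant = Ha.Tenant AND Ae."Row" = Ha."Row"
      AND Ae.Tenant = 17
    ORDER BY Ae."Row"
"""
account17 = conn.execute(ACCOUNT17).fetchall()
print(account17)
```

The Row column is what lets the join put each extension tuple back next to the base tuple it belongs to.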


Extension Table (Cont'd)

Good: Better consolidation than Private Table layout
– common attributes go to the same table

Bad: Number of tables still grows in proportion to number of tenants
– tenants in the same domain may often have varied schemas
– so there is a large # of *extension* tables (such as Healthcare_account)


Universal Table

Generic structure with VARCHAR value columns
– n-th column of a logical table is mapped to ColN in a universal table
– Extensibility: # of columns may expand as needed

Disadvantages
– Very wide rows, hence many NULL values
– Not type-safe, so casting is necessary
– No index support (note: an index on a generic column is not very meaningful)

[Figure: a logical table (private after renaming) and the universal table]
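The casting and NULL disadvantages can be seen in a small sketch, again with sqlite3 as a stand-in DBMS. The table ids (0, 1, 2), the four-column width, and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Every value column is text, whatever the logical type was.
conn.execute(
    'CREATE TABLE Universal (Tenant INTEGER, "Table" INTEGER, '
    "Col1 TEXT, Col2 TEXT, Col3 TEXT, Col4 TEXT)"
)
conn.executemany(
    "INSERT INTO Universal VALUES (?, ?, ?, ?, ?, ?)",
    [
        (17, 0, "1", "Acme", "St. Mary", "135"),
        (17, 0, "2", "Gump", "State", "1042"),
        (35, 1, "1", "Ball", None, None),  # narrower tenants leave NULLs
        (42, 2, "1", "Big", "65", None),
    ],
)

# Without a cast the comparison is lexicographic, so '135' > '1000' holds.
uncast = conn.execute(
    "SELECT Col2 FROM Universal "
    "WHERE Tenant = 17 AND Col4 > '1000' ORDER BY Col1"
).fetchall()

# With a cast the numeric meaning of the logical Beds column is restored.
cast = conn.execute(
    "SELECT Col2, CAST(Col4 AS INTEGER) FROM Universal "
    "WHERE Tenant = 17 AND CAST(Col4 AS INTEGER) > 1000"
).fetchall()

null_rows = conn.execute(
    "SELECT COUNT(*) FROM Universal WHERE Col4 IS NULL"
).fetchone()[0]
print(uncast, cast, null_rows)
```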


Pivot Table

1. Each field of a row in the logical table is given its own row.

2. Multiple pivot tables, one for each type (e.g., int, string)

[Figure: logical row 0 split into the Int and String pivot tables]


Reconstruct Tenant Table from Pivot Table

Account17(H, B) =
    select Ps.Str, Pi.Int
    from Pivot_str as Ps, Pivot_int as Pi
    where Ps.Tenant = Pi.Tenant and   -- align
          Ps.Table = Pi.Table and     -- align
          Ps.Row = Pi.Row and         -- align
          Ps.Col = 2 and
          Pi.Col = 3 and
          Ps.Tenant = 17 and
          Ps.Table = 0

(Only Hospital & Beds)


Reconstruct Tenant Table (cont'd)

Account17(A, N, H, B) =
    select Pi2.Int, Ps2.Str, Ps.Str, Pi.Int
    from Pivot_int as Pi2, Pivot_str as Ps2,
         Pivot_str as Ps, Pivot_int as Pi
    where Pi2.Tenant = Ps2.Tenant and Ps2.Tenant = Ps.Tenant and
          Ps.Tenant = Pi.Tenant and Pi2.Table = Ps2.Table and
          Ps2.Table = Ps.Table and Ps.Table = Pi.Table and
          Pi2.Row = Ps2.Row and Ps2.Row = Ps.Row and
          Ps.Row = Pi.Row and
          Pi2.Col = 0 and Ps2.Col = 1 and
          Ps.Col = 2 and Pi.Col = 3 and
          Ps.Tenant = 17 and Ps.Table = 0

The Tenant, Table, and Row conditions align the four pivot rows: same tenant, table, row.

4-way join
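Both reconstructions can be checked with a sketch in sqlite3 (a stand-in DBMS). The sample values and the mapping of A, N, H, B to Col 0 to 3 follow the queries above; the data itself is illustrative. "Table" and "Row" are quoted because they are SQL keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name, value_col in (("Pivot_int", "Int INTEGER"), ("Pivot_str", "Str TEXT")):
    conn.execute(
        f'CREATE TABLE {name} (Tenant INTEGER, "Table" INTEGER, '
        f'Col INTEGER, "Row" INTEGER, {value_col})'
    )
# One pivot row per field: (Tenant, Table, Col, Row, value).
conn.executemany(
    "INSERT INTO Pivot_int VALUES (?, ?, ?, ?, ?)",
    [(17, 0, 0, 0, 1), (17, 0, 0, 1, 2), (17, 0, 3, 0, 135), (17, 0, 3, 1, 1042),
     (35, 1, 0, 0, 1)],
)
conn.executemany(
    "INSERT INTO Pivot_str VALUES (?, ?, ?, ?, ?)",
    [(17, 0, 1, 0, "Acme"), (17, 0, 1, 1, "Gump"),
     (17, 0, 2, 0, "St. Mary"), (17, 0, 2, 1, "State"), (35, 1, 1, 0, "Ball")],
)

# Two columns need one aligning join.
hospital_beds = conn.execute("""
    SELECT Ps.Str, Pi.Int
    FROM Pivot_str AS Ps, Pivot_int AS Pi
    WHERE Ps.Tenant = Pi.Tenant AND Ps."Table" = Pi."Table" AND Ps."Row" = Pi."Row"
      AND Ps.Col = 2 AND Pi.Col = 3
      AND Ps.Tenant = 17 AND Ps."Table" = 0
    ORDER BY Ps."Row"
""").fetchall()

# Four columns need a 4-way join, one pivot-table reference per column.
full_rows = conn.execute("""
    SELECT Pi2.Int, Ps2.Str, Ps.Str, Pi.Int
    FROM Pivot_int AS Pi2, Pivot_str AS Ps2, Pivot_str AS Ps, Pivot_int AS Pi
    WHERE Pi2.Tenant = Ps2.Tenant AND Ps2.Tenant = Ps.Tenant AND Ps.Tenant = Pi.Tenant
      AND Pi2."Table" = Ps2."Table" AND Ps2."Table" = Ps."Table"
      AND Ps."Table" = Pi."Table"
      AND Pi2."Row" = Ps2."Row" AND Ps2."Row" = Ps."Row" AND Ps."Row" = Pi."Row"
      AND Pi2.Col = 0 AND Ps2.Col = 1 AND Ps.Col = 2 AND Pi.Col = 3
      AND Ps.Tenant = 17 AND Ps."Table" = 0
    ORDER BY Ps."Row"
""").fetchall()
print(hospital_beds, full_rows)
```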


Pivot Table: Performance

Generic type-safe structure
– Eliminates handling many NULL values

Performance
– Depends on the column selectivity of the query (number of reconstructing joins)
– E.g., query on A17(H,B) is more selective than A17(A,N,H,B)
– See previous example


Pivot Table and Chunk Table

Pivot table: one row for each field (row 0 becomes Field 0, Field 1, Field 2, Field 3)

Chunk table: one row for each chunk (row 0 becomes Chunk 0, Chunk 1)


How to Chunk or Fragment Original Rows

Many possible fragmentations

One idea: group fields by their "popularity"


Chunk Table

Account17(A, N, H, B) =
    select Cis.Int1, Cis.Str1,
           Cis2.Str1, Cis2.Int1
    from Chunk as Cis, Chunk as Cis2
    where Cis.Chunk = 0 and
          Cis2.Chunk = 1 and
          Cis.Row = Cis2.Row and
          Cis.Tenant = Cis2.Tenant and
          Cis.Table = Cis2.Table and
          Cis.Tenant = 17 and
          Cis.Table = 0

Steps: 1. align rows; 2. merge chunks; 3. reconstruct original row

2-way join
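A sketch of the same reconstruction in sqlite3 (a stand-in DBMS) follows. It assumes a single chunk table with one integer and one string column per chunk, as the Int1 and Str1 names in the query suggest. The sample values and the rows for tenants 35 and 42 are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Chunk (Tenant INTEGER, "Table" INTEGER, Chunk INTEGER, '
    '"Row" INTEGER, Int1 INTEGER, Str1 TEXT)'
)
# One physical row per chunk: chunk 0 holds (A, N), chunk 1 holds (B, H).
conn.executemany(
    "INSERT INTO Chunk VALUES (?, ?, ?, ?, ?, ?)",
    [
        (17, 0, 0, 0, 1, "Acme"),
        (17, 0, 1, 0, 135, "St. Mary"),
        (17, 0, 0, 1, 2, "Gump"),
        (17, 0, 1, 1, 1042, "State"),
        (35, 1, 0, 0, 1, "Ball"),
        (42, 2, 0, 0, 1, "Big"),
        (42, 2, 1, 0, 65, None),
    ],
)

account17 = conn.execute("""
    SELECT Cis.Int1, Cis.Str1, Cis2.Str1, Cis2.Int1
    FROM Chunk AS Cis, Chunk AS Cis2
    WHERE Cis.Chunk = 0 AND Cis2.Chunk = 1
      AND Cis."Row" = Cis2."Row"
      AND Cis.Tenant = Cis2.Tenant AND Cis."Table" = Cis2."Table"
      AND Cis.Tenant = 17 AND Cis."Table" = 0
    ORDER BY Cis."Row"
""").fetchall()
print(account17)
```

Four logical columns are rebuilt with a single self-join here, compared with the 4-way join that the pivot layout needs for the same row.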


Chunk Table vs. Universal Table

Chunk table (Chunk 0, Chunk 1): # of rows (for each original row) = # of chunks

Universal table (Chunk 0 only): only one row for each original row

Universal as extreme chunking: only one chunk per row


Chunk Table Performance

Fewer joins for reconstruction

Indexable

Reduced meta-data/data ratio, dependent on chunk size


Chunk Folding: Alternative Chunking

Mixes Extension and Chunk Tables

Optimal row fragmentation depends on, e.g.
– Workload
– Data distribution
– Data popularity

[Figure: a base table, with the extensions chunked]


Querying Chunk Tables

Query Transformation
– Row reconstruction needs many self- and equi-joins
– Can be automatically translated

Compilation Scheme:
1. Collect all table names and their corresponding columns from the logical source query
2. Obtain table definitions: for each table
   a. obtain the Chunk Tables and the meta-data identifiers representing the used columns
   b. generate a query that filters the correct columns (based on the meta-data identifiers) and aligns the different chunk relations on their ROW column
3. Extend each table reference in the logical source query by the corresponding table definition (obtained in step 2)
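The compilation scheme can be sketched as follows. This is an illustration under stated assumptions, not the transformation engine from the slides: the `META` dictionary, the functions `table_definition` and `transform`, the sample data, and the logical queries are all hypothetical. Step 1 is done by hand (the used columns are passed in) and step 3 is a plain string replacement where a real engine would work on a parsed query. SQLite is used only to check that the generated SQL runs.

```python
import sqlite3

# Hypothetical meta-data: logical column -> (chunk id, physical column).
META = {
    "Account17": {
        "tenant": 17,
        "table": 0,
        "columns": {
            "Aid": (0, "Int1"),
            "Name": (0, "Str1"),
            "Hospital": (1, "Str1"),
            "Beds": (1, "Int1"),
        },
    }
}


def table_definition(table, used_columns):
    """Step 2: filter the used columns and align the chunks on Row."""
    meta = META[table]
    chunks = sorted({meta["columns"][col][0] for col in used_columns})
    first = f"c{chunks[0]}"
    select_items = []
    for col in used_columns:
        chunk, physical = meta["columns"][col]
        select_items.append(f"c{chunk}.{physical} AS {col}")
    where = [
        f"{first}.Tenant = {meta['tenant']}",
        f'{first}."Table" = {meta["table"]}',
    ]
    for chunk in chunks:
        alias = f"c{chunk}"
        where.append(f"{alias}.Chunk = {chunk}")
        if alias != first:
            for key in ("Tenant", '"Table"', '"Row"'):
                where.append(f"{alias}.{key} = {first}.{key}")
    sources = ", ".join(f"Chunk AS c{chunk}" for chunk in chunks)
    columns = ", ".join(select_items)
    condition = " AND ".join(where)
    return f"(SELECT {columns} FROM {sources} WHERE {condition})"


def transform(logical_sql, table, used_columns):
    """Step 3: extend the table reference by its table definition."""
    definition = table_definition(table, used_columns)
    return logical_sql.replace(f"FROM {table}", f"FROM {definition} AS {table}")


conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Chunk (Tenant INTEGER, "Table" INTEGER, Chunk INTEGER, '
    '"Row" INTEGER, Int1 INTEGER, Str1 TEXT)'
)
conn.executemany(
    "INSERT INTO Chunk VALUES (?, ?, ?, ?, ?, ?)",
    [(17, 0, 0, 0, 1, "Acme"), (17, 0, 1, 0, 135, "St. Mary"),
     (17, 0, 0, 1, 2, "Gump"), (17, 0, 1, 1, 1042, "State"),
     (35, 1, 0, 0, 1, "Ball")],
)

# Step 1 by hand: tables = {Account17}, columns = {Beds, Hospital}.
physical = transform(
    "SELECT Beds FROM Account17 WHERE Hospital = 'State'",
    "Account17",
    ["Beds", "Hospital"],
)
beds = conn.execute(physical).fetchall()
print(physical)
print(beds)
```

Because both used columns sit in chunk 1, the generated table definition needs no join at all. A query that touches columns from both chunks gets the aligning self-join added automatically.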


Query Example: Step 1

Step 1 result:

tables = {Account17}

columns = {Account17.Beds, Account17.Hospital}


Query Example: Step 2 (Pivot Table)

[Figure: table definition over the Int and String pivot tables]


Query Example: Step 3 (Chunk Table)

[Figure: table definition over Chunk 0 and Chunk 1]


Summary

A multi-tenancy database is critical to scaling a SaaS solution

Varied schema layout schemes
– different degrees of consolidation & extensibility
– optimal layout depends on the particular data set, workload, etc.

Novel Chunk Table layout
– wider chunks tend to achieve better performance
– improvement slows down at some point (e.g., |chunk| = 15)
– response time using chunking may approach that of conventional tables

Need more work on finding a good layout with multiple factors

