8/8/2012
Sebastian Meine
SQL Stylist with sqlity.net
Column Store Internals
Outline
Column Store
Storage
Aggregates
Batch Processing
History
First mention of idea to cluster column groups into separate files:
• [J. A. Hoffer, D. G. Severance], 1975
First suggestion of fully decomposed storage:
• [G. P. Copeland and S. Khoshafian], 1985
First commercial columnar database:
• Sybase IQ, 1996
First general-purpose DBMS to fully integrate columnar storage and processing:
• SQL Server 2012, 2012
Source: [Larson et al]
HoBT
[Diagram: a HoBT (heap or B-tree) stores complete rows on data pages. Each data page consists of a page header, the rows (row header H followed by the values of columns A, B and C) and a row offset array; 5 rows per page in this example.]
• Data Page 1: H A[1] B[1] C[1] ... H A[5] B[5] C[5]
• Data Page 2: H A[6] B[6] C[6] ... H A[10] B[10] C[10]
• Data Page 3: H A[11] B[11] C[11] ... H A[15] B[15] C[15]
• ...
• Last Data Page: H A[96] B[96] C[96] ... H A[100] B[100] C[100]
Column Store
[Diagram: each column is stored on its own pages. Every page consists of a page header followed by values of a single column; 28 values per page in this example.]
Column A
• Page 1: A[1] ... A[28]
• Page 2: A[29] ... A[56]
• Page 3: A[57] ... A[84]
• Page 4: A[85] ... A[100]
Column B
• Page 1: B[1] ... B[28]
• Page 2: B[29] ... B[56]
• Page 3: B[57] ... B[84]
• Page 4: B[85] ... B[100]
Column C
• Page 1: C[1] ... C[28]
• Page 2: C[29] ... C[56]
• Page 3: C[57] ... C[84]
• Page 4: C[85] ... C[100]
Forms of Storage
NSM
• N-ary Storage Model
DSM
• Decomposition Storage Model
PAX
• Partition Attributes Across (Ailamaki & DeWitt, 2001)
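The difference between NSM and DSM can be sketched in a few lines of Python. This is only an illustration under assumptions: the three-column table, its values and all function names are made up for the example, and plain Python lists stand in for pages on disk. PAX is not shown.

```python
# Hypothetical three-column table with columns A, B and C.
rows = [
    (1, "red", 10.0),
    (2, "blue", 20.0),
    (3, "red", 30.0),
]

# NSM (row store): each record is kept together.
nsm = list(rows)

# DSM (column store): one array per column; position i belongs to row i.
dsm = {
    "A": [r[0] for r in rows],
    "B": [r[1] for r in rows],
    "C": [r[2] for r in rows],
}


def sum_c_nsm(table):
    # Walks over every complete row although only column C is needed.
    return sum(row[2] for row in table)


def sum_c_dsm(table):
    # Reads only the array that holds column C.
    return sum(table["C"])


def row_from_dsm(table, i):
    # Reassembling a full row needs one lookup per column.
    return (table["A"][i], table["B"][i], table["C"][i])
```

Aggregating a single column favors the DSM layout, while fetching a complete row favors NSM, which is the trade-off the storage models above are about.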
xVelocity
xVelocity In-Memory Analytics Engine
• SQL Server Analysis Services
• PowerPivot
xVelocity Memory-Optimized Columnstore Index
• SQL Server Database Engine
xVelocity Memory-Optimized Columnstore Index
Not an “in-memory” construct
Columns stored independently
Uses VertiPaq™ compression
Requires Enterprise Edition
Not an Index
No Order
No Key
Not a bitmap index
Segment
~ 1 million rows
• Aligned between columns
• Base Table Order preserved
Each column
• Stored in one continuous BLOB
• Independently compressed
Compression
VertiPaq™
Proprietary
Not Documented
Several Algorithms
Dictionary Encoding
Huffman Encoding
Run Length Encoding
Lempel-Ziv-Welch
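Since VertiPaq itself is not documented, the following is only a textbook sketch of two of the techniques named above, dictionary encoding followed by run-length encoding. It makes no claim about how SQL Server combines or implements them; the sample column and all function names are assumptions for illustration.

```python
def dictionary_encode(values):
    """Replace each value with a small integer id; return (dictionary, ids)."""
    dictionary = []   # id -> value
    index = {}        # value -> id
    ids = []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        ids.append(index[v])
    return dictionary, ids


def run_length_encode(ids):
    """Collapse consecutive repeats into [id, run_length] pairs."""
    runs = []
    for x in ids:
        if runs and runs[-1][0] == x:
            runs[-1][1] += 1
        else:
            runs.append([x, 1])
    return runs


def decode(dictionary, runs):
    """Invert both steps and return the original values."""
    out = []
    for x, n in runs:
        out.extend([dictionary[x]] * n)
    return out


# Hypothetical column with few distinct values and long runs.
column = ["DE", "DE", "DE", "US", "US", "DE", "FR", "FR"]
dictionary, ids = dictionary_encode(column)
runs = run_length_encode(ids)
```

Both steps pay off most when a column has few distinct values and equal values sit next to each other, which is why sort order affects the compression ratio.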
Partitioning
Fully supported
Must be aligned to base table
• Must include partition column
Allows for trickle load
Redirect: BLOBs
Separate Allocation Unit
Pages ∞ ↔ ∞ Values
Modified B+Tree per Value
Demo
Structure
Columns
Segments
BLOBs
Demo
Dictionaries
Stored in Separate BLOB
Primary
• Per Partition and Column
Secondary or Shared
• Per multiple Segments
Not shared between Columns or Partitions
Creation
Memory requirements
• (4.2 × #Cols + 68) × DOP + 34 × #StringCols MB
• Might cause Msg 701, 802, 8657 or 8658
Rows per segment
• N threads -> N smaller segments
• Parallelism only for > 10^6 rows
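To get a feel for the memory formula on this slide, it can be evaluated for a concrete table. The helper below is just the formula written out in Python; the function name, the parameter names and the example table (20 columns, 5 of them strings) are assumptions for illustration.

```python
def columnstore_build_memory_mb(num_cols, num_string_cols, dop):
    """Estimated memory grant in MB for building a columnstore index.

    Implements (4.2 * #Cols + 68) * DOP + 34 * #StringCols from the slide.
    """
    return (4.2 * num_cols + 68) * dop + 34 * num_string_cols


# Hypothetical table: 20 columns, 5 of them string columns.
serial_mb = columnstore_build_memory_mb(20, 5, dop=1)
parallel_mb = columnstore_build_memory_mb(20, 5, dop=8)
```

For this example the estimate is about 322 MB at DOP 1 and about 1,386 MB at DOP 8, so lowering the degree of parallelism is one way to react to the memory-related messages listed above.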
Cache
New cache design
Ensures contiguous storage of segments in memory
Cached on a segment basis
Can handle free memory < index size
RBAR
• Row By Agonizing Row (Jeff Moden, 2007)
Relational Operator (RelOp)
Clustered Index Scan
Iterator
• open
• getRow
• close
©2011 sqlity.net llc, all rights reserved.
Relational Operator (RelOp)
Columnstore Index Scan
Iterator
• open
• getRow
• getBatch
• close
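The difference between getRow and getBatch can be illustrated with a toy scan operator. This is a minimal sketch, not SQL Server's actual operator interface: the class name, the snake_case method names and the default batch size of 1000 are assumptions chosen for the example.

```python
class ColumnstoreScan:
    """Toy scan operator over column arrays with row and batch access."""

    def __init__(self, columns, batch_size=1000):
        self.columns = columns        # dict: column name -> list of values
        self.batch_size = batch_size
        self.pos = None

    def open(self):
        self.pos = 0
        self.n = len(next(iter(self.columns.values())))

    def get_row(self):
        """Row mode: return one row as a tuple, or None at the end."""
        if self.pos >= self.n:
            return None
        row = tuple(col[self.pos] for col in self.columns.values())
        self.pos += 1
        return row

    def get_batch(self):
        """Batch mode: return up to batch_size rows as one vector per column."""
        if self.pos >= self.n:
            return None
        end = min(self.pos + self.batch_size, self.n)
        batch = {name: col[self.pos:end] for name, col in self.columns.items()}
        self.pos = end
        return batch

    def close(self):
        self.pos = None


# Hypothetical data: 2500 rows in two columns.
data = {"A": list(range(2500)), "B": [x * 2 for x in range(2500)]}
scan = ColumnstoreScan(data)

scan.open()
row_calls = 0
while scan.get_row() is not None:
    row_calls += 1
scan.close()

scan.open()
batch_calls = 0
total_b = 0
while (batch := scan.get_batch()) is not None:
    batch_calls += 1
    total_b += sum(batch["B"])   # work on a whole column vector at once
scan.close()
```

Scanning the 2500 rows takes 2500 calls in row mode but only 3 calls in batch mode, so the per-call overhead is spread over roughly a thousand rows.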
Batch Processing
Only on Columnstore Data
~1000 Rows
Independent Column-Vectors
Data stays Compressed
Never Serial
Batch-Advantage
Loop unrolling
Memory prefetching
Branch prediction
Reduced cache misses
Reduced TLB misses
Batchables
Scan
Filter
Inner hash join
Batch hash table build
Local hash (partial) aggregation
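As a rough idea of what local (partial) hash aggregation means, the sketch below aggregates each batch into its own small hash table and merges the partial results afterwards. It assumes the usual two-step scheme of partial aggregation followed by a combine step; the function names and sample batches are made up, and nothing here reflects SQL Server's internal implementation.

```python
def partial_aggregate(keys, amounts):
    """Aggregate one batch into a small local hash table (a dict)."""
    local = {}
    for key, amount in zip(keys, amounts):
        local[key] = local.get(key, 0) + amount
    return local


def merge_partials(partials):
    """Combine the local results into the final per-key totals."""
    final = {}
    for local in partials:
        for key, subtotal in local.items():
            final[key] = final.get(key, 0) + subtotal
    return final


# Two hypothetical batches, each given as (group keys, amounts).
batches = [
    (["a", "b", "a"], [1, 2, 3]),
    (["b", "c"], [4, 5]),
]
partials = [partial_aggregate(keys, amounts) for keys, amounts in batches]
totals = merge_partials(partials)
```

The merge step only sees one entry per key and batch, which is much less data than the original rows when there are few distinct groups.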
Demo
Apollo
xVelocity Memory-Optimized Columnstore Index
Vector-based query execution
Enterprise Edition Only
Segment Elimination
Column-Segment stores
• Actual Min Value
• Actual Max Value
Filter out entire Segments
• Column Filter
• Bitmap Filter
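Segment elimination with a column filter can be sketched as follows: every segment keeps its minimum and maximum value, and a range predicate skips all segments whose range cannot contain a match. This is a simplified illustration under assumptions; the `Segment` class, the function names and the tiny segment size of 25 rows are invented for the example, and bitmap filters are not covered.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    values: list
    min_value: int
    max_value: int


def build_segments(values, rows_per_segment):
    """Split a column into segments and record min/max per segment."""
    segments = []
    for start in range(0, len(values), rows_per_segment):
        chunk = values[start:start + rows_per_segment]
        segments.append(Segment(chunk, min(chunk), max(chunk)))
    return segments


def scan_range(segments, low, high):
    """Return (matching values, number of segments actually read)."""
    matches = []
    segments_read = 0
    for seg in segments:
        if seg.max_value < low or seg.min_value > high:
            continue  # eliminated using the metadata only
        segments_read += 1
        matches.extend(v for v in seg.values if low <= v <= high)
    return matches, segments_read


# Hypothetical sorted column of 100 values in 4 segments of 25 rows.
segments = build_segments(list(range(100)), rows_per_segment=25)
matches, segments_read = scan_range(segments, 30, 40)
```

With sorted data only one of the four segments has to be read for this predicate. On unsorted data the min/max ranges overlap and far fewer segments can be skipped.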
Demo
Limitations
No updates (Partition switching possible)
Cannot be a clustered index
Restricted set of batch mode operators
Restricted join operations
No filtered columnstore index
No computed columns
Not supported on [indexed] views
Only one per table
Max 1024 columns
Cannot include sparse columns
Cannot enforce primary key or unique constraint
Cannot be "ALTER INDEX"ed
Cannot "INCLUDE" columns
No sort order
No seek!
No page or row compression, no vardecimal data format
No replication, change tracking, CDC (because read only?)
Only 21 of 36 Data Types
Situation will be improved in future versions
Best Practices: DOs
Include all columns
Put CS index on large tables (Fact & Dim)
Prefer small data types
Favor star-joins, aggregations and grouping
Best Practices: DON’Ts
Large (mostly) unique string value columns
Avoid filters and joins on string columns
Avoid OUTER JOIN and NOT IN
UNION ALL of table with and table without columnstore
Literature
• [Larson et al] Columnar Storage in SQL Server 2012 (2012, IEEE)
  • Per-Ake Larson, Eric N. Hanson, Susan L. Price
• [Abadi et al] Column-Stores vs. Row-Stores: How different are they really? (2008, SIGMOD)
  • Daniel J. Abadi, Samuel R. Madden, Nabil Hachem
• SQL Server Columnstore Index FAQ (microsoft.com)
• SQL Server Columnstore Performance Tuning (microsoft.com)
• [Campbell] The coming in-memory database tipping point (2012, blogs.technet.com)
  • David Campbell
• Perform Scalar Aggregates and still get the Benefit of Batch Processing (microsoft.com)
• Work Around Performance Issues for Columnstores Related to Strings (microsoft.com)
• Ensuring Your Data is Sorted or Nearly Sorted by Date to Benefit from Date Range Elimination (microsoft.com)
• Columnstore Indexes (msdn.microsoft.com)
• [Rusanu] Inside the SQL Server 2012 Columnstore Index (2012, rusanu.com)
  • Rusanu Consulting llc
• Multi-Dimensional Clustering to Maximize the Benefit of Segment Elimination (microsoft.com)
References
Sebastian Meine
SQL Stylist with sqlity.net
Session Materials: http://goo.gl/XnmRg