© 2015 IBM Corporation
In-Memory BLU Acceleration in IBM’s DB2 and dashDB:
Optimized for Modern Workloads and Hardware Architectures
Guy Lohman
Research Manager, Disruptive Information Management Architectures
IBM Research – Almaden
14 April 2015
“In-Memory” BLU Acceleration: Agenda
1. Who cares about in-memory?
   a. In-memory is too expensive!
   b. In-memory is too limiting!
   c. In-memory is too slow for BLU!
2. What is BLU Acceleration?
3. The cloud is what is important!
4. Conclusions
Moore’s Law has snookered us!
Source: http://www.jcmit.com/mem2013.htm
Main Memory
So, we conclude, …
Memory is:
– Unlimited
– Free
Ergo: “It all fits!”
Ergo: It must all fit!
Right?
WRONG!!!
In-Memory is Too Expensive!
Economics + Greed: there will always be a “memory hierarchy”
– Yes, DRAM is getting cheaper
  • Moore’s Law has not (yet) been repealed!
– BUT our appetite for data is growing even faster
  • The death of update-in-place (time travel)
  • “Big Data” Analytics craves large volumes of data
– Why pay for DRAM for cold columns?
Some (cold) data will spill to disk
  • Infrequently-referenced columns
  • Infrequently-referenced rows
We’ve just moved up one level in our focus…
Focus of Memory Hierarchy Has Shifted Up 1 Level
Before: DRAM, DISK, TAPE
Now: CACHE, DRAM, DISK
“Disk is the new tape; Memory is the new disk.”
-- Jim Gray
In-Memory is Too Limiting!
DBA must choose which tables can fit in-memory
– Adds database design complexity for DBA
– Workloads and tables referenced change over time
Base tables aren’t the whole story! Must also include:
– Indexes
– Temporary tables
– Materialized views
– Query working space for each user (typically 1000s):
  • Hash tables for joins, GROUP BY
  • Space for sorts
  • …and much more!
Have to persist anyway! (DRAM is still volatile)
In-Memory is Too Slow!
CPU cache is many times faster than DRAM
BLU’s run-time is carefully designed to:
– Operate on compressed values, bit-aligned as vectors
– Auto-detect HW cache sizes
– Adapt algorithms to them:
  • Partition data into cache-sized blocks
  • Exploit L2 & L3 caches
  • Minimize cache-line misses (to DRAM)
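The idea of partitioning data into cache-sized blocks can be illustrated with a minimal Python sketch. It is not BLU's implementation: the cache size constant, the `fill_fraction` parameter, and both function names are assumptions made up for this example, and a real run-time would detect the cache size from the hardware.

```python
from typing import Iterator, Sequence

# Hypothetical value for illustration; a real run-time would auto-detect it.
ASSUMED_L2_CACHE_BYTES = 256 * 1024


def cache_sized_blocks(values: Sequence[int], bytes_per_value: int,
                       cache_bytes: int = ASSUMED_L2_CACHE_BYTES,
                       fill_fraction: float = 0.5) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of `values` sized to fit a fraction of the cache."""
    budget = int(cache_bytes * fill_fraction)
    rows_per_block = max(1, budget // bytes_per_value)
    for start in range(0, len(values), rows_per_block):
        yield values[start:start + rows_per_block]


def blocked_sum(values: Sequence[int], bytes_per_value: int = 4,
                cache_bytes: int = ASSUMED_L2_CACHE_BYTES) -> int:
    """Aggregate one block at a time, so each block is finished before the next is touched."""
    total = 0
    for block in cache_sized_blocks(values, bytes_per_value, cache_bytes):
        total += sum(block)
    return total
```

Leaving part of the cache unused (`fill_fraction`) is a design choice of the sketch: it leaves room for whatever working state an operator keeps next to the block.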
What is BLU Acceleration?
New technology for accelerating BI queries
• 2nd generation of Almaden’s Blink Research technology
• Columnar database within DB2 for Linux, UNIX, & Windows
• Run-time that is optimized to exploit modern hardware:
  − Multi-core for data parallelism
  − Cache and large main memories
• Operates on compressed, bit-aligned data vectors
Order-of-magnitude benefits
1. Performance
2. Storage savings
3. Simplicity and Time to Value!
Deeply integrated within DB2 10.5
• New columnar page format & run-time
• Memory-optimized (not limited to “in-memory”)
• Exploits DB2 full functionality, utilities, & tools
“Revolution via Evolution”
• Easy conversion of row tables to BLU (columnar) tables
• BLU tables can co-exist with traditional row tables
  − In same query, schema, storage, & memory
• Query any combination of BLU or row data
• No need to change applications or SQL queries
• DB2 run-time compensates for any missing functionality in BLU
[Figure: DB2 LUW with BLU. Components shown: DB2 compiler; run-time (DB2 classic run-time, BLU run-time); DB2 classic bufferpool; storage (BLU encoded columnar tables and DB2 classic row tables, columns C1 to C8)]
BLU Acceleration is Memory-Optimized Analytics to Accelerate Your Applications…
…and Improve Your Productivity!
…and is now in the Cloud, too!
Buzzwords: Memory-Optimized ≠ In-Memory
Business Value 1: PERFORMANCE!
Workload Speedup on Terabyte Class Data
[Figure: bar chart of relative performance, DB2 10.5 with BLU Acceleration vs. DB2 10.1. Workloads: Intel, Large European ISV, Wall Street, Cognos Dynamic Cubes. Data sizes: 4 TB, 1 TB, 1 TB, 1 TB. Speedup labels: 133x faster, 44x faster, 25x faster, 18x faster]
“It was amazing to see the faster query times compared to the performance results with our row-organized tables. The performance of four of our queries improved by over 100-fold! The best outcome was a query that finished 137x faster by using BLU Acceleration.”
- Kent Collins, Database Solutions Architect, BNSF Railway
Business Value 2: Storage Savings!
~2x-3x storage reduction vs. DB2 10.1 (comparing all objects: tables, indexes, etc.)
– Patented columnar compression techniques
– Fewer storage objects (indexes, materialized views) required
Lab tests - YMMV
Business Value 3: SIMPLICITY and Time to Value!
Create, LOAD, and then… Run Queries!
– Significantly reduced or no need to tune
  • No indexes (other than primary keys and foreign keys)
  • No storage reclaim (it’s automated)
  • No memory configuration (it’s automated)
  • No process model configuration (it’s automated)
  • No statistics collection (it’s automated)
  • No materialized views
  • No statistical views
  • No optimizer profiles or hints
BLU Acceleration automatically adapts to:
– Any size RAM
– Any number of CPUs and cores
– Any number of disks or SSDs
“The BLU Acceleration technology has some obvious benefits: … But it’s when I think about all the things I don't have to do with BLU, it made me appreciate the technology even more: no tuning, no partitioning, no indexes, no aggregates.”
- Andrew Juarez, Lead SAP Basis and DBA
What About Transactions?
BLU tables may be updated with UPDATE, DELETE, and INSERT commands
Changes made directly to BLU (column-organized) tables
– No row-organized staging tables, unlike SAP HANA and SQL Server
Multi-versioning – no in-place updates!
Maintains DB2’s usual Isolation, Concurrency Control, and Durability
– Fully logged, so recoverable
– Supported:
  • Isolation Levels: CS + CC, UR
  • Searched UPDATE & DELETE
  • INSERT from VALUES, INSERT from sub-select, MERGE
– Not supported:
  • Positioned update & delete (by cursor), select-from-UDI, update & delete of UNION views
Insert speed on par with row-organized tables
– Sometimes faster, because there are far fewer (or no) indexes
– Best performance for large transactions, to amortize logging overheads
  • INSERTing or UPDATEing 100s or 1000s of rows, or more
  • DELETEing, if the clustering of pages matches that of the DELETE (e.g., time)
New values compressed with page-level dictionaries, if beneficial
– In addition to (on top of) column-level dictionary
The cloud is what’s important!
Introducing IBM dashDB!
Fully managed service in the cloud using IBM Bluemix
JSON ready!
– Tightly integrated with Cloudant, providing analytics on JSON data
– Or, import data from Excel or CSV files
In-database analytics
– Statistical analysis with R
– Spatial analytics with Esri
• In under an hour, anyone can access awesome data warehousing and BI using BLU Acceleration
• No infrastructure or IT resources required
• Visit: http://dashDB.com
dashDB – Use RStudio for Predictive Analytics
Conclusions
In-memory is:
– Too expensive!
– Too limiting!
– Too slow!
DB2’s BLU Acceleration columnar run-time:
– Exploits cache, large memories, & multi-core parallelism
– Provides
  • >10x faster BI querying… and transactions, too!
  • 10x storage savings
  • Simpler, much less tuning:
    – No secondary indexes or MVs to choose
    – Automated stats collection, WLM, etc.
    – More predictable and reliable performance
    – Adapts automatically to your hardware
Now available in the cloud as IBM dashDB: DB2 BLU + Cloudant + R
For more details on BLU:
– V. Raman et al., “DB2 with BLU Acceleration: So Much More than Just a Column Store”, VLDB 2013
Thank You (English), Merci (French), Grazie (Italian), Obrigado (Brazilian Portuguese), Danke (German), Gracias (Spanish)
BACKUP
“Related Work”
[Figure: timeline, 1995 to 2015, of related systems: Sybase IQ; P*TIME; MaxDB; TREX; HANA; Ingres; C-Store; Blink; DB2 BLU; IWA; ISAO; MonetDB (CWI); X100; Vectorwise; Data Distilleries; SPSS]
Frequency (Dictionary) Compression
NOTE: Within each partition, dictionary codes are:
– Fixed in length!
– Order-preserving!
Example table: Sales (Volume, Product, Origin)
[Figure: histogram of number of occurrences on Origin. Common values: China, USA; then GER, FRA, …; rare values: rest. The Vol, Prod, and Origin columns are split into column partitions]
Dictionary for Origin:
Partition 1 (1 bit): 0 = CN, 1 = US
Partition 2 (3 bits): 000 = BR, 001 = FR, 010 = GE, 011 = IN, …, 111 = UK
Partition 3 (8 bits): 00000000 = AU, 00000001 = CA, …
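The partitioned dictionary above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not BLU's encoder: the function names are hypothetical, the code widths (1, 3, 8 bits) are copied from the slide's example, and values are assigned to partitions purely by descending frequency.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def build_frequency_dictionary(values: Sequence[str],
                               code_bits: Sequence[int] = (1, 3, 8)
                               ) -> Dict[str, Tuple[int, int, int]]:
    """Map each distinct value to (partition number, code, code width in bits).

    The most frequent values land in the partition with the shortest codes.
    Inside a partition the members are sorted before codes are assigned, so
    codes are fixed-length and order-preserving within that partition.
    """
    counts = Counter(values)
    by_frequency = [v for v, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
    dictionary: Dict[str, Tuple[int, int, int]] = {}
    start = 0
    for partition, bits in enumerate(code_bits, start=1):
        capacity = 1 << bits
        members = sorted(by_frequency[start:start + capacity])
        for code, value in enumerate(members):
            dictionary[value] = (partition, code, bits)
        start += capacity
    if start < len(by_frequency):
        raise ValueError("not enough code space for all distinct values")
    return dictionary


def format_code(entry: Tuple[int, int, int]) -> str:
    """Render a code as a zero-padded bit string of its partition's width."""
    _, code, bits = entry
    return f"{code:0{bits}b}"
```

With a skewed Origin column such as 50 rows of CN, 40 of US, and a handful of others, CN and US receive the two 1-bit codes and the remaining values share 3-bit codes.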
7 Big Ideas, #2: Operate on Compressed Values
Frequency compression (approximate Huffman encoding) exploits skew
– The more frequent the value, the fewer bits it is encoded with
– For example, typically a few populous states may dominate the number of sales
  • New York and California may be encoded with only 1 or 2 bits
  • Alaska and Rhode Island may be encoded in 6 bits
Perform SQL Operations on the Encoded Data!
– Apply predicates (=, <, >, >=, <=, <>, BETWEEN, IN, etc.)
– Perform joins & grouping
Encoded data is smaller, uses less machine resources
– Encoded values packed together densely in register-width chunks
– Fewer I/Os, better memory & cache utilization, fewer CPU cycles to process
[Figure: conceptual compression dictionary. STATE values (New York, California, Illinois, Michigan, Alaska, Rhode Island, Florida) and their encodings, packed into a register-length word]
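Why an order-preserving dictionary allows predicates to run on encoded data can be shown with a small Python sketch. Everything here is an illustrative assumption (a single sorted dictionary, 64-bit words, hypothetical function names). A production run-time would presumably compare several packed codes per instruction; the sketch unpacks them one at a time for clarity.

```python
from bisect import bisect_right
from typing import List, Sequence


def build_order_preserving_dictionary(values: Sequence[str]) -> List[str]:
    """Sorted distinct values; a value's code is its index in this list."""
    return sorted(set(values))


def encode(values: Sequence[str], dictionary: Sequence[str]) -> List[int]:
    code_of = {value: code for code, value in enumerate(dictionary)}
    return [code_of[value] for value in values]


def pack(codes: Sequence[int], bits: int, word_bits: int = 64) -> List[int]:
    """Pack fixed-width codes densely into integers of `word_bits` bits."""
    per_word = word_bits // bits
    words = []
    for start in range(0, len(codes), per_word):
        word = 0
        for slot, code in enumerate(codes[start:start + per_word]):
            word |= code << (slot * bits)
        words.append(word)
    return words


def unpack(words: Sequence[int], bits: int, count: int, word_bits: int = 64) -> List[int]:
    """Recover the first `count` codes from the packed words."""
    per_word = word_bits // bits
    mask = (1 << bits) - 1
    codes: List[int] = []
    for word in words:
        for slot in range(per_word):
            if len(codes) == count:
                return codes
            codes.append((word >> (slot * bits)) & mask)
    return codes


def select_less_equal(words: Sequence[int], bits: int, count: int,
                      dictionary: Sequence[str], literal: str) -> List[int]:
    """Row positions where value <= literal, decided by comparing codes only."""
    # Translate the literal once: every code below `bound` satisfies the predicate.
    bound = bisect_right(dictionary, literal)
    return [pos for pos, code in enumerate(unpack(words, bits, count)) if code < bound]
```

The literal is translated to a code bound once per query, after which no row value is ever decoded.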
7 Big Ideas, #4: Core- & Cache-Friendly Parallelism
BLU’s legacy: main-memory DBMS
BLU’s run-time was built from the ground up to automatically:
– Exploit multi-core parallelism within queries
– Minimize sharing of common data structures, to minimize latching
– Pay careful attention to physical attributes of the server, e.g. cache sizes
Maximize CPU cache hit rate & cache-line efficiency
[Figure: two cores, each with its own cache. Panel labels: cache-line “ping-pong” (core 0 working on blue data, core 1 working on green data); minimal traffic]
Joins
[Figure: build phase, on the dimension table(s). Threads 1 and 2 each: scan & apply local predicates; load join column(s), re-encode, & build join filter; load payloads; partition into P1 to P4. Threads A to D then build compacted hash tables HT 1 to HT 4]
BLU supports all
– SQL join types (inner, LEFT OUTER, RIGHT OUTER, anti-, …)
– Data types (VARCHARs, trailing blanks, …)
No assumption that anything fits in memory, including inners
– Partition to fit in L3 cache, if memory-resident
– Else first partition to fit in memory
Novel compacted hash table for cache-mostly processing
[Figure: probe phase, on the fact table. Scan & apply local predicates; load join columns FK1 and FK2; apply join filters on FK1 and FK2; partition a stride; look up P1 to P4 in compacted hash tables HT 1 to HT 4; de-partition Dim1 payload(s) into result payloads; join with Dim2]
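The build and probe phases can be sketched for a single dimension as follows. This is a simplified illustration, not BLU's join: the names are hypothetical, ordinary Python dicts stand in for the compacted hash tables, an exact set stands in for the join filter, and the per-partition work runs sequentially instead of on separate threads.

```python
from typing import Dict, List, Sequence, Set, Tuple


def partition_of(key: int, num_partitions: int) -> int:
    return hash(key) % num_partitions


def build_phase(dimension_rows: Sequence[Tuple[int, str]], num_partitions: int
                ) -> Tuple[List[Dict[int, List[str]]], Set[int]]:
    """Build one hash table per partition, plus a join filter over all build keys."""
    hash_tables: List[Dict[int, List[str]]] = [dict() for _ in range(num_partitions)]
    join_filter: Set[int] = set()
    for key, payload in dimension_rows:
        hash_tables[partition_of(key, num_partitions)].setdefault(key, []).append(payload)
        join_filter.add(key)
    return hash_tables, join_filter


def probe_phase(fact_rows: Sequence[Tuple[int, float]],
                hash_tables: Sequence[Dict[int, List[str]]],
                join_filter: Set[int]) -> List[Tuple[int, float, str]]:
    """Filter, partition, look up, then de-partition back into fact-table order."""
    num_partitions = len(hash_tables)
    partitions: List[List[Tuple[int, int, float]]] = [[] for _ in range(num_partitions)]
    for position, (fk, measure) in enumerate(fact_rows):
        if fk in join_filter:  # drop non-matching rows before any hash-table lookup
            partitions[partition_of(fk, num_partitions)].append((position, fk, measure))
    matches = []
    for part, table in zip(partitions, hash_tables):
        for position, fk, measure in part:
            for payload in table[fk]:
                matches.append((position, fk, measure, payload))
    matches.sort(key=lambda m: m[0])  # de-partition: restore original row order
    return [(fk, measure, payload) for _, fk, measure, payload in matches]
```

Partitioning both sides by the same function means each lookup touches only one small hash table, which is what makes a cache-sized partition useful.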
Group By / Aggregation
Need to perform well on queries that output from a few tens to billions of groups
Cache- and NUMA*-aware (* Non-Uniform Memory Architecture)
[Figure: threads process work units of encoded and unencoded keys.
Phase 1: local hash table (HT) probes and appends to overflow buckets (OBs). Each thread probes its fixed-size local HT (1 per thread), appends overflow groups to OBs, and publishes OBs to global lists of overflow blocks (1 per partition).
Phase 2: final partition merging. Local HTs and OBs are merged into global partitioned HTs]
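The two-phase scheme in the figure can be sketched in Python for a SUM aggregate. The sketch is an assumption-laden simplification: function names are hypothetical, threads are simulated by processing work units one after another, and overflow buckets are plain lists.

```python
from typing import Dict, List, Sequence, Tuple

Row = Tuple[str, int]


def local_aggregate(work_unit: Sequence[Row], local_capacity: int, num_partitions: int
                    ) -> Tuple[Dict[str, int], List[List[Row]]]:
    """Phase 1 for one thread: aggregate into a fixed-size local hash table.

    Groups that do not fit are appended to per-partition overflow buckets.
    """
    local_ht: Dict[str, int] = {}
    overflow: List[List[Row]] = [[] for _ in range(num_partitions)]
    for key, value in work_unit:
        if key in local_ht:
            local_ht[key] += value
        elif len(local_ht) < local_capacity:
            local_ht[key] = value
        else:
            overflow[hash(key) % num_partitions].append((key, value))
    return local_ht, overflow


def merge_partitions(local_results: Sequence[Tuple[Dict[str, int], List[List[Row]]]],
                     num_partitions: int) -> List[Dict[str, int]]:
    """Phase 2: merge local tables and overflow buckets into partitioned global tables."""
    global_hts: List[Dict[str, int]] = [dict() for _ in range(num_partitions)]
    for local_ht, overflow in local_results:
        for key, value in local_ht.items():
            ht = global_hts[hash(key) % num_partitions]
            ht[key] = ht.get(key, 0) + value
        for partition, bucket in enumerate(overflow):
            ht = global_hts[partition]
            for key, value in bucket:
                ht[key] = ht.get(key, 0) + value
    return global_hts


def group_by_sum(work_units: Sequence[Sequence[Row]], local_capacity: int = 2,
                 num_partitions: int = 4) -> Dict[str, int]:
    local_results = [local_aggregate(u, local_capacity, num_partitions) for u in work_units]
    merged: Dict[str, int] = {}
    for ht in merge_partitions(local_results, num_partitions):
        merged.update(ht)  # partitions hold disjoint keys, so this never overwrites
    return merged
```

Because every key maps to exactly one partition, phase 2 could hand each partition to a different thread without any shared writes; the sketch keeps that structure but runs it serially.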
Business Value 3: SIMPLICITY and Time to Value!
“Super Fast, Super Easy” – Just Create, Load, and Go!

Database Design and Tuning
1. Decide on partition strategies
2. Select compression strategy
3. Create table
4. Load data
5. Create auxiliary performance structures
   • Materialized views
   • Create indexes
     • B+ indexes
     • Bitmap indexes
6. Tune memory
7. Tune I/O
8. Statistics collection
9. Add optimizer hints
Repeat

DB2 with BLU Acceleration
1. Create table
2. Load data
“Super Fast, Super Easy” – Just Create, Load, and Go!
Create
• Single parameter to configure entire database for BLU:
    db2set DB2_WORKLOAD=ANALYTICS
• Create the database, table spaces, bufferpools, and tables
• Tip: Useful to define “mem_percent”
    db2 "create database mydb autoconfigure using mem_percent 95 apply db and dbm"
    db2 "create table mytable (c1 integer not null, …)"
Load your data
• Same as before - no new syntax!
    db2 "load from file.dat of del replace into mytable"
Go!
• Begin running your workload
    db2 "select SUM(SALES) from mytable where PURCHASEDATE > '20140101' group by CITY"
Cloudant – Create a dashDB Warehouse
dashDB Welcome Page
Automatic schema discovery analyzes your JSON data in Cloudant, then discovers and automatically creates a relational schema for dashDB.
dashDB – Load data from CSV or Excel
dashDB – Getting Started
Shadow Tables for Mixed Workloads
• Faster OLTP – fewer indexes
  • Dramatic reduction in indexes on the row table
• Faster Reporting – BLU Acceleration!
  • 10X-40X faster.
• Dual representation. Data stored as both row and column. The best of both worlds.
• No application change. Database query compiler decides which format to access. Fully automated.
• Small memory needs.
[Figure: the Sales table stored in both row-organized and column-organized form]
Shadow Tables Architecture
[Figure: DB2 server. Components shown: OLTP workload; OLAP reporting; log; Change Data Capture (CDC) capture and apply engine; SYSTOOLS.REPL_MQT_LATENCY; optimizer]
IBM InfoSphere Change Data Capture (CDC) included in DB2 AWSE and AESE (for shadow table usage)
SET CURRENT REFRESH AGE … ;
Optimizer can route queries to shadow tables if data is not older than the desired refresh age.
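The routing rule stated above reduces to one comparison, sketched here in Python. The function and its parameters are hypothetical: `benefits_from_columnar` stands in for whatever the optimizer's own decision would be, and the latency value is assumed to have been read from the replication latency table beforehand.

```python
from datetime import timedelta


def choose_table(benefits_from_columnar: bool,
                 replication_latency: timedelta,
                 refresh_age: timedelta) -> str:
    """Return "shadow" when the column-organized copy is fresh enough, else "row".

    Simplified assumption: a query is routed to the shadow table only if it
    would benefit from columnar access and the shadow copy is not older than
    the refresh age the session asked for.
    """
    if benefits_from_columnar and replication_latency <= refresh_age:
        return "shadow"
    return "row"
```

For example, with a refresh age of 5 minutes, a reporting query sees the shadow table while replication lags by 30 seconds, and falls back to the row table once the lag reaches 10 minutes.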