Agenda
• Vertica vs. the World
• What Is Vertica?
• How Does It Work?
• How to Use Vertica … (the Right Way)
• Where It Falls Short
• Drill Down into SQL (Group By & Joins)
Close Your Eyes
Imagine Your System
It needs to support:
• 1,000,000 concurrent users
• 1,000,000 operations/s
• Microsecond read & write latency
• Complex analytics queries with seconds-level latency
• ACID
• Highly available
• Scalable
Open Your Eyes
What Do You See ?
Vertica
Oracle
Couchbase
Cassandra
MongoDB
MySQL
Exadata
Vertica vs. the World

                   Vertica                 Oracle                  Cassandra                    Couchbase
Scale              MPP                     Single server*          MPP                          MPP
Data model         Relational, structured  Relational, structured  Column store, schema-less    Document, schema-less
Transaction model  ACID                    ACID                    Eventually consistent        Consistent
DR                 Application solution    Standby, read-only      Active-Active                Active-Active
Development        SQL…                    SQL…                    Python, Java, CQL…           Python, Java, PHP…
Best for           Analytics               Generic, OLTP           Write-intensive key-value    Read/write-intensive JSON documents
CAP                CP                      N/A                     AP                           CP
Use Cases
• Real-time dashboarding (5,000 concurrent users, heavy writes and simple fetches)
• Real-time complex analytics
• Billing
• Blog site
Cassandra
Vertica
Oracle
Couchbase
MPP-Columnar DBMS
• 10x–100x the performance of a classic RDBMS
• Linear scale
• SQL
• Commodity hardware
• Built-in fault tolerance
10x–100x the Performance of a Classic RDBMS
Column store architecture:
• High compression rates
• Sorted columns
• Object segmentation/replication
Regular table

Continent   Country  City      Size    Size type  Population
Asia        Israel   Tel Aviv  52000   Acres      450000
N.America   USA      Dallas    385     Sq. miles  1200000

Create Table …
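As a sketch, the example table above might be created like this (the table name `cities` and the column types are assumptions, not from the original slides):

```sql
-- Hypothetical DDL for the example table; names and types assumed
CREATE TABLE cities (
    continent  VARCHAR(20),
    country    VARCHAR(40),
    city       VARCHAR(40),
    size       INTEGER,
    size_type  VARCHAR(20),
    population INTEGER
);
```

In Vertica, this logical table is then physically stored as one or more projections, as the following slides show.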
Rows vs. Columns

Continent
• Asia
• Asia
• Asia
• N.America
• N.America
• N.America
Country
• Israel
• Israel
• Israel
• USA
• USA
• USA
Size Type
• Sq. miles
• Sq. miles
• Sq. miles
• Sq. miles
• Sq. miles
• Sq. miles
City size
• 52000
• 78000
• 63000
• 385
• 468
• 8700
City Name
• Tel Aviv
• Jerusalem
• Haifa
• Dallas
• New York
• New Jersey
Population
• 450000
• 800000
• 268000
• 1200000
• 8200000
• 8800000
Block1
•Asia•Israel•Sq. miles•Tel Aviv
Block2
•52000•450000•Asia
Block3
•Israel•Sq. miles•Jerusalem
Block4
•78000•800000•N.America
Block 5
•Usa•Dallas•Sq. miles•385
Block 6
•1200000•Asia•Israel
Block 7
•Haifa•Sq. miles•63000
Block 8
•268000•N.America•Usa
Block 9
•New York•Sq. miles•468•8200000
Continent:  Asia,3  N.America,3                          → RLE encoding
Country:    Israel,3  USA,3                              → RLE encoding
Size Type:  Dunam,3  Sq. miles,3                         → RLE encoding
City size:  52000 78000 63000 385 468 8700               → DELTAVAL encoding
City Name:  Tel Aviv, Jerusalem, Haifa, Dallas, New York, New Jersey   → RLE encoding
Population: 450000 800000 268000 1200000 8200000 8800000 → LZO encoding
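Per-column encodings like the ones above can be requested explicitly in a projection definition. This is only a sketch (the `cities` table and projection name are assumptions carried over from the earlier example):

```sql
-- Sketch: assigning an encoding per column in a projection.
-- RLE pays off on low-cardinality, sorted columns; DELTAVAL on
-- numeric columns with values close together.
CREATE PROJECTION cities_enc (
    continent  ENCODING RLE,
    country    ENCODING RLE,
    size_type  ENCODING RLE,
    size       ENCODING DELTAVAL,
    city,
    population
) AS
SELECT continent, country, size_type, size, city, population
FROM cities
ORDER BY continent, country, size_type;  -- sort order makes the RLE runs long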
Rows vs. Columns
• Conversion table (~2 billion rows a month)
  – Oracle: uncompressed => 418 GB; compressed (manual) => 147 GB
  – Vertica: 21 GB
Saving: 71%
How Does It Work ?
Tuple Mover

ROS:
Asia,23  N.America,13
Israel,23  USA,13
Natanya,1  Zoran,1 …
Seattle,1  Chicago,1  Austin,1 …

WOS:
Asia,2  N.America,3
Israel,2  USA,1
Jerusalem,1  Tel Aviv,1 …
Dallas,1  New Jersey,1  New York,1 …
Tuple Mover Flow

N.America  USA     Dallas      Sq. miles  385    1200000
Asia       Israel  Tel Aviv    Sq. miles  52000  450000
N.America  USA     New York    Sq. miles  468    8200000
N.America  USA     New Jersey  Sq. miles  8700   8800000
Asia       Israel  Jerusalem   Sq. miles  78000  800000

Merged ROS:
Asia,25  N.America,16
Israel,25  USA,16
Jerusalem,1  Natanya,1  Tel Aviv,1  Zoran,1 …  Austin,1  Chicago,1  Dallas,1  New Jersey,1  New York,1  Seattle,1 …
Projections
• Physical structure of the table (the table itself is logical)
• Stored sorted and compressed
• Maintained internally
• At least one (super) projection
• Projection types:
  – Super projection
  – Query-specific projection
  – Pre-join projection
  – Buddy projection
Projections
How do I build my projections?
• Use the DBD (Database Designer)
• Choose the right columns (general vs. specific)
• Choose the right sort order
• Choose the right encoding
• Choose the right column to partition by
• Choose the right column to segment by
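The choices above all come together in the projection DDL. A minimal sketch (table, projection, and column names are assumptions modeled on the fact table used later in this deck):

```sql
-- Sketch of a query-specific projection:
-- sort by the columns you filter and group on, and segment by a
-- high-cardinality column so rows spread evenly across nodes.
CREATE PROJECTION fact_visit_p1 AS
SELECT lp_account_id,
       visit_from_dt_trunc,
       vs_lp_session_id
FROM fact_visit
ORDER BY lp_account_id, visit_from_dt_trunc
SEGMENTED BY HASH(vs_lp_session_id) ALL NODES;
```

The ORDER BY clause is what determines whether later queries can use the pipelined (sorted) operators shown in the SQL examples below.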
Rules of thumb (don't tell Tom Kyte)
• Avoid "select * …"
• Denormalize
• Use bulk loads for DML
• Use merge joins for large joins
• Understand Vertica's architecture & your data
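For bulk loading, the idiomatic tool is COPY rather than row-by-row INSERTs. A hedged sketch (the file path, delimiter, and table name are assumptions; DIRECT writes straight to ROS in Vertica versions that still have a WOS):

```sql
-- Bulk load: one COPY statement instead of millions of INSERTs
COPY fact_visit
FROM '/data/visits.csv'
DELIMITER ','
DIRECT;  -- bypass the WOS and write sorted, compressed ROS containers
```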
Delete/Update
• Deleted rows are only marked as deleted
• The marks are stored in delete vectors on disk
• Queries merge the ROS with the delete vectors to filter out deleted records
• Data is physically removed asynchronously, during mergeout
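You can watch the delete vectors accumulate through the monitoring schema. A sketch (the `v_monitor.delete_vectors` system table exists in Vertica; the exact column list here is abridged and should be checked against your version):

```sql
-- How many logically deleted rows are waiting for mergeout?
SELECT projection_name,
       SUM(deleted_row_count) AS deleted_rows
FROM v_monitor.delete_vectors
GROUP BY projection_name
ORDER BY deleted_rows DESC;
```

Large counts here are a hint that queries are paying the ROS-plus-delete-vector merge cost described above.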
Delete/Update: strata issue
Mergeout: too many ROS containers
500 MB → 2 GB → 4 GB
Where It Falls Short …
• Lack of features
• Documentation
• Good only for specific types of queries
Let’s Dive into SQL Examples
1. Sort optimization
2. Join optimization
Choose the Right Sort Order — Example

select a11.LP_ACCOUNT_ID AS LP_ACCOUNT_ID,
       count(distinct a11.VS_LP_SESSION_ID) AS Visits,
       (count(distinct a11.VS_LP_SESSION_ID) * 1.0) AS WJXBFS1
from lp_15744040.FACT_VISIT_ROOM a11
group by a11.LP_ACCOUNT_ID;
First projection …

table_name       projection_name     projection_column_name  column_position  sort_position
FACT_VISIT_ROOM FACT_VISIT_ROOM_bad VS_LP_SESSION_ID 0 0
FACT_VISIT_ROOM FACT_VISIT_ROOM_bad LP_ACCOUNT_ID 1 1
FACT_VISIT_ROOM FACT_VISIT_ROOM_bad VS_LP_VISITOR_ID 2 2
FACT_VISIT_ROOM FACT_VISIT_ROOM_bad VISIT_FROM_DT_TRUNC 3 3
FACT_VISIT_ROOM FACT_VISIT_ROOM_bad ACCOUNT_ID 4 4
FACT_VISIT_ROOM FACT_VISIT_ROOM_bad ROOM_ID 5 5
FACT_VISIT_ROOM FACT_VISIT_ROOM_bad VISIT_FROM_DT_ACTUAL 6 6
FACT_VISIT_ROOM FACT_VISIT_ROOM_bad VISIT_TO_DT_ACTUAL 7 7
FACT_VISIT_ROOM FACT_VISIT_ROOM_bad HOT_LEAD_IND 8 8
Access Path:
+-GROUPBY PIPELINED [Cost: 7M, Rows: 10K] (PATH ID: 1)
|  Aggregates: count(DISTINCT a11.VS_LP_SESSION_ID)
|  Group By: a11.LP_ACCOUNT_ID
| +---> GROUPBY HASH (SORT OUTPUT) [Cost: 7M, Rows: 10K] (PATH ID: 2)
| |      Group By: a11.LP_ACCOUNT_ID, a11.VS_LP_SESSION_ID
| | +---> STORAGE ACCESS for a11 [Cost: 5M, Rows: 199M] (PATH ID: 3)
| | |      Projection: lp_15744040.FACT_VISIT_ROOM_bad
| | |      Materialize: a11.LP_ACCOUNT_ID, a11.VS_LP_SESSION_ID
Second projection …

table_name       projection_name      projection_column_name  column_position  sort_position
FACT_VISIT_ROOM FACT_VISIT_ROOM_fix1 LP_ACCOUNT_ID 0 0
FACT_VISIT_ROOM FACT_VISIT_ROOM_fix1 VS_LP_SESSION_ID 1 1
FACT_VISIT_ROOM FACT_VISIT_ROOM_fix1 VS_LP_VISITOR_ID 2 2
FACT_VISIT_ROOM FACT_VISIT_ROOM_fix1 VISIT_FROM_DT_TRUNC 3 3
FACT_VISIT_ROOM FACT_VISIT_ROOM_fix1 ACCOUNT_ID 4 4
FACT_VISIT_ROOM FACT_VISIT_ROOM_fix1 ROOM_ID 5 5
FACT_VISIT_ROOM FACT_VISIT_ROOM_fix1 VISIT_FROM_DT_ACTUAL 6 6
FACT_VISIT_ROOM FACT_VISIT_ROOM_fix1 VISIT_TO_DT_ACTUAL 7 7
FACT_VISIT_ROOM FACT_VISIT_ROOM_fix1 HOT_LEAD_IND 8 8
Access Path:
+-GROUPBY PIPELINED [Cost: 7M, Rows: 10K] (PATH ID: 1)
|  Aggregates: count(DISTINCT a11.VS_LP_SESSION_ID)
|  Group By: a11.LP_ACCOUNT_ID
| +---> GROUPBY PIPELINED [Cost: 7M, Rows: 10K] (PATH ID: 2)
| |      Group By: a11.LP_ACCOUNT_ID, a11.VS_LP_SESSION_ID
| | +---> STORAGE ACCESS for a11 [Cost: 5M, Rows: 199M] (PATH ID: 3)
| | |      Projection: lp_15744040.FACT_VISIT_ROOM_fix1
| | |      Materialize: a11.LP_ACCOUNT_ID, a11.VS_LP_SESSION_ID
Results …

First projection — GROUPBY HASH (SORT OUTPUT):
Time: First fetch (7 rows): 264527.916 ms. All rows formatted: 264527.978 ms

Second projection — GROUPBY PIPELINED:
Time: First fetch (7 rows): 38913.909 ms. All rows formatted: 38913.965 ms
[Diagram: GROUP BY HASH vs. GROUP BY PIPELINED. With unsorted input, GROUP BY HASH must build a value→count hash table holding every group at once; with input sorted on the group-by columns, the pipelined operator counts each run of equal values as it streams past, holding only one group at a time.]
Join Example

select a12.DT_WEEK AS DT_WEEK,
       a11.LP_ACCOUNT_ID AS LP_ACCOUNT_ID,
       count(distinct a11.VS_LP_SESSION_ID) AS Visits,
       (count(distinct a11.VS_LP_SESSION_ID) * 1.0) AS WJXBFS1
from zzz.FACT_VISIT a11
join zzz.DIM_DATE_TIME a12
  on (a11.VISIT_FROM_DT_TRUNC = a12.DATE_TIME_ID)
where (a11.LP_ACCOUNT_ID in ('57386690')
  and a11.VISIT_FROM_DT_TRUNC between '2011-09-01 15:28:00' and '2011-12-31 12:52:50')
group by a12.DT_WEEK, a11.LP_ACCOUNT_ID;
Filter:   LP_ACCOUNT_ID, VISIT_FROM_DT_TRUNC
Group By: DT_WEEK, LP_ACCOUNT_ID
Join:     VISIT_FROM_DT_TRUNC = DATE_TIME_ID
Select:   DT_WEEK, LP_ACCOUNT_ID, VS_LP_SESSION_ID
Full Explain Plan …

Access Path:
+-GROUPBY PIPELINED (RESEGMENT GROUPS) [Cost: 14M, Rows: 5M (NO STATISTICS)] (PATH ID: 1)
|  Aggregates: count(DISTINCT a11.VS_LP_SESSION_ID)
|  Group By: a12.DT_WEEK, a11.LP_ACCOUNT_ID
|  Execute on: All Nodes
| +---> GROUPBY HASH (SORT OUTPUT) [Cost: 6M, Rows: 100M (NO STATISTICS)] (PATH ID: 2)
| |      Group By: a12.DT_WEEK, a11.LP_ACCOUNT_ID, a11.VS_LP_SESSION_ID
| |      Execute on: All Nodes
| | +---> JOIN HASH [Cost: 944K, Rows: 372M (NO STATISTICS)] (PATH ID: 3)
| | |      Join Cond: (a11.VISIT_FROM_DT_TRUNC = a12.DATE_TIME_ID)
| | |      Materialize at Output: a11.VS_LP_SESSION_ID, a11.LP_ACCOUNT_ID
| | |      Execute on: All Nodes
| | | +-- Outer -> STORAGE ACCESS for a11 [Cost: 421K, Rows: 372M (NO STATISTICS)] (PATH ID: 4)
| | | |    Projection: zzz.FACT_VISIT_b0
| | | |    Materialize: a11.VISIT_FROM_DT_TRUNC
| | | |    Filter: (a11.LP_ACCOUNT_ID = '57386690')
| | | |    Filter: ((a11.VISIT_FROM_DT_TRUNC >= '2011-09-01 15:28:00'::timestamp) AND (a11.VISIT_FROM_DT_TRUNC <= '2011-12-31 12:52:50'::timestamp))
| | | |    Execute on: All Nodes
| | | +-- Inner -> STORAGE ACCESS for a12 [Cost: 1K, Rows: 10K (NO STATISTICS)] (PATH ID: 5)
| | | |    Projection: zzz.DIM_DATE_TIME_node0004
| | | |    Materialize: a12.DATE_TIME_ID, a12.DT_WEEK
| | | |    Filter: ((a12.DATE_TIME_ID >= '2011-09-01 15:28:00'::timestamp) AND (a12.DATE_TIME_ID <= '2011-12-31 12:52:50'::timestamp))
| | | |    Execute on: All Nodes
Explain Plan (extract) …

Access Path:
+-GROUPBY PIPELINED (RESEGMENT GROUPS) [Cost: 14M, Rows: 5M (NO STATISTICS)] (PATH ID: 1)
|  Aggregates: count(DISTINCT a11.VS_LP_SESSION_ID)
|  Group By: a12.DT_WEEK, a11.LP_ACCOUNT_ID
|  Execute on: All Nodes
| +---> GROUPBY HASH (SORT OUTPUT) [Cost: 6M, Rows: 100M (NO STATISTICS)] (PATH ID: 2)
| |      Group By: a12.DT_WEEK, a11.LP_ACCOUNT_ID, a11.VS_LP_SESSION_ID
| |      Execute on: All Nodes
| | +---> JOIN HASH [Cost: 944K, Rows: 372M (NO STATISTICS)] (PATH ID: 3)
| | |      Join Cond: (a11.VISIT_FROM_DT_TRUNC = a12.DATE_TIME_ID)
| | |      Materialize at Output: a11.VS_LP_SESSION_ID, a11.LP_ACCOUNT_ID
| | |      Execute on: All Nodes
Time: First fetch (6 rows): 56654.894 ms. All rows formatted: 56654.988 ms
Solution One — Functions

select week(a11.VISIT_FROM_DT_TRUNC) AS DT_WEEK,
       a11.LP_ACCOUNT_ID AS LP_ACCOUNT_ID,
       count(distinct a11.VS_LP_SESSION_ID) AS Visits,
       (count(distinct a11.VS_LP_SESSION_ID) * 1.0) AS WJXBFS1
from zzz.FACT_VISIT a11
where (a11.LP_ACCOUNT_ID in ('57386690')
  and a11.VISIT_FROM_DT_TRUNC between '2011-09-01 15:28:00' and '2011-12-31 12:52:50')
group by week(a11.VISIT_FROM_DT_TRUNC), a11.LP_ACCOUNT_ID;
Access Path:
+-GROUPBY PIPELINED (RESEGMENT GROUPS) [Cost: 127, Rows: 1 (STALE STATISTICS)] (PATH ID: 1)
|  Aggregates: count(DISTINCT a11.VS_LP_SESSION_ID)
|  Group By: <SVAR>, a11.LP_ACCOUNT_ID
|  Execute on: All Nodes
| +---> GROUPBY HASH (SORT OUTPUT) [Cost: 126, Rows: 1 (STALE STATISTICS)] (PATH ID: 2)
| |      Group By: (date_part('week', a11.VISIT_FROM_DT_TRUNC))::int, a11.LP_ACCOUNT_ID, a11.VS_LP_SESSION_ID
| |      Execute on: All Nodes
| | +---> STORAGE ACCESS for a11 [Cost: 125, Rows: 1 (STALE STATISTICS)] (PATH ID: 3)
| | |      Projection: zzz.FACT_VISIT_b0

Time: First fetch (6 rows): 33453.997 ms. All rows formatted: 33454.154 ms
Saved the Join Time
Solution Two — Pre-Join Projection

Pros:
• Eliminates the join overhead
• Maintained by Vertica

Cons:
• Not flexible
• Adds overhead on load
• Needs a primary/foreign key
• Maintenance restrictions
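A pre-join projection along these lines could be defined as follows. This is only a sketch: the constraint names are invented, the column list is abridged, and the PK/FK pair is the prerequisite called out in the cons above.

```sql
-- Prerequisite: a PK on the dimension and an FK on the fact
-- (constraint names pk_dt / fk_dt are hypothetical)
ALTER TABLE dim_date_time
    ADD CONSTRAINT pk_dt PRIMARY KEY (date_time_id);
ALTER TABLE fact_visit
    ADD CONSTRAINT fk_dt FOREIGN KEY (visit_from_dt_trunc)
    REFERENCES dim_date_time (date_time_id);

-- Sketch of the pre-join projection; sorting on the group-by columns
-- is what later enables GROUPBY PIPELINED instead of GROUPBY HASH
CREATE PROJECTION visit_date_time_prejoin AS
SELECT f.lp_account_id,
       f.vs_lp_session_id,
       f.visit_from_dt_trunc,
       d.dt_week
FROM fact_visit f
JOIN dim_date_time d ON f.visit_from_dt_trunc = d.date_time_id
ORDER BY d.dt_week, f.lp_account_id, f.vs_lp_session_id;
```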
Access Path:
+-GROUPBY PIPELINED (RESEGMENT GROUPS) [Cost: 12K, Rows: 10K] (PATH ID: 1)
|  Aggregates: count(DISTINCT visit_date_time_prejoin8_b0.VS_LP_SESSION_ID)
|  Group By: visit_date_time_prejoin8_b0.DT_WEEK, visit_date_time_prejoin8_b0.LP_ACCOUNT_ID
|  Execute on: All Nodes
| +---> GROUPBY HASH (SORT OUTPUT) [Cost: 11K, Rows: 10K] (PATH ID: 2)
| |      Group By: visit_date_time_prejoin8_b0.DT_WEEK, visit_date_time_prejoin8_b0.LP_ACCOUNT_ID, visit_date_time_prejoin8_b0.VS_LP_SESSION_ID
| |      Execute on: All Nodes
| | +---> STORAGE ACCESS for <No Alias> [Cost: 8K, Rows: 1M] (PATH ID: 3)
| | |      Projection: lp_15744040.visit_date_time_prejoin8_b0
Solution Two — Pre-Join Projection
Sorted by LP_ACCOUNT_ID, VISIT_FROM_DT_TRUNC, DT_WEEK, HOT_LEAD_IND, DATE_TIME_ID, VS_LP_SESSION_ID

Time: First fetch (6 rows): 35312.331 ms. All rows formatted: 35312.421 ms
Saved the join time
Access Path:
+-GROUPBY PIPELINED (RESEGMENT GROUPS) [Cost: 542K, Rows: 10K] (PATH ID: 1)
|  Aggregates: count(DISTINCT visit_date_time_prejoin_z6.VS_LP_SESSION_ID)
|  Group By: visit_date_time_prejoin_z6.DT_WEEK, visit_date_time_prejoin_z6.LP_ACCOUNT_ID
|  Execute on: All Nodes
| +---> GROUPBY PIPELINED [Cost: 542K, Rows: 10K] (PATH ID: 2)
| |      Group By: visit_date_time_prejoin_z6.DT_WEEK, visit_date_time_prejoin_z6.VS_LP_SESSION_ID, visit_date_time_prejoin_z6.LP_ACCOUNT_ID
| |      Execute on: All Nodes
| | +---> STORAGE ACCESS for <No Alias> [Cost: 501K, Rows: 15M] (PATH ID: 3)
| | |      Projection: lp_15744040.visit_date_time_prejoin_z6
Solution Two — Pre-Join Projection
Sorted by DT_WEEK, LP_ACCOUNT_ID, VS_LP_SESSION_ID

Time: First fetch (6 rows): 3680.853 ms. All rows formatted: 3680.969 ms
Saved the join time and the group-by hash time
Solution Three — Denormalize

select DT_WEEK,
       a11.LP_ACCOUNT_ID AS LP_ACCOUNT_ID,
       count(distinct a11.VS_LP_SESSION_ID) AS Visits,
       (count(distinct a11.VS_LP_SESSION_ID) * 1.0) AS WJXBFS1
from zzz.FACT_VISIT_Z1 a11
where (a11.LP_ACCOUNT_ID in ('57386690')
  and a11.VISIT_FROM_DT_TRUNC between '2011-09-01 15:28:00' and '2011-12-31 12:52:50')
group by DT_WEEK, a11.LP_ACCOUNT_ID;
Access Path:
+-GROUPBY PIPELINED (RESEGMENT GROUPS) [Cost: 3M, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
|  Aggregates: count(DISTINCT a11.VS_LP_SESSION_ID)
|  Group By: a11.DT_WEEK, a11.LP_ACCOUNT_ID
|  Execute on: All Nodes
| +---> GROUPBY HASH (SORT OUTPUT) [Cost: 3M, Rows: 10K (NO STATISTICS)] (PATH ID: 2)
| |      Group By: a11.DT_WEEK, a11.LP_ACCOUNT_ID, a11.VS_LP_SESSION_ID
| |      Execute on: All Nodes
| | +---> STORAGE ACCESS for a11 [Cost: 2M, Rows: 372M (NO STATISTICS)] (PATH ID: 3)
| | |      Projection: zzz.FACT_VISIT_Z1_super

Time: First fetch (6 rows): 33885.178 ms. All rows formatted: 33885.253 ms
Saved the join time
Solution Three — Denormalize
• Changing the projection sort order
Access Path:
+-GROUPBY PIPELINED (RESEGMENT GROUPS) [Cost: 588K, Rows: 10K] (PATH ID: 1)
|  Aggregates: count(DISTINCT a11.VS_LP_SESSION_ID)
|  Group By: a11.DT_WEEK, a11.LP_ACCOUNT_ID
|  Execute on: All Nodes
| +---> GROUPBY PIPELINED [Cost: 587K, Rows: 10K] (PATH ID: 2)
| |      Group By: a11.DT_WEEK, a11.VS_LP_SESSION_ID, a11.LP_ACCOUNT_ID
| |      Execute on: All Nodes
| | +---> STORAGE ACCESS for a11 [Cost: 531K, Rows: 20M] (PATH ID: 3)
| | |      Projection: zzz.fact_visit_z1_pipe
| | |      Materialize: a11.DT_WEEK, a11.LP_ACCOUNT_ID, a11.VS_LP_SESSION_ID
| | |      Filter: (a11.LP_ACCOUNT_ID = '57386690')
| | |      Filter: ((a11.VISIT_FROM_DT_TRUNC >= '2011-09-01 15:28:00'::timestamp) AND (a11.VISIT_FROM_DT_TRUNC <= '2011-12-31 12:52:50'::timestamp))
| | |      Execute on: All Nodes

Time: First fetch (6 rows): 4313.497 ms. All rows formatted: 4313.600 ms
Saved the join time and the group-by hash time
Keep it simple.
Keep it sorted.
*** Keep it joinless.
Let’s sum it up…
Questions ?
Thank You