Operational Analytics in SQL Server
Sunil Agarwal
Principal Program Manager, SQL [email protected]
BRK4552
Agenda
• Definition and Value Prop
• Operational Analytics with Disk-Based Tables
• Operational Analytics with In-Memory OLTP
What is Operational?
Refers to the operational workload (i.e. OLTP)
Examples
• Enterprise Resource Planning (ERP): inventory, orders, sales
• Machine data: data from machine operations on the factory floor
• Online stores (e.g. Amazon, Expedia)
• Stock/security trades
Mission Critical
• No downtime (high availability): downtime has an impact on revenue
• Low latency and high transaction throughput
What is Analytics?
Analytics
• Studying past data (e.g. operational, social media) to identify potential trends
• Analyzing the effects of certain decisions or events (e.g. an ad campaign)
• Analyzing past/current data to predict outcomes (e.g. credit score)
Goals
• Enhance the business by gaining knowledge to make improvements or changes.
Source – MIT/SLOAN Management Review
Traditional Operational/Analytics Architecture
[Diagram: Presentation Layer, Application Tier, IIS Server, SQL Server Database, ETL (Hourly, Daily, Weekly), SQL Server Relational DW Database, SQL Server Analysis Server, BI and analytics]
Key Issues
• Complex implementation
• Requires two servers (CapEx and OpEx)
• Data latency in analytics
• More businesses demand/require real-time analytics
Minimizing Data Latency for Analytics
[Diagram: Presentation Layer, Application Tier, IIS Server, SQL Server Database, SQL Server Analysis Server, BI and analytics]
Benefits
• No data latency
• No ETL
• No separate DW
Challenges
• Analytics queries are resource intensive and can cause blocking
  How to minimize the impact on the operational workload
• Sub-optimal execution of analytics on a relational schema
  Add analytics-specific indexes
This is OPERATIONAL ANALYTICS
SQL Server 2016: Operational Analytics
Operational Analytics
• Ability to run analytics queries concurrently with the operational workload using the same schema.
Not a replacement for
• Extreme analytics query performance, possible using customized schemas (e.g. Star/Snowflake) and pre-aggregated cubes
• Data coming from non-relational sources
• Data coming from multiple relational sources requiring integrated analytics
Goals
• Minimal impact on the operational workload with concurrent analytics
• Performant analytics on the operational schema
Achieved using columnstore index
Columnstore Index: Why?
Data stored as rows
• Ideal for OLTP: efficient operation on a small set of rows
Data stored as columns (C1 C2 C3 C4 C5 …)
• Ideal for DW workload
• Improved compression: data from the same domain compresses better
• Reduced I/O: fetch only the columns needed
• Improved performance: more data fits in memory; optimized for CPU utilization
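The row-versus-column trade-off can be illustrated with a small Python sketch. This is a toy model only, not how SQL Server stores data: the table, its column names, and the run-length encoding are assumptions chosen to show why an aggregate over one column reads fewer values from a column layout, and why values from the same domain compress well.

```python
from itertools import groupby

# Toy table: (order_id, status, amount). Names and values are illustrative only.
rows = [
    (1, "SHIPPED", 20),
    (2, "SHIPPED", 35),
    (3, "SHIPPED", 15),
    (4, "OPEN", 50),
    (5, "OPEN", 10),
]

# Row layout: each record is stored together (good for touching a few rows).
row_store = list(rows)

# Column layout: each column is stored contiguously.
column_store = {
    "order_id": [r[0] for r in rows],
    "status": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}

def sum_amount_row_store(store):
    """SUM(amount) over a row layout reads every field of every row."""
    total = sum(r[2] for r in store)
    values_read = sum(len(r) for r in store)
    return total, values_read

def sum_amount_column_store(store):
    """SUM(amount) over a column layout reads only the 'amount' column."""
    column = store["amount"]
    return sum(column), len(column)

def run_length_encode(column):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(column)]

row_total, row_values_read = sum_amount_row_store(row_store)
col_total, col_values_read = sum_amount_column_store(column_store)
status_rle = run_length_encode(column_store["status"])
```

Both layouts return the same total, but the column layout reads one third of the values here, and the five status values shrink to two run-length pairs.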
Operational Analytics: With Columnstore Index
Key Points
• Create an updateable nonclustered columnstore index (NCCI) for analytics queries
• Drop all other indexes that were created for analytics.
• No application changes.
• The columnstore index is maintained just like any other index
• The query optimizer will choose the columnstore index where needed
[Diagram: Relational Table (Clustered Index/Heap); Btree Index; Nonclustered columnstore index (NCCI) with Delete bitmap and Delta rowgroups]
Operational Analytics: Columnstore Index Overhead
DML Operations on OLTP workload
Operation | BTREE (NCI) | Nonclustered Columnstore Index (NCCI)
Insert | Insert row into btree | Insert row into btree (delta store)
Delete | (a) Seek row(s) to be deleted; (b) Delete the row | (a) Seek for the row in the delta stores (there can be multiple); (b) If row found, then delete; (c) Otherwise insert the key into delete row buffer
Update | (a) Seek the row(s); (b) Update | (a) Delete the row (steps same as above); (b) Insert the updated row into delta store
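The NCCI column of the DML table can be sketched as a toy Python model. This is a minimal sketch under stated assumptions, not SQL Server's implementation: the class name `ToyNcci`, the single delta store, and the use of dictionaries keyed by row key are all hypothetical simplifications.

```python
class ToyNcci:
    """Toy model of the NCCI bookkeeping listed in the DML table."""

    def __init__(self, compressed_rows):
        # Rows already compressed into the columnstore, keyed by row key.
        self.compressed = dict(compressed_rows)
        # Delta store: recently inserted rows that are not yet compressed.
        self.delta = {}
        # Keys of compressed rows that have been logically deleted.
        self.delete_buffer = set()

    def insert(self, key, row):
        # Insert: the new row goes into the delta store.
        self.delta[key] = row

    def delete(self, key):
        if key in self.delta:
            # (a) + (b): the row was found in the delta store, so delete it there.
            del self.delta[key]
        else:
            # (c): record the key so scans skip the compressed copy.
            self.delete_buffer.add(key)

    def update(self, key, row):
        # Update = delete the old version, then insert the new one.
        self.delete(key)
        self.insert(key, row)

    def scan(self):
        # Analytics scan: live compressed rows plus everything in the delta store.
        live = {k: v for k, v in self.compressed.items()
                if k not in self.delete_buffer}
        live.update(self.delta)
        return live


index = ToyNcci({1: 10, 2: 20})
index.insert(3, 30)   # lands in the delta store
index.update(1, 11)   # old copy marked deleted, new copy in the delta store
index.delete(2)       # compressed row: key goes to the delete buffer
index.delete(3)       # delta row: removed directly
result = index.scan()
```

The sketch shows where the extra work comes from: a delete or update of a compressed row never rewrites compressed data, it only adds a key that later scans must filter out.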
Operational Analytics: Minimizing Columnstore overhead
Key Points
• Create the columnstore index only on cold data, using a filtered predicate to minimize maintenance
• Analytics queries access both columnstore and ‘hot’ data transparently
• Example: Order Management application: create nonclustered columnstore index ….. where order_status = ‘SHIPPED’
[Diagram: Relational Table (Clustered Index/Heap) receiving DML Operations; Btree Index on HOT data; Nonclustered columnstore index (NCCI), filtered index, with Delete bitmap and Delta rowgroups]
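The idea of a filtered columnstore can be sketched in Python as follows. This is a toy model under stated assumptions: the `orders` data, the `amount` column, and the helper names are hypothetical, and the real query optimizer combines the two sources itself rather than through application code.

```python
def is_cold(order):
    """Filter predicate from the example above: only shipped orders are cold."""
    return order["order_status"] == "SHIPPED"

def build_filtered_columnstore(table):
    """Index only the rows matching the predicate, stored column by column."""
    cold = [o for o in table if is_cold(o)]
    return {"amount": [o["amount"] for o in cold]}

def total_amount(table, columnstore):
    """Analytics query: the columnstore covers cold rows, the base table covers hot rows."""
    cold_total = sum(columnstore["amount"])
    hot_total = sum(o["amount"] for o in table if not is_cold(o))
    return cold_total + hot_total

orders = [
    {"order_id": 1, "order_status": "SHIPPED", "amount": 20},
    {"order_id": 2, "order_status": "SHIPPED", "amount": 35},
    {"order_id": 3, "order_status": "OPEN", "amount": 50},
]
columnstore = build_filtered_columnstore(orders)

# A DML change to a hot row does not touch the filtered columnstore.
orders[2]["amount"] = 60
grand_total = total_amount(orders, columnstore)
```

Because frequently changing rows fall outside the predicate, updates to them cost no columnstore maintenance, while the query result still reflects the latest hot data.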
Operational Analytics: Minimizing Columnstore overhead
Key Points
• Mission-critical operational workloads are typically configured for high availability using AlwaysOn Availability Groups
• You can offload analytics to a readable secondary replica
[Diagram: AlwaysOn Availability Group; Primary Replica sends log records to three Secondary Replicas; Analytic Queries]
Operational Analytics: Columnstore on In-Memory Tables
SQL Server 2016 CTP2 limitations
• You can create a columnstore index only on an empty table
• All columns must be included in the columnstore
No explicit delta rowgroup
• Rows (tail) not in the columnstore stay in the In-Memory OLTP table
• No columnstore index overhead when operating on the tail
• A background task migrates rows from the tail to the columnstore in chunks of 1 million rows not changed in the last 1 hour.
• Deleted Rows Table (DRT): tracks deleted rows
Columnstore data fully resident in memory
• Persisted together with operational data
• No application changes required.
[Diagram: In-Memory OLTP Table with Hash Index and Range Index; Updateable CCI with DRT and Tail; the Hot tail is like a Delta rowgroup]
Operational Analytics: Columnstore Overhead
DML Operations on In-Memory OLTP
Operation | Hash or Range Index | HK-CCI
Insert | Insert row into HK | Insert row into HK
Delete | (a) Seek row(s) to be deleted; (b) Delete the row | (a) Seek row(s) to be deleted; (b) Delete the row in HK; (c) If row in TAIL then return, else insert <colstore-RID> into DRT
Update | (a) Seek the row(s); (b) Update (delete/insert) | (a) Seek the row(s); (b) Update (delete/insert) in HK; (c) If row in TAIL then return, else insert <colstore-RID> into DRT
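The tail, the Deleted Rows Table, and the background migration can be sketched together in a toy Python model. Everything here is an assumption made for illustration: the class `ToyInMemoryCci`, integer row IDs, integer timestamps, and a migration that moves up to `chunk_size` cold rows per call are hypothetical simplifications, not the engine's actual behavior.

```python
class ToyInMemoryCci:
    """Toy model of an in-memory table with a columnstore, a tail, and a DRT."""

    def __init__(self):
        self.rows = {}       # rid -> (key, value, last_modified): the in-memory table
        self.by_key = {}     # key -> rid of the current row version
        self.colstore = {}   # rid -> value: row versions copied into the columnstore
        self.drt = set()     # rids of columnstore rows that were deleted
        self.next_rid = 0

    def insert(self, key, value, now):
        # Insert touches only the in-memory table; the row starts in the tail.
        rid = self.next_rid
        self.next_rid += 1
        self.rows[rid] = (key, value, now)
        self.by_key[key] = rid

    def delete(self, key):
        # (a) + (b): seek and delete the row in the in-memory table.
        rid = self.by_key.pop(key)
        del self.rows[rid]
        # (c): tail rows need nothing more; migrated rows are recorded in the DRT.
        if rid in self.colstore:
            self.drt.add(rid)

    def update(self, key, value, now):
        # Update is delete followed by insert; the new version lands in the tail.
        self.delete(key)
        self.insert(key, value, now)

    def tail(self):
        # Live row versions that have not been copied into the columnstore.
        return [rid for rid in sorted(self.rows) if rid not in self.colstore]

    def migrate(self, now, min_age, chunk_size):
        # Background task: copy up to chunk_size tail rows unchanged for min_age.
        cold = [rid for rid in self.tail() if now - self.rows[rid][2] >= min_age]
        chunk = cold[:chunk_size]
        for rid in chunk:
            self.colstore[rid] = self.rows[rid][1]
        return len(chunk)

    def scan(self):
        # Analytics scan: columnstore rows not in the DRT, plus the tail.
        compressed = [v for rid, v in self.colstore.items() if rid not in self.drt]
        hot = [self.rows[rid][1] for rid in self.tail()]
        return sorted(compressed + hot)


table = ToyInMemoryCci()
table.insert("a", 1, now=0)
table.insert("b", 2, now=0)
table.insert("c", 3, now=50)
moved = table.migrate(now=60, min_age=60, chunk_size=10)  # "c" is still too recent
table.update("a", 10, now=61)  # old version was migrated, so its rid goes to the DRT
table.delete("c")              # tail row: no DRT entry needed
values = table.scan()
```

In this sketch, operations on tail rows never touch the columnstore structures, which mirrors the "no columnstore index overhead when operating on tail" point.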
Operational Analytics: Minimizing Columnstore overhead
[Diagram: In-Memory OLTP Table with Hash Index and Range Index receiving DML Operations; Updateable CCI with DRT and Tail; the Hot tail is like a Delta rowgroup]
• Keep only hot data in in-memory tables
  Example: data stays hot for 1 day, 1 week…
• CTP2 work-around
  Use TF – 9975 to disable auto-compression
  Force compression using a spec-proc “sp_memory_optimized_cs_migration”
• Analytics queries
  Offload analytics to an AlwaysOn readable secondary
Summary – Operational Analytics
• Analytics in real time with no data latency
• Rich set of options to control the impact on the operational workload
• Industry-leading solution integrating In-Memory OLTP with in-memory analytics
• No application changes required
Operational Analytics: with CCI
[Diagram: CCI with delta rowgroup; Btree Index on HOT DATA]
SQL V.Next
• Allows creating one or more NCIs on a CCI
• Allows row-level locking for updates/deletes
• Ability to delay compression of rows in a delta rowgroup
Some Limitations
• No triggers
• No transactional replication
• Cursor-based access not allowed
Comparison with NCCI
• Seeking a row in the compressed store is comparatively expensive
• Short-range scans are comparatively expensive
• Lower concurrency: when a rowgroup is compressed (TM), it is not available for UPDATE/DELETE