8/8/2019 Nasdaq 2
NASDAQ: Partitioning at Work
Cost-Efficient Data Management Through Database
Consolidation and ILM
Dr Lilian Hobbs - ILM Product Manager
NASDAQ the Company
Largest electronic screen-based equity securities market in the US
More than 3,300 listed companies
Technology, financial services, retail, communications, transportation, media, and biotech
On average, the highest share trading volume in the US
On a typical day, NASDAQ processes
22 million quotes
9 million trades
45 million orders
NASDAQ Business Objective
Preserve the market data and provide access in a timely manner
Stores historical data back to 1995
Transactional granularity required
Adherence to SEC regulations and reporting
Internal and external clients
National Association of Securities Dealers (NASD) is the biggest 'external' consumer
They regulate the Nasdaq
Brokers/dealers
Investors
The Databases
MDSS
Staging Database
This is not partitioned
MDSP
Permanent Database
Monthly & Non-Monthly Tables
Partitioned
Typical day loads 400m rows
Historical Databases
Staging Database
[Diagram: MDSS (Oracle, staging)]
MDSS is the staging database
Comprises daily schemas (Mon-Fri) of market data
5 schemas, approx. 170 tables, grown from 25 tables
No indexes
Loads data using direct-path, 4x faster than conventional path
As volume increased, MDSS came to be updated in batches during the day
Processes data after market shutdown at 8pm
Interim Solution
[Diagram: sources, SQL Server, MDSS (Oracle, staging), MDSP (Oracle Database 10g Release 2, permanent), external customers, e.g. NASD]
Use publish/subscribe
Feed to SQL Server
Push data to Unix system
Load into Oracle
During day & market shutdown
Copy to permanent database
Copy Performance MDSS to MDSP
[Diagram: MDSS (Oracle, staging) and MDSP (Oracle Database 10g Release 2, permanent)]
Quote and BBO tables in the Permanent Database
Originally daily partitions
New solution in MDSP
5 daily Range/List partitions
5 extracts using INSERT /*+ APPEND */ to load simultaneously into the subpartitions
Loading 70m records
Saves 2-3 hrs load time
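
The parallel load described on this slide can be sketched as a small generator of direct-path INSERT statements, one per extract, each aimed at its own subpartition. This is only an illustration under stated assumptions: the table name, staging table, column names, list keys, and subpartition naming scheme are all hypothetical, since the slides do not give them, and the reading that each daily partition carries five list subpartitions is an interpretation of the bullet above.

```python
# Hypothetical names throughout: the slides do not give table, column,
# subpartition, or staging-table names.
TARGET_TABLE = "mdsp_quote"
SOURCE_TABLE = "mdss_quote_extract"
LIST_KEYS = ["A", "B", "C", "D", "E"]  # assumed list-subpartition keys


def subpartition_name(trade_date: str, list_key: str) -> str:
    """Build an illustrative subpartition name from an ISO date and a key."""
    return f"P{trade_date.replace('-', '')}_{list_key}"


def build_insert(trade_date: str, list_key: str) -> str:
    """Return one direct-path INSERT aimed at a single subpartition."""
    target = subpartition_name(trade_date, list_key)
    return (
        f"INSERT /*+ APPEND */ INTO {TARGET_TABLE} SUBPARTITION ({target}) "
        f"SELECT * FROM {SOURCE_TABLE} "
        f"WHERE trade_date = DATE '{trade_date}' AND list_key = '{list_key}'"
    )


def build_daily_load(trade_date: str) -> list:
    """One statement per extract; each could be run in its own session."""
    return [build_insert(trade_date, key) for key in LIST_KEYS]


statements = build_daily_load("2006-01-03")
for sql in statements:
    print(sql)
```

Naming the subpartition explicitly in each statement is the design choice being illustrated: the five sessions then write to disjoint segments rather than competing for one daily partition.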
Partitioning in MDSP
The Permanent Database
Holds data back to 1995 for non-monthly data
Small amount, around 10%
Partitioning scheme
Daily or monthly depending on volume
Range and Range/List partitioning
Across 53 tables
12123 user partitions
1490 sub-partitions on 2 tables
Placed in monthly tablespaces
Tablespace set to read-only when all data for that month is loaded
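
The monthly read-only rule can be sketched as follows. The tablespace naming scheme (`MDSP_YYYY_MM`) is hypothetical, and "all data for that month is loaded" is approximated here by an assumption: a month counts as complete once the most recently loaded trading day falls in a later month. The real completeness check is not described in the slides.

```python
from datetime import date
from typing import Optional


def tablespace_name(year: int, month: int) -> str:
    """Hypothetical naming scheme for a monthly tablespace."""
    return f"MDSP_{year:04d}_{month:02d}"


def month_is_complete(year: int, month: int, last_loaded_day: date) -> bool:
    """Assumption: a month is complete once loading has moved past it."""
    return (last_loaded_day.year, last_loaded_day.month) > (year, month)


def read_only_statement(year: int, month: int,
                        last_loaded_day: date) -> Optional[str]:
    """Return the ALTER TABLESPACE statement, or None if the month is open."""
    if not month_is_complete(year, month, last_loaded_day):
        return None
    return f"ALTER TABLESPACE {tablespace_name(year, month)} READ ONLY"


print(read_only_statement(2006, 1, date(2006, 2, 1)))
print(read_only_statement(2006, 2, date(2006, 2, 1)))
```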
Compression Reduced Costs
Compression in MDSP enabled on
Tablespaces
Tables
Indexes
Scripts used direct-path to take advantage of compression
Inserts do take a little longer
Compression ratio 2:1 to 4:1
Query response time not impacted by compression
Database Implementation without ILM
[Diagram: data lifecycle stages Active, Less Active, Historical, Archive; storage shown: High Performance Storage Tier and Tape Archive]
Match Lifecycle to Storage to Optimise Cost
[Diagram: data lifecycle stages Active, Less Active, Historical, Offline Archive; storage shown: High Performance Storage Tier, Low Cost Storage Tier, Online Archive Storage Tier, Offline Archive]
Reduce Costs using 4 Tiers of Storage
Tier 1 (Enterprise Class)
Mission critical applications
Highest SLA 99.999%
Typically fibre channel
15k RPM versions
Use fewer, smaller drives to achieve the highest cache-to-disk throughput and prevent I/O degradation
Tier 2 Mid Range
Tier 2 (Mid-Range)
Best balance of price performance & reliability
SLA is 99.99%
Able to support a wide variety of hosts, applications and workloads
Fibre channel
Typically 146 GB and 300 GB 10k RPM drives
Tier 3 Online Archive
Tier 3 (Online Archive)
Best price by storage capacity
Does not have the reliability or performance of Tier 2
SLA 99.9%
Large SATA disks
Infrequent access
Used for
online data archive
disk backups
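
To put the SLA figures quoted for Tiers 1 to 3 side by side, the short calculation below converts each percentage into a yearly downtime allowance. It assumes the SLA is an availability percentage measured over a 365-day year, which the slides do not state explicitly.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # non-leap year

# SLA percentages as quoted on the Tier 1, Tier 2 and Tier 3 slides.
SLA_BY_TIER = {"Tier 1": 99.999, "Tier 2": 99.99, "Tier 3": 99.9}


def allowed_downtime_minutes(sla_percent: float) -> float:
    """Minutes per year a tier may be unavailable while still meeting its SLA."""
    return MINUTES_PER_YEAR * (100.0 - sla_percent) / 100.0


for tier, sla in SLA_BY_TIER.items():
    minutes = allowed_downtime_minutes(sla)
    print(f"{tier}: {sla}% allows about {minutes:.1f} minutes per year")
```

Under that assumption the allowance grows tenfold per tier: roughly 5.3 minutes a year for Tier 1, 52.6 minutes for Tier 2, and about 8.8 hours for Tier 3.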
Tier 4 Offline Archive
Tier 4 (Offline)
Archive to Tape
Used only for Backups
Storage Tier Costs
Storage Tier    Cost per GB
Tier 1          $72
Tier 2          $14
Tier 3          $7
Tier 4          $1.50
Unit cost charged back to the business unit
Costs for Tiers 2 & 3 reduced due to Oracle compression
How is Tiered Storage Used
MDSS (Staging Database)
uses only Tier 2
MDSP (Permanent Database)
Tier 2
Most Recent 3 months data (4 to 5 TB compressed)
Archive log
Tier 3
All read only data older than 3 months (20 TB)
Storage Cost Savings for MDSP
MDSP                              Tier 1 Only    Tier 2 & 3
Archive & 3 Months Data (5 TB)    $368,640       $71,680
Data over 3 months old (20 TB)    $1,474,560     $143,360
Total Cost                        $1,511,420     $215,040
Saving                                           $1,296,380
In the beginning, everything was on Tier 1
Costs for Tiers 2 & 3 reduced due to Oracle compression
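
The line items on this slide can be reproduced from the per-GB prices on the Storage Tier Costs slide, as the sketch below shows. It assumes 1 TB = 1024 GB, which is the conversion that matches the slide's figures; the variable names are illustrative.

```python
GB_PER_TB = 1024  # assumption: binary terabytes, which matches the slide

# Cost per GB from the Storage Tier Costs slide.
COST_PER_GB = {"Tier 1": 72, "Tier 2": 14, "Tier 3": 7}

RECENT_TB = 5   # archive logs plus the most recent 3 months
OLDER_TB = 20   # read-only data older than 3 months


def storage_cost(terabytes: int, tier: str) -> int:
    """Cost in dollars of holding the given volume on one tier."""
    return terabytes * GB_PER_TB * COST_PER_GB[tier]


tier1_recent = storage_cost(RECENT_TB, "Tier 1")
tier1_older = storage_cost(OLDER_TB, "Tier 1")
tiered_recent = storage_cost(RECENT_TB, "Tier 2")
tiered_older = storage_cost(OLDER_TB, "Tier 3")
tiered_total = tiered_recent + tiered_older

print(f"Tier 1 only: {tier1_recent:,} + {tier1_older:,}")
print(f"Tier 2 & 3:  {tiered_recent:,} + {tiered_older:,} = {tiered_total:,}")
```

The four line items and the $215,040 tiered total come out exactly as printed. The two Tier 1 line items add up to $1,843,200, which differs from the Tier 1 total of $1,511,420 shown on the slide; the stated saving of $1,296,380 is the slide's Tier 1 total minus $215,040, and the sketch does not attempt to reconcile the difference.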
Moving Data Across Storage Tiers
Keep 3 months' worth of data on Tier 2
Roll over a month at a time to Tier 3
Use UNIX volume manager
Mirror Tier 2 data to Tier 3
Usually completes overnight
Split mirror the next day
Return Tier 2 disks to the available disk pool
[Diagram: Tier 2 and Tier 3]
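
The three-month retention rule can be expressed as a small placement function. This is a sketch under an assumption: "3 months' worth" is read as the current month plus the two before it, so a month rolls over to Tier 3 once it is three or more calendar months old. The function and constant names are illustrative, and the mirror-and-split step itself is a volume manager operation that is not modelled here.

```python
from datetime import date

MONTHS_ON_TIER2 = 3  # from the slide: keep 3 months of data on Tier 2


def month_index(d: date) -> int:
    """Number of months since year 0, for simple month arithmetic."""
    return d.year * 12 + d.month - 1


def tier_for_month(data_month: date, today: date) -> str:
    """Return the tier on which a given month of data should reside."""
    age = month_index(today) - month_index(data_month)
    if age < 0:
        raise ValueError("data_month is in the future")
    return "Tier 2" if age < MONTHS_ON_TIER2 else "Tier 3"


today = date(2006, 6, 15)
for month in (date(2006, 6, 1), date(2006, 4, 1), date(2006, 3, 1)):
    print(month.strftime("%Y-%m"), tier_for_month(month, today))
```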
Historical Data
Originally all in MDSP
Single point of failure
Created 34 historical databases
MDSP has views to monthly tables held in historical databases
Contains data from 1995 through to 2005
Only started using Partitioning in 2002 for new systems
Before 2002 using daily & monthly tables
2000-2005 have quarterly databases
Around 7TB for historical data
Consolidating the Historical Data
Up to 2005, separate historical databases
In 2006, all data is kept in MDSP
Why the Change
Use Partitioning
Compression
Reduced Storage Costs due to Tiered Storage
Easier to Manage
Archiving
Investigated many solutions
Found them to be costly and complex
Steady decline in disk costs
No need to implement an archive solution
Shutdown any historical database older than 5 years
Bring data online upon request only
Controlling Access to Data
Roles determine which tables users have access to
Views limit the data that can be viewed
Every quarter, review databases for who has access
Auditing
Use a product called AMBEO
Captures the packets going between database and
end-users
Extracts users and tables being accessed
Stores the information in another Oracle database
Conclusion
Partitioning has been a key factor in enabling Nasdaq
to meet its SLAs
Able to make significant cost reductions, save resources, and reduce the number of databases using
Tiered Storage
Compression
Implemented all this without requiring any changes to the application