
Nasdaq 2

Date post: 10-Apr-2018
Upload: dumbrava-caius-florin

Transcript
  • 8/8/2019 Nasdaq 2


    NASDAQ: Partitioning at Work

    Cost-Efficient Data Management Through Database Consolidation and ILM

    Dr Lilian Hobbs - ILM Product Manager


    NASDAQ the Company

    Largest electronic screen-based equity securities market in the US

    More than 3,300 listed companies

    Technology, financial services, retail, communications, transportation, media, and biotech

    On average, the highest share trading volume in the US

    On a typical day, NASDAQ processes:

    22 million quotes

    9 million trades

    45 million orders


    NASDAQ Business Objective

    Preserve the market data and provide access in a timely manner

    Stores historical data back to 1995

    Transactional granularity required

    Adherence to SEC regulations and reporting

    Internal and external clients

    National Association of Securities Dealers (NASD) is the biggest 'external' consumer

    They regulate the Nasdaq

    Brokers/dealers

    Investors


    The Databases

    MDSS: Staging Database

    Not partitioned

    MDSP: Permanent Database

    Monthly & Non-Monthly Tables

    Partitioned

    A typical day loads 400 million rows

    Historical Databases


    Staging Database

    MDSS is the staging database

    Comprises daily schemas (Mon-Fri) of market data

    5 schemas, approx 170 tables, grown from 25 tables

    No indexes

    Loads data using direct-path, 4x faster than conventional loading

    As volume increased, updated during the day in batches

    Processes data after market shutdown at 8pm


    Interim Solution

    [Diagram labels: sources; SQL Server; MDSS (Oracle, staging); MDSP (Oracle Database 10g Release 2, permanent); external customers, e.g. NASD]

    Use publish/subscribe

    Feed to SQL Server

    Push data to Unix system

    Load into Oracle

    During day & market shutdown

    Copy to permanent database


    Copy Performance MDSS to MDSP

    [Diagram labels: MDSS (Oracle, staging); MDSP (Oracle Database 10g Release 2, permanent)]

    Quote and BBO tables in the Permanent Database

    Originally Daily Partitions

    New Solution in MDSP

    5 Daily Range/List partitions

    5 extracts using insert /*+ append */ to load simultaneously into the subpartitions

    Loading 70m records

    Save 2-3 hrs load time
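
The idea of loading several subpartitions at once can be sketched in Python. This is only an illustration under stated assumptions, not NASDAQ's scripts: the slides do not name the list key, so the `feed` values, the partition naming, and the `load_batch` stand-in (which only counts rows instead of running a direct-path insert) are all hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from datetime import date

# Hypothetical list-subpartition key values; the slides do not name the real key.
FEEDS = ("A", "B", "C", "D", "E")


def route(record):
    """Return the (range partition, list subpartition) names for one record."""
    day = record["trade_date"]   # range key: one partition per day
    feed = record["feed"]        # list key: one subpartition per value
    if feed not in FEEDS:
        raise ValueError(f"no subpartition for feed {feed!r}")
    return (f"P_{day:%Y%m%d}", f"SP_{feed}")


def load_batch(item):
    """Stand-in for one extract; a real load would insert the rows."""
    target, rows = item
    return target, len(rows)


def parallel_load(records):
    """Group records by target subpartition and load the groups concurrently."""
    batches = defaultdict(list)
    for rec in records:
        batches[route(rec)].append(rec)
    # One worker per subpartition, mirroring the five simultaneous extracts.
    with ThreadPoolExecutor(max_workers=len(FEEDS)) as pool:
        return dict(pool.map(load_batch, batches.items()))


records = [
    {"trade_date": date(2006, 3, 1), "feed": FEEDS[i % 5], "quote": i}
    for i in range(10)
]
loaded = parallel_load(records)
```

Because each extract targets its own subpartition, the loads do not compete for the same segment, which is the property the slide relies on for loading simultaneously.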


    Partitioning in MDSP

    The Permanent Database

    Holding data back to 1995 for non-monthly data

    Small amount, around 10%

    Partitioning scheme

    Daily or Monthly depending on volume

    Range and Range/List partitioning

    Across 53 tables

    12,123 user partitions

    1,490 sub-partitions on 2 tables

    Placed in monthly tablespaces

    Tablespace set to read-only when all data for that month is loaded


    Compression Reduced Costs

    Compression in MDSP enabled on

    Tablespaces

    Tables

    Indexes

    Scripts used direct-path to take advantage of compression

    Inserts do take a little longer

    Compression Ratio 2:1 to 4:1

    Query response time not impacted by compression


    Database Implementation without ILM

    [Diagram. Data lifecycle: Active, Less Active, Historical, Archive. Storage: High Performance Storage Tier; Tape Archive]


    Match Lifecycle to Storage to Optimise Cost

    [Diagram. Data lifecycle: Active, Less Active, Historical, Offline Archive. Storage: High Performance Storage Tier; Low Cost Storage Tier; Online Archive Storage Tier; Offline Archive]


    Reduce Costs using 4 Tiers of Storage

    Tier 1 (Enterprise Class)

    Mission critical applications

    Highest SLA 99.999%

    Typically fibre channel

    15k RPM versions

    Use fewer, smaller drives to achieve the highest cache-to-disk throughput and prevent I/O degradation


    Tier 2 Mid Range

    Tier 2 (Mid-Range)

    Best balance of price, performance & reliability

    SLA is 99.99%

    Capable of supporting a wide variety of hosts, applications and workloads

    Fibre channel

    Typically 146 GB and 300 GB 10k RPM drives


    Tier 3 Online Archive

    Tier 3 (Online Archive)

    Best price by storage capacity

    Does not have the reliability or performance of Tier 2

    SLA 99.9%

    Large SATA disks

    Infrequent access

    Used for

    online data archive

    disk backups
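
To put the SLA figures quoted for Tiers 1 to 3 side by side, the short Python sketch below converts each percentage into a yearly downtime allowance. It assumes the SLA percentages are availability targets measured over a 365-day year, which the slides do not state explicitly.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years

# SLA percentages as quoted on the tier slides.
SLA_PERCENT = {"Tier 1": 99.999, "Tier 2": 99.99, "Tier 3": 99.9}


def max_downtime_minutes(sla_percent):
    """Minutes per year a tier may be unavailable while still meeting its SLA."""
    return MINUTES_PER_YEAR * (100 - sla_percent) / 100


downtime = {tier: max_downtime_minutes(pct) for tier, pct in SLA_PERCENT.items()}

for tier, minutes in downtime.items():
    print(f"{tier}: about {minutes:.1f} minutes per year")
```

Under that assumption, each step down a tier allows roughly ten times more downtime: about 5 minutes a year for Tier 1, under an hour for Tier 2, and under nine hours for Tier 3.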


    Tier 4 Offline Archive


    Tier 4 (Offline)

    Archive to Tape

    Used only for Backups


    Storage Tier Costs

    Storage Tier    Cost per GB
    Tier 1          $72
    Tier 2          $14
    Tier 3          $7
    Tier 4          $1.5

    Unit cost charged back to the business unit

    Costs for Tiers 2 & 3 reduced due to Oracle compression


    How is Tiered Storage Used

    MDSS (Staging Database)

    Uses only Tier 2

    MDSP (Permanent Database)

    Tier 2: most recent 3 months of data (4 to 5 TB compressed) and archive log

    Tier 3: all read-only data older than 3 months (20 TB)
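
The placement rule for MDSP can be written as a small Python function. This is a sketch only: it assumes "most recent 3 months" means the current month plus the two before it, and the function names are illustrative rather than anything used at NASDAQ.

```python
from datetime import date


def months_between(older, newer):
    """Whole calendar months from `older` to `newer`, ignoring the day."""
    return (newer.year - older.year) * 12 + (newer.month - older.month)


def storage_tier(partition_month, current_month, keep_months=3):
    """Pick the storage tier for one monthly tablespace in MDSP.

    Assumption: the newest `keep_months` months stay on Tier 2 and
    everything older is read-only data on Tier 3.
    """
    age = months_between(partition_month, current_month)
    if age < 0:
        raise ValueError("partition month is in the future")
    return "Tier 2" if age < keep_months else "Tier 3"


today = date(2006, 5, 1)
placement = {
    month: storage_tier(date(2006, month, 1), today) for month in range(1, 6)
}
```

With `today` set to May 2006, the March to May tablespaces land on Tier 2 and January and February on Tier 3.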


    Storage Cost Savings for MDSP

    MDSP                              Tier 1 Only    Tier 2 & 3
    Archive & 3 Months Data (5 TB)    $368,640       $71,680
    Data over 3 months old (20 TB)    $1,474,560     $143,360
    Total Cost                        $1,511,420     $215,040
    Saving                                           $1,296,380

    In the beginning, everything was on Tier 1

    Costs for Tiers 2 & 3 reduced due to Oracle compression
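
The per-row figures on this slide can be reproduced from the per-GB tier prices. The Python sketch below assumes 1 TB = 1024 GB, which is the conversion that matches the slide's row values; the variable names are illustrative.

```python
GB_PER_TB = 1024  # assumption: binary terabytes, which matches the slide's rows

# Cost per GB from the Storage Tier Costs slide.
COST_PER_GB = {"Tier 1": 72, "Tier 2": 14, "Tier 3": 7}


def storage_cost(size_tb, tier):
    """Cost in dollars of holding `size_tb` terabytes on the given tier."""
    return size_tb * GB_PER_TB * COST_PER_GB[tier]


recent_tb, older_tb = 5, 20  # archive log + 3 months of data; data over 3 months old

tier1_only = {
    "recent": storage_cost(recent_tb, "Tier 1"),
    "older": storage_cost(older_tb, "Tier 1"),
}
tiered = {
    "recent": storage_cost(recent_tb, "Tier 2"),
    "older": storage_cost(older_tb, "Tier 3"),
}

tier1_total = sum(tier1_only.values())
tiered_total = sum(tiered.values())
```

The four row values and the Tier 2 & 3 total ($215,040) come out exactly as on the slide. The two Tier 1 rows sum to $1,843,200, which differs from the Total Cost of $1,511,420 printed on the slide; the slide does not explain the difference, so the stated saving is best read as the presenter's figure rather than something this arithmetic reproduces.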


    Moving Data Across Storage Tiers

    Keep 3 months' worth of data on Tier 2

    Roll over a month at a time to Tier 3

    Use UNIX volume manager

    Mirror Tier 2 data to Tier 3

    Usually completes overnight

    Split mirror the next day

    Return Tier 2 disks to the available disk pool


    Historical Data

    Originally all in MDSP

    Single point of Failure

    Created 34 Historical Databases

    MDSP has views to monthly tables held in historical databases

    Contains data from 1995 through 2005

    Only started using Partitioning in 2002 for new systems

    Before 2002, using daily & monthly tables

    2000-2005 have quarterly databases

    Around 7TB for historical data


    Consolidating the Historical Data

    Up to 2005: separate historical databases

    In 2006 all data is kept in MDSP

    Why the Change

    Use Partitioning and Compression

    Reduced Storage Costs due to Tiered Storage

    Easier to Manage


    Archiving

    Investigated many solutions

    Found them to be costly and complex

    Steady decline in disk costs

    No need to implement an archive solution

    Shut down any historical database older than 5 years

    Bring data online upon request only


    Controlling Access to Data

    Roles determine which tables users have access to

    Views limit the data that can be viewed

    Every quarter, review databases for who has access


    Auditing

    Use a product called AMBEO

    Captures the packets going between database and end-users

    Extracts users and tables being accessed

    Stores information in another Oracle database


    Conclusion

    Partitioning has been a key factor in enabling Nasdaq to meet its SLAs

    Been able to make significant cost reductions, save resources and reduce the number of databases using

    Tiered Storage

    Compression

    Implemented all this without requiring any changes to the application
