+ All Categories
Home > Documents > Oracle9i for data warehousing

Oracle9i for data warehousing

Date post: 27-Jan-2015
Category:
Upload: databaseguys
View: 117 times
Download: 0 times
Share this document with a friend
Description:
 
Popular Tags:
55
Transcript
Page 1: Oracle9i for data warehousing
Page 2: Oracle9i for data warehousing

Oracle9ifor Data Warehousing

John AbrahamsTechnology Sales ConsultantOracle Nederland

Page 3: Oracle9i for data warehousing

The Old Way:Fragmented Information Supply Chain

DataIntegration

Engine

OLAPEngine

MiningEngine

• Protracted implementation and maintenance cycle– Synchronization and currency issues– Information Management chaos

Data Warehouse

Engine

Page 4: Oracle9i for data warehousing

The New Way: Oracle9i

• Single business intelligence platform– Reduce administration and implementation costs– Faster deployment– Improved scalability and reliability

Data Warehousing

ETL

OLAP

Data Mining

Oracle9i

Page 5: Oracle9i for data warehousing

Oracle9iComplete, Therefore Simple

Page 6: Oracle9i for data warehousing

Oracle9i DatabaseSingle business-intelligence data server

Relational

OLAP

Data Mining

ETL

Metadata

Page 7: Oracle9i for data warehousing

Oracle9i Application ServerRuns All Your Business Intelligence Applications

Portal

Query & Reporting

BI Components

Web Site Analysis

Metadata

Page 8: Oracle9i for data warehousing

Oracle Database for Data Warehousing

Continuous InnovationOracle 7.3Oracle 7.3

Hash Join Bitmap Indexes Parallel-Aware Optimizer Partition Views Instance Affinity: Function

Shipping Parallel Union All Asynchronous Read-Ahead Histograms Anti-Join

Partitioned Tables and Indexes Partition Pruning Parallel Index Scans Parallel Insert, Update, Delete Parallel Bitmap Star Query Parallel ANALYZE Parallel Constraint Enabling Server Managed Backup/Recovery Point-in-Time Recovery

Oracle 8.0Oracle 8.0

Oracle8iOracle8i

Summary Management New Partitioning Schemes Resource Manager Progress Monitor Adaptive Parallel Query Server-based Analytic Functions Transportable Tablespaces Direct Loader API Functional Indexes Partition-wise Joins Security Enhancements

and more ...

Page 9: Oracle9i for data warehousing

New Oracle9i RDBMS featuresExtending Oracle’s leadership

• Automatic Memory Tuning • ETL Infrastructure

– Change data capture– External tables– Table functions– Upserts– Multi-table INSERTs– Resumable statements– Transportable tablespace

enhancements• List Partitioning• Internal enhancements for:

– parallel query– aggregation– cost-based optimization

• Bitmap Join Indexes • Analytic SQL fns

– Grouping sets– FIRST/LAST aggregates– Inverse distribution– Hypothetical rank

• Proactive query governing• Enhancements to MVs

– Broader refresh and rewrite capabilities

– More sophisticated summary advisor

• Full Outer Joins• WITH-clause

Page 10: Oracle9i for data warehousing

Real Application Clusters(RAC)

Exploiting clustered systems

Shared Cache Shared Cache ArchitectureArchitectureOne databaseOne database

Avoiding application downtime by single node failure

Allows applications to become

– Highly scaleable– Highly available

Page 11: Oracle9i for data warehousing

Cache Fusion means Scalability

Protocol that allows instances to combine their data caches into a shared global cache

– Global Cache Service (GCS) coordinates sharing Key features are

– Direct sharing of volatile buffer caches– Efficient inter-node messaging framework– Fast recovery from node failures using cache and

CPU resources from all surviving nodes

DataDataA-ZA-Z

Shared Cache

Page 12: Oracle9i for data warehousing

Manage Large Volumes of Data

Page 13: Oracle9i for data warehousing

Managing Large Volumes of Data

• Partitioning and parallelism are crucial for VLDB

• Parallelism for all operations– DBA operations: loading, index-creation, table-creation,

data-modification, backup and recovery– End-user operations: Queries– Unbounded scalability: Real Application Clusters

• Partitioning provides ‘incremental’ operations for:– Data loading– Indexing– Referential Integrity– Backup and recovery

Page 14: Oracle9i for data warehousing

How Parallel Execution Works

• With serial execution only one process is used

• With parallel execution

– One parllel execution coordinator process

– Many parallel execution servers

– Table is dynamically partitioned into granules

Page 15: Oracle9i for data warehousing

VLDB Manageability and Performance Constraints

• Table availability:

– Large tables are more vulnerable to disk failure

– It is too costly to have a large table inaccessible for hours due to recovery

• Large table manageability

– They take to long to be loaded

– Indexes take too long to be built

– Partial deletes take hours

• Performance considerations

– Large table and index scans are costly

– Scanning a subset improves performance

Page 16: Oracle9i for data warehousing

Benefits of Partioning

• Availability

– Partions can be independently managed

– Backup and restore operations can be done on individual partitions

– Partitions that are unavailable do not affect queries on DML operations on other paritions that use the same table or index

• Manageability

– A partition can be moved from one tablespace to another

– A partition can be dropped, truncated, added

– A partition can be divided at user-defined value

Page 17: Oracle9i for data warehousing

Benefits of Partioning (2)

Performance

– The optimizer eliminates partitions that not have to be scanned

– Partitions can be scanned in parallel

– Partitions can be load-balanced across physical devices

– Join operations can be optimized to “join by the partition”

Page 18: Oracle9i for data warehousing

Partitioning MethodsPartitioning Methods

• Range Partitioning

• Hash partitioning

• List Partioning

• Composite Partioning

• Range Partitioning

• Hash partitioning

• List Partioning

• Composite Partioning

Rangepartitioning

Hashpartitioning

Composite partitioning

List Partioning

Page 19: Oracle9i for data warehousing

Partitioned Indexes

• Indexes can be partitioned like tables

• Partitioned or nonpartioned indexes can be used with partitioned or nonpartitioned tables

• Partioned indexes can be

– Global or local

– Prefixed or nonprefixed

Page 20: Oracle9i for data warehousing

... 01-Mar01-Feb96-May96-Apr

Partitioned Tables with Local Indexes:

96-Jun

Rolling Window Operations

Page 21: Oracle9i for data warehousing

... 01-Mar01-Feb96-May96-Apr

Partitioned Tables with Local Indexes:1. Load and index new month

96-Jun 01-Apr

Rolling Window Operations

Page 22: Oracle9i for data warehousing

... 01-Mar01-Feb96-May96-Apr

Partitioned Tables with Local Indexes:1. Load and index new month2. Add new month to table

96-Jun 01-Apr

Rolling Window Operations

Page 23: Oracle9i for data warehousing

... 01-Mar01-Feb96-May96-Apr

Partitioned Tables with Local Indexes:1. Load and index new month2. Add new month to table3. Remove old month from table

96-Jun 01-Apr

Rolling Window Operations

Page 24: Oracle9i for data warehousing

... 01-Mar01-Feb96-May

Partitioned Tables with Local Indexes:•New data has been loaded with virtually no disruption•Powerful methodology for managing time-based updates to the Warehouse

96-Jun 01-Apr

Rolling Window Operations

Page 25: Oracle9i for data warehousing

1200 GMT

List Partitioning

Same benefits as rolling window: data is partitioned according to business requirements

Maintenance

Online Queries

EuropeRegionEuropeRegion

AmericasRegion

AmericasRegion

AsiaRegion

AsiaRegion

Page 26: Oracle9i for data warehousing

1200 GMT2000 GMT

List Partitioning

Same benefits as rolling window: data is partitioned according to business requirements

Maintenance

Online Queries

EuropeRegionEuropeRegion

AmericasRegion

AmericasRegion

AsiaRegion

AsiaRegion

Page 27: Oracle9i for data warehousing

1200 GMT2000 GMT0400 GMT

List Partitioning

Same benefits as rolling window: data is partitioned according to business requirements

EuropeRegionEuropeRegion

AmericasRegion

AmericasRegion

AsiaRegion

AsiaRegion

Maintenance

Online Queries

Page 28: Oracle9i for data warehousing

Table Compression: What is it?

• Tables can be compressed

– Compression can also be specified at the partition level

– Indexes and index-organized tables are not compressed

• Typical compression ratios are 3:1 - 5:1

– Compression is dependent upon the actual data

– Compression algorithm based on removing data redundancy

• All DDL/DML commands are supported on compressed tables

Page 29: Oracle9i for data warehousing

Table Compression:What isn’t it?

• This is not a generic ‘zip’-style compression

– Not all tables will have good compression

– Compression algorithm guarantees that compression will never increase size of table

– Most large DW tables seem to compress well

• Compression happens between column/row values, not within column/row values

– Long character strings are not compressed unless the exact same string appear multiple times

• LOB/BLOB columns are not compressed

Page 30: Oracle9i for data warehousing

Table Compression:How it works

<rowid> ‘650-506-7000’ ‘650-123-4567’

<rowid> ‘650-506-7000’ ‘650-506-7001’

<rowid> ‘650-506-7000’ ‘650-456-7890’

<rowid> ‘650-506-7000’ ‘650-098-7654’

<rowid> ‘650-506-7000’ ‘650-123-4567’

<rowid> ‘650-506-7001’ ‘650-123-4567’

<rowid> ‘650-506-7001’ ‘650-123-4567’

<symbol table: <A>= ‘650-506-7000’, <B>=‘650-506-7001’, <C>=‘650-123-4567’>

<rowid> <A> ‘650-123-4567’<rowid> <A> <B><rowid> <A> ‘650-456-7890’<rowid> <A> ‘650-098-7654’<rowid> <A> <C><rowid> <B> <C><rowid> <B> <C>…

Duplicate values are stored in symbol table for each block

Uncompressed Compressed

Page 31: Oracle9i for data warehousing

Table Compression:Usage

• Creating compressed tables:

– CREATE TABLE T1(id integer) COMPRESS;

• Converting tables to compressed tables:

– ALTER TABLE T3 MOVE COMPRESS;

• Creating compressed tablespaces:

– CREATE TABLESPACE tabspace_2

DATAFILE 'diska:tabspace_file2.dat' SIZE 20M

DEFAULT COMPRESS STORAGE ( … );

Page 32: Oracle9i for data warehousing

Table Compression: Performance Impact

• Queries on compressed tables may observe minor performance degradation

– Performance impact depends upon the query

– Many queries will be faster

– Compression reduces IO but increases CPU utilization

– For a set of heterogeneous queries, performance should degrade by no more than 5%

• Load and direct-path INSERT performance will be slower

– Data must be compressed as it is added to the table

Page 33: Oracle9i for data warehousing

Table Compression: When to Use it

• Data warehouses containing large volumes of historical data

– Compress all of the older data in a data warehouse

– Integrate compression into the ‘rolling window’ paradigm

– For example, most recent 3 months of data could be stored uncompressed and the previous 21 months could be stored compressed

• Materialized views and other derived data sets

• Generally, compression should be applied to data that is infrequently updated

Page 34: Oracle9i for data warehousing

Manage large numbers of concurrent users

Page 35: Oracle9i for data warehousing

Manage large numbers of users

• Key requirements:

– Guarantee optimal resource utilization all the time

– Provide the appropriate amount of resources to every job or query based on priority and system load

– Pro-actively prevent ‘runaway’ queries

– Pro-actively prevent system overloading

• Managing large numbers of users should be simple and automated

Page 36: Oracle9i for data warehousing

Appropriate Resources to Each Query

• CPU

– Business-critical processes receive more CPU

– Database Resource Manager allows DBA to assign CPU resources to groups of users

• Memory

– Oracle9i dynamically allocates runtime memory based on current available memory and each query’s requirements

• Parallelism

– Degree of parallelism is dynamically chosen based on available resources and each query’s requirements

Page 37: Oracle9i for data warehousing

Automatic Runtime Memory Tuning

• One parameter:PGA_AGGREGATE_SIZE = <size>

• Dynamic allocation of ‘runtime’ memory based upon each query’s requirements

– In data-warehouse environments, >50% of a server’s physical memory is typically used for query ‘runtime’ memory

• Benefits:

– Reduced overall memory usage

– Improved throughput

– Simplified tuning

Page 38: Oracle9i for data warehousing

Pro-active management of DW Workloads

• Predictive Query Governing and Dynamic Re-prioritization:

– Queries which are estimated to take longer than an DBA-specified limit will abort or be ‘de-prioritized’

• Automatic Queuing:

– A limit can be set on the number of active session for each group of users; queries submitted which exceed this limit will be queued

• Via Database Resource Manager

Page 39: Oracle9i for data warehousing

Example Scenario

• Power Users– Up to 70% of the CPU resources– Any degree of parallelism– Any query which is expected to take over one hour will

be migrated to background

• Report Users– Up to 20% of the CPU resources– No parallelism– Limit of 40 concurrent queries– Any query which is expected to take over 20 minutes will

be aborted

• Background Jobs– Up to 10% of the CPU resources– Any degree of parallelism– Limit of 5 concurrent queries

Page 40: Oracle9i for data warehousing

Fast Query Performance

• The best approach for every query– Integrated– Comprehensive

Materialized ViewsMaterialized Views

Access & Join MethodsAccess & Join Methods

Parallel OperationsParallel Operations

PartitioningPartitioning

QueryOptimizer

Page 41: Oracle9i for data warehousing

Bitmap Indexes

• The most common index type in Oracle DW environments

– Bitmap indexes introduced in Oracle 7.3

– Bitmap join indexes introduced in Oracle9i

– Oracle has over a dozen patents for bitmap index technology

• Oracle provides patented compression technique for bitmap indexes

– Bitmap indexes are 3-20x smaller than b-tree indexes

– Less storage yields better query performance and more indexed columns

Page 42: Oracle9i for data warehousing

<Blue, <rowid>, 1000100100010010100>

<Green, <rowid>, 0001010000100100000>

<Red, <rowid>, 0100000011000001001>

<Yellow, <rowid>, 0010001000001000010>

Structure of a bitmap indexStructure of a bitmap index

Separate ‘bitmap’ created for each Separate ‘bitmap’ created for each value of the color columnvalue of the color column

A high-level b-tree structure is A high-level b-tree structure is created so that each bitmap can created so that each bitmap can be locatedbe located

CREATE BITMAP INDEX PROD_COLOR ON PROD(COLOR)CREATE BITMAP INDEX PROD_COLOR ON PROD(COLOR)

Bitmap indexes introductionBitmap indexes introduction

Page 43: Oracle9i for data warehousing

• Columns with Low-to-Medium Cardinality

• ‘Set-based’ manipulation of data

• Especially good for large, complex queries– Orders of magnitude performance improvement

• Fully integrated within Oracle9i– Created and managed similar to other Oracle9i indexes

– Used to accelerate single-table access, joins, and aggregation

– Transparently selected by the query optimizer

• Columns with Low-to-Medium Cardinality

• ‘Set-based’ manipulation of data

• Especially good for large, complex queries– Orders of magnitude performance improvement

• Fully integrated within Oracle9i– Created and managed similar to other Oracle9i indexes

– Used to accelerate single-table access, joins, and aggregation

– Transparently selected by the query optimizer

Bitmap indexes characteristicsBitmap indexes characteristics

Page 44: Oracle9i for data warehousing

CREATE BITMAP INDEX cust_sales_bji ON Sales(Customer.state) FROM Sales, CustomerWHERE Sales.cust_id = Customer.cust_id;

Bitmap join indexes

Sales Customer

Page 45: Oracle9i for data warehousing

CREATE BITMAP INDEX cust_sales_bji ON Sales(Customer.state) FROM Sales, CustomerWHERE Sales.cust_id = Customer.cust_id;

Bitmap join indexes

Sales Customer

Index key is Customer.State

Sales(Customer.state)

Page 46: Oracle9i for data warehousing

CREATE BITMAP INDEX cust_sales_bji ON Sales(Customer.state) FROM Sales, CustomerWHERE Sales.cust_id = Customer.cust_id;

Bitmap join indexes

Sales Customer

Index key is Customer.State

Sales(Customer.state)Sales(Customer.state)

Indexed table is Sales

SELECT SUM(SALES.DOLLAR_AMOUNT FROM Sales, CustomerWHERE Sales.cust_id = Customer.cust_idAND CUSTOMER.STATE = ‘California’;

SELECT SUM(SALES.DOLLAR_AMOUNT FROM Sales, CustomerWHERE Sales.cust_id = Customer.cust_idAND CUSTOMER.STATE = ‘California’;

Page 47: Oracle9i for data warehousing

Materialized ViewsMaterialized Views

• Currently, indexes provide fast path access to specific data

• Materialized views work on the same principle

• A Materialized view is an instantiation of a SQL statement - a view with data storage

• Materialized views can be partitioned, indexed separately

• Used for query rewrite to increase performance

• Rewrites are transparent to applications

• Rewrites do not require any special privileges

• Currently, indexes provide fast path access to specific data

• Materialized views work on the same principle

• A Materialized view is an instantiation of a SQL statement - a view with data storage

• Materialized views can be partitioned, indexed separately

• Used for query rewrite to increase performance

• Rewrites are transparent to applications

• Rewrites do not require any special privileges

CREATE MATERIALIZED VIEW sf_sales

AS SELECT * FROM sales

WHERE city_name = ’SAN FRANCISCO’

CREATE MATERIALIZED VIEW sf_sales

AS SELECT * FROM sales

WHERE city_name = ’SAN FRANCISCO’

SF_Sales

SELECT prod_code FROM sales

WHERE city_name = ’SAN FRANCISCO’

SELECT prod_code FROM sales

WHERE city_name = ’SAN FRANCISCO’

SELECT prod_code FROM sf_salesSELECT prod_code FROM sf_sales

Sales

CREATECREATE

Page 48: Oracle9i for data warehousing

SQL support for analytic calculationsSQL support for analytic calculations

• Why enhance the RDBMS for analytic calculations?

• Benefits

– Performance

– Scalability

– Simpler SQL development

• Why enhance the RDBMS for analytic calculations?

• Benefits

– Performance

– Scalability

– Simpler SQL development

Page 49: Oracle9i for data warehousing

Analytic Functions: ExamplesAnalytic Functions: Examples

• Rank– Top 10 sales-reps in each region

• Moving Window– Today’s stock price minus 200-day moving average

• Period-over-period comparisons– Percentage growth of Jan-99 sales over Jan-98

• Ratio-to-report– January’s sales as a percentage of the entire year’s

• Rank– Top 10 sales-reps in each region

• Moving Window– Today’s stock price minus 200-day moving average

• Period-over-period comparisons– Percentage growth of Jan-99 sales over Jan-98

• Ratio-to-report– January’s sales as a percentage of the entire year’s

Page 50: Oracle9i for data warehousing

Data Warehousing

ETL

OLAP

Data Mining

Oracle9Oracle9ii

Platform for Business Intelligence:Data Warehousing

Foundation of the Business Intelligence PlatformMore dataMore usersFasterSimple managementOracle9i introduces dozens of new DW features

Page 51: Oracle9i for data warehousing

Data Warehousing

ETL

OLAP

Data Mining

Oracle9iOracle9i

Platform for Business Intelligence:ETL

Transformation EngineIntegrated in Oracle9iScalable (parallel)Extensible (Java, PL/SQL)Efficient (no data staging)

Warehouse BuilderExtensible framework for designing and deploying DW’s

Page 52: Oracle9i for data warehousing

Data Warehousing

ETL

OLAP

Data Mining

Oracle9Oracle9ii

Platform for Business Intelligence:OLAP

OLAP Services

Analysis-ready Oracle database

Support for complex, multidimensional queries

Highly scalable

Development platform for Internet-ready analytical applications

Java OLAP API

Business Intelligence Beans and JDeveloper

Page 53: Oracle9i for data warehousing

Oracle9i OLAP Services

Forecasts Models AllocationsConsolidations Scenarios Custom Functions

Oracle Relational Database Oracle Relational Database

OLAP ServicesOLAP Services

MetadataMetadata

Java OLAP APIJava OLAP API

Business Intelligence BeansBusiness Intelligence Beans

Query Query ProcessorProcessor

DataData

Metadata Metadata ProviderProvider

Analytic Workspace Analytic Workspace

SQL SQL GeneratorGenerator

DataData

MultidimensionalMultidimensional

EngineEngine

MetadataMetadata

Metadata Metadata ProviderProvider

Data Warehouse - Query and Reporting

Page 54: Oracle9i for data warehousing

Data Warehousing

ETL

OLAP

Data Mining

Oracle9iOracle9i

Platform for Business Intelligence:Data Mining

Data Mining SuiteScalable, integrated data-

mining capabilities

Oracle PersonalizationReal-time personalization engine for building 1:1 relationships over the web

Page 55: Oracle9i for data warehousing

Recommended