Date post: | 14-Dec-2015 |
Upload: | reed-mobbs |
Agenda
• Technology
• Monitoring
• Information Life-cycle Management (ILM)
• Oracle Optimized Warehouse Initiative
• Market
Parallel Execution
select c.cust_last_name, sum(s.amount_sold)
from customers c, sales s
where c.cust_id = s.cust_id
group by c.cust_last_name ;
[Diagram: Data on Disk is processed by Parallel Servers. Scanners (scan) feed Joiners (join), Joiners feed Aggregators (aggregate), and a Coordinator oversees the query.]
Partitioning – Benefits
• Large Table: Difficult to Manage
• Partition: Divide and Conquer
  – Easier to Manage
  – Improve Performance
• Composite Partition: Better Performance
  – More flexibility to match business needs
• Transparent to applications
[Diagram: ORDERS partitioned by month (JAN, FEB), then composite-partitioned by month and region (USA, EUROPE)]
Partitioning in Oracle Database 11g: Interval Partitioning
• Partitions are created automatically as data arrives
[Diagram: ORDERS and INVENTORY tables gaining monthly partitions (JAN, FEB, MAR, APR)]
Partitioning in Oracle Database 11g: Complete Composite Partitioning
• Range – range
• List – list
• List – hash
• List – range
[Diagrams of the ORDERS table:]
RANGE-RANGE: Order Date (JAN, FEB) by Order Value (1000-5000, >5000)
LIST-RANGE: Region (USA, EUROPE) by Order Value (1000-5000, >5000)
LIST-LIST: Region (USA, EUROPE) by Customer Type (Gold, Silver)
Partitioning in Oracle Database 11g: Reference Partitioning
• Inherit partitioning strategy
[Diagram: Partition ORDERS by Date (JAN, FEB, MAR, APR); the related tables Line Items, Pick Lists, Stock Holds, and Back Orders follow the same partitioning]
Partitioning in Oracle Database 11g: Virtual Column-Based Partitioning
ORDERS
ORDER_ID    ORDER_DATE   CUSTOMER_ID  ...  REGION AS (SUBSTR(ORDER_ID,6,2))
----------  -----------  -----------  ---  ------
9834-US-14  12-JAN-2007        65920       US
8300-EU-97  14-FEB-2007        39654       EU
3886-EU-02  16-JAN-2007         4529       EU
2566-US-94  19-JAN-2007        15327       US
3699-US-63  02-FEB-2007        18733       US
• REGION requires no storage
• Partition by ORDER_DATE, REGION
[Diagram: ORDERS partitioned by month (JAN, FEB) and region (USA, EUROPE)]
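The REGION expression can be checked against the sample rows. The sketch below is only an illustration in Python, not Oracle code: it mirrors SUBSTR(ORDER_ID,6,2), which counts from 1, with a 0-based slice, and then groups the rows by (month, region) the way the slide's partitioning key would. The helper name region_of and the tuple layout are assumptions made for this example.

```python
from collections import defaultdict

# Sample ORDERS rows from the slide: (order_id, order_date).
orders = [
    ("9834-US-14", "12-JAN-2007"),
    ("8300-EU-97", "14-FEB-2007"),
    ("3886-EU-02", "16-JAN-2007"),
    ("2566-US-94", "19-JAN-2007"),
    ("3699-US-63", "02-FEB-2007"),
]


def region_of(order_id):
    """Hypothetical helper mirroring SUBSTR(ORDER_ID, 6, 2).

    SQL positions are 1-based, so characters 6 and 7 are slice [5:7].
    """
    return order_id[5:7]


# The virtual column is computed on demand; nothing is stored per row.
regions = [region_of(order_id) for order_id, _ in orders]

# Group order ids by (month, region), the composite key on the slide.
partitions = defaultdict(list)
for order_id, order_date in orders:
    month = order_date.split("-")[1]
    partitions[(month, region_of(order_id))].append(order_id)

print(regions)           # ['US', 'EU', 'EU', 'US', 'US']
print(dict(partitions))
```

The computed regions match the REGION column shown on the slide.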
Compression
• Tables and indexes can be compressed
• Can be specified on a per-partition basis
• Typical compression ratio 3:1
• Requires more CPU to load data
• Decompression hardly costs resources
• Compress for all DML operations
• Less data on disk
• Requires less time to read
• Completely transparent
Up to 3X compression
SQL Query Result Cache
• Store query results in cache
• Repetitive executions can use cached result
• Data Warehouse queries
  – Long-running, IO-intensive
  – Expensive computations
  – Return few rows
  – Excellent opportunity for SQL Query Result Cache
------------------------------------------------------------
| Id | Operation            | Name                       |
------------------------------------------------------------
|  0 | SELECT STATEMENT     |                            |
|  1 | RESULT CACHE         | fz6cm4jbpcwh48wcyk60m7qypu |
|  2 | SORT GROUP BY ROLLUP |                            |
|* 3 | HASH JOIN            |                            |
etc.
SQL Query Result Cache: Opportunity
• Retail customer data (~50 GB)
• Concurrent users submitting queries randomly
• Executive dashboard with 12 heavy analytical queries
• Cache results only at in-line view level
• 12 queries run in random, different order – 4 queries cached
• Measure average, total response time for all users
# Users   No cache   Cache   Improvement
-------   --------   -----   -----------
      8      447 s   334 s           25%
      4      267 s   201 s           25%
      2      186 s   141 s           24%
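The improvement percentages follow directly from the two timing columns. As a quick check, the Python sketch below recomputes them; it assumes the timings pair with the user counts in the order they appear on the slide (8, 4, 2), and the function name improvement_percent is made up for this example.

```python
# Response times in seconds from the slide, keyed by number of users.
# Assumption: rows pair up in slide order (8, 4, 2 users).
no_cache = {8: 447, 4: 267, 2: 186}
cache = {8: 334, 4: 201, 2: 141}


def improvement_percent(before, after):
    """Relative time saved, rounded to a whole percent."""
    return round(100 * (before - after) / before)


improvements = {
    users: improvement_percent(no_cache[users], cache[users])
    for users in no_cache
}

print(improvements)  # {8: 25, 4: 25, 2: 24}
```

With only 4 of the 12 queries cached, the saving stays close to a quarter of the elapsed time at every concurrency level shown.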
Other Performance Features: Transparent to Your Application
• Materialized Views
  – Transparent rewrites of expensive queries
  – Including rewrites on remote objects
  – Incremental automatic refresh
• Bitmap Indexes
  – Optimal storage
  – Ideal for star or star look-a-like schemas
• SQL Access Advisor – based on workload
  – Materialized view advice
  – Index advice
  – Partition advice
Bring Algorithms to the Data, Not Data to the Algorithms
• Analytic computations done in the database
  – SQL Analytics
  – OLAP
  – Data Mining
  – Statistics
• Scalability
• Security
• Backup & Recovery
• Simplicity
Native Support for Pivot and Unpivot
QUARTERLY_SALES
SALESREP      Q1    Q2    Q3    Q4
----------  ----- ----- ----- -----
       100    230   240   260   300
       101    200   220   250   260
       102    260   280   265   310

select * from quarterly_sales
unpivot include nulls
  (revenue for quarter in (q1,q2,q3,q4))
order by salesrep, quarter ;

SALESREP    QU    REVENUE
----------  --  ----------
       100  Q1         230
       100  Q2         240
       100  Q3         260
       100  Q4         300
       101  Q1         200
       101  Q2         220
       101  Q3         250
       101  Q4         260
       102  Q1         260
       102  Q2         280
       102  Q3         265
       102  Q4         310
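To make the row-per-column transformation concrete, here is a small Python sketch of what the unpivot query does to the QUARTERLY_SALES sample data. It is an illustration of the semantics only, not how Oracle implements UNPIVOT; the function name unpivot and the dict-based row layout are assumptions for this example.

```python
# QUARTERLY_SALES rows from the slide, one dict per sales rep.
quarterly_sales = [
    {"salesrep": 100, "q1": 230, "q2": 240, "q3": 260, "q4": 300},
    {"salesrep": 101, "q1": 200, "q2": 220, "q3": 250, "q4": 260},
    {"salesrep": 102, "q1": 260, "q2": 280, "q3": 265, "q4": 310},
]


def unpivot(rows, value_columns):
    """Turn each listed column into its own (salesrep, quarter, revenue) row.

    Like INCLUDE NULLS, a missing value (None) still produces a row.
    """
    out = []
    for row in rows:
        for col in value_columns:
            out.append((row["salesrep"], col.upper(), row[col]))
    # ORDER BY salesrep, quarter
    return sorted(out, key=lambda r: (r[0], r[1]))


unpivoted = unpivot(quarterly_sales, ["q1", "q2", "q3", "q4"])

for salesrep, quarter, revenue in unpivoted:
    print(salesrep, quarter, revenue)
```

Three input rows with four quarter columns become twelve output rows, matching the result table on the slide.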
Native Support for Pivot and Unpivot
SALES_BY_QUARTER
SALESREP    QU    REVENUE
----------  --  ----------
       100  Q1         230
       100  Q2         240
       100  Q3         160
       100  Q4          90
       100  Q3         100
       100  Q4         140
       100  Q4          70
       101  Q1         200
       101  Q2         220
       101  Q3         250
       101  Q4         260
       102  Q1         260

select * from sales_by_quarter
pivot (sum(revenue)
  for quarter in ('Q1','Q2','Q3','Q4'))
order by salesrep ;

SALESREP     'Q1'  'Q2'  'Q3'  'Q4'
----------  ----- ----- ----- -----
       100    230   240   260   300
       101    200   220   250   260
       102    260   280   265   310
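The aggregation step is easy to miss: sales rep 100 has several rows for Q3 and Q4, and the pivot sums them. The Python sketch below reproduces that for the rows of sales rep 100 as shown on the slide. It illustrates the semantics only; the function name pivot_sum and the tuple layout are assumptions for this example.

```python
# SALES_BY_QUARTER rows for sales rep 100, as shown on the slide:
# (salesrep, quarter, revenue). Q3 and Q4 appear more than once.
sales_by_quarter = [
    (100, "Q1", 230),
    (100, "Q2", 240),
    (100, "Q3", 160),
    (100, "Q4", 90),
    (100, "Q3", 100),
    (100, "Q4", 140),
    (100, "Q4", 70),
]


def pivot_sum(rows, quarters=("Q1", "Q2", "Q3", "Q4")):
    """Sum revenue per salesrep into one column per listed quarter.

    A quarter with no rows stays None, similar to a SQL NULL.
    """
    result = {}
    for salesrep, quarter, revenue in rows:
        totals = result.setdefault(salesrep, dict.fromkeys(quarters))
        if quarter in totals and revenue is not None:
            totals[quarter] = (totals[quarter] or 0) + revenue
    return result


pivoted = pivot_sum(sales_by_quarter)
print(pivoted[100])  # {'Q1': 230, 'Q2': 240, 'Q3': 260, 'Q4': 300}
```

The totals agree with the first row of the pivoted result: 160 + 100 = 260 for Q3 and 90 + 140 + 70 = 300 for Q4.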
Transform Data Where Data Resides: In-database ETL technology
Stages: Extract, Load, Transform, Insert
• Change Data Capture
• External Tables
• SQL*Loader
• Data Pump
• Transportable Tablespaces
• Multi-Table Insert
• MERGE
• Distributed Queries
• Table Functions
• Partition Exchange Loading
• DML error logging
Asynchronous Change Data Capture
• Capture changes from [redo | archive] logs
• No changes to source applications
• Minimal performance impact on source applications
• Store changes in change tables
• Provide (bulk) SQL interface to change data
[Diagram: OLTP DB, Log files, Log Miner and Streams (Capture), Change Data, Transform (SQL, PL/SQL, Java), DW Tables; also labelled: Read-consistent subscription, PMOPs, Time-based subscription windows]
Oracle Database 11g
Automatic Storage Management
• Storage pool for database files
• Load-balanced across disks
• Capacity on demand
• Add/remove storage online
• Automatic IO load balancing
• Fault tolerant, high performance
• Automatically mirrors and stripes
• Low cost
• No IO tuning required
• No volume manager or file system needed
Mixed Workloads
• Concurrent small data loads and queries
• Looks like... OLTP
• Oracle's read consistency
  – Readers never block writers
  – Writers never block readers
  – Queries are always consistent and auditable
  – No deadlocks
• Introduced in Oracle V4 (1982)
  – major improvements in V6 (1988)
[Diagram: a report reads the Budget table while updates run; the Rollback Segment supplies the Before Image, giving an accurate report]
Database Resource Manager
• Protect the system proactively
• Maximum number of concurrent operations
• Priority-dependent maximum Degree Of Parallelism (DOP)
High Priority: Sales Analysis, 20 users (DOP 10)
Medium Priority: Ad Hoc Reports, 200 users (DOP 4)
Low Priority: ETL Jobs, 200 users (DOP 4)
Oracle Database Security
• Authenticate
• Protect data in transit
• Authorize
• Access Control
• Protect stored data
• Audit
• Identity Management
[Diagram: Marketing, Finance, and Sales users accessing the database]
Feature Usage for Large-Scale Data Warehouses
[Bar chart, axis 0% to 100%: usage of DB Res Mgr, RMAN, ASM, Read Only, VPD, MV Use, Compression, Parallel Exec, Partitioning]
Source: TB Club Report: A survey of 30 multi-TB Oracle DW’s – data July 2006
Partitioning, parallelism, and compression are the foundation for large-scale data warehousing
Information Lifecycle Management
“The policies, processes, practices, and tools used to align the business value of information with the most appropriate and cost effective IT infrastructure from the time information is conceived through its final disposition.”
Storage Networking Industry Association (SNIA) Data Management Forum
[Diagram: Active Data, Less Active Data, Historical Data]
Information Lifecycle Management
[Diagram: Orders table with partitions Q1 Orders, Q2 Orders, Q3 Orders, Q4 Orders, Older Orders]
• Active: High Performance Storage Tier
• Less Active: Low Cost Storage Tier
• Historical: Online Archive Storage Tier
Traditional Storage Approach: All data resides on a single storage tier
• High Performance Storage Tier = $72 per GB
• All data on active = $972,000!
Partitioning is the Foundation for ILM: Partition data onto appropriate storage tier
• Active: High Performance Storage Tier = $72 per GB
• Less Active: Low Cost Storage Tier = $14 per GB
• Historical: Read Only Storage Tier = $7 per GB
Partitioning is the Foundation for ILM: Move data onto appropriate storage tier
• 5% Active: High Performance Storage Tier = $72 per GB
• 35% Less Active: Low Cost Storage Tier = $14 per GB
• 60% Historical: Read Only Storage Tier = $7 per GB
Partitioning is the Foundation for ILM: Reduce storage costs accordingly
Tier                       Share   Price per GB   Cost
-------------------------  -----   ------------   -------
Active (High Performance)     5%            $72   $49,800
Less Active (Low Cost)       35%            $14   $67,700
Historical (Read Only)       60%             $7   $58,000
Introduce Compression: Reduce storage costs across all tiers
Let's use a compression factor of 3
Tier          Share   Cost      Cost with compression
-----------   -----   -------   ---------------------
Active           5%   $49,800   $16,600
Less Active     35%   $67,700   $22,600
Historical      60%   $58,000   $19,400
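The cost figures can be reproduced, approximately, from the prices and percentages on the slides. The Python sketch below is a worked example under one stated assumption: the total data volume is taken as $972,000 / $72 = 13,500 GB, which the slides imply but never state. The function name tiered_costs is made up for this example.

```python
# Prices (USD per GB) and data shares (percent) from the slides.
PRICE_PER_GB = {"active": 72, "less_active": 14, "historical": 7}
SHARE_PERCENT = {"active": 5, "less_active": 35, "historical": 60}

# Assumption: total volume inferred from the single-tier cost on the slide.
SINGLE_TIER_COST = 972_000
total_gb = SINGLE_TIER_COST // PRICE_PER_GB["active"]  # 13500


def tiered_costs(total_gb, compression_factor=1):
    """Cost per tier after splitting the data and applying compression."""
    costs = {}
    for tier, pct in SHARE_PERCENT.items():
        stored_gb = total_gb * pct / 100 / compression_factor
        costs[tier] = round(stored_gb * PRICE_PER_GB[tier])
    return costs


uncompressed = tiered_costs(total_gb)
compressed = tiered_costs(total_gb, compression_factor=3)

print(uncompressed, sum(uncompressed.values()))
print(compressed, sum(compressed.values()))
```

Under this assumption the tiers cost $48,600, $66,150, and $56,700 ($171,450 in total, against $972,000 on a single tier), and a compression factor of 3 brings the total down to $57,150. The slide's rounded figures are roughly 2% higher, which would be consistent with counting 13.5 TB as 13,824 GB, although the slides do not say so.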
Oracle Optimized Warehouse Initiative
Goals for Oracle data warehouse solutions:
• Provide superior system performance
• Provide a superior customer experience
Full Range of DW Solution Options
[Diagram axis: from "Flexibility" to "Pre-configured, Pre-installed, Validated"; components shown: Database Options, Management Packs, Partitioning, RAC]

Custom
• Flexibility for the most demanding data warehouse
• Benefits:
  – High performance
  – Unlimited scalability
  – Completely customizable
  – Industry-leading database and hardware

Reference Configuration
• Documented best-practice configurations for data warehousing
• Benefits:
  – High performance
  – Simple to scale; modular building blocks
  – Industry-leading database and hardware
  – Available today with HP, IBM, Sun, EMC/Dell

Optimized Warehouse
• Scalable systems pre-installed and pre-configured: ready to run out-of-the-box
• Benefits:
  – High performance
  – Simple to buy
  – Fast to implement
  – Easy to maintain
  – Competitively priced
Data Warehouse Market
Oracle      39.8%
IBM         22.7%
Microsoft   16.0%
Teradata    11.4%
Other       10.1%
Source: IDC, 2006 - Worldwide Data Warehousing Tools 2005 Vendor Shares
Oracle is the Data Warehousing DBMS Market Leader
Leading Scalability: Wintercorp VLDB Survey
Source: http://www.wintercorp.com

2005 Survey
Yahoo!           Oracle      100.39
AT&T             Daytona      93.88
KT-IT Group      DB2          49.40
AT&T             Daytona      26.71
LGR - Cingular   Oracle       25.20
Amazon.com       Oracle       24.77
Anonymous        DB2          19.65
UPSS             Microsoft    19.47
Amazon.com       Oracle       18.56
Nielsen Media    Sybase IQ    17.69

2003 Survey
France Telecom   Oracle       29.23
AT&T             Proprietary  26.27
SBC              Teradata     24.81
Anonymous        DB2          16.19
Amazon.com       Oracle       13.00
Kmart            Teradata     12.59
Claria           Oracle       12.10
HIRA             Sybase IQ    11.94
FedEx            Teradata      9.98
Vodafone Gmbh    Teradata      9.91

1998 Survey
Sears            Teradata      4.63
HCIA             Informix      4.50
Wal-Mart         Teradata      4.42
Tele Danmark     DB2           2.84
Citicorp         DB2           2.47
MCI              Informix      1.88
NDC Health       Oracle        1.85
Sprint           Teradata      1.30
Ford             Oracle        1.20
Acxiom           Oracle        1.13
Oracle DW 10+TB Customers (3/2006): Various Platforms and Architectures
• Acxiom 16 TB HP
• Allstate 15 TB Sun (RAC)
• Amazon 61 TB HP (RAC)
• Cellcom 14 TB HP
• CenturyTel 10 TB HP
• Chase 30 TB IBM (RAC)
• Choicepoint 14 TB Sun
• Claria 38 TB Sun
• Experian 14 TB Sun
• KTF 14 TB HP
• Cingular 25 TB HP
• Mastercard 20 TB IBM (RAC)
• NASDAQ 35 TB Sun
• NexTel 28 TB HP
• NYSE Group 15 TB HP (RAC)
• Reliance Ltd 13 TB Sun
• Starwood 12 TB HP
• TIM (Italy) 12 TB HP (RAC)
• Turkcell 14 TB Sun (RAC)
• UBS AG 15 TB Sun
• UPS 10 TB HP
• Yahoo! 130 TB Fujitsu
Hundreds of Terabyte+ DW Customers!
Summary
• Technology
• Monitoring
• Information Life-cycle Management (ILM)
• Oracle Optimized Warehouse Initiative
• Market