Copyright © 2003, SAS Institute Inc. All rights reserved. SAS is a registered trademark or trademark of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.
Scaling SAS® Data Access to Oracle® RDBMS
Howard Plemmons, SAS Institute Inc.
Andrew Holdsworth, Oracle Corporation
Scaling
What is Scaling?
Scaling
“To remove the scales of a fish”
“To climb up by means of a scaling ladder”
“To reach the highest point”
Data
Scaling Data
Why Scale to Data
Scaling Data
SAS tools, SAS/ACCESS®
SAS Procedures and Processes
Oracle tools
Oracle Procedures and Processes
Intelligence Value Chain
Intelligence Value Chain Silver into Gold
SAS System 9
SAS V8 vs. SAS System 9
FEATURE               SAS V8   SAS System 9
Libname Engine        x        x
Procedure Interface   x        x
Fast Load             x        x
Threaded Interface             x
SAS V8 I/O Model
Threaded Interface SAS 9
SAS Procedures
proc sort
proc summary
proc dmine
proc reg; proc dmreg
proc means
proc loess; proc dmdb
proc glm
proc robustreg
SAS/ACCESS® Engines
ORACLE
DB2
Informix
ODBC
Sybase
Teradata
Libname and SAS Procedure Controls
dbslice=("where-clause" "where-clause" …)
dbsliceparm=(ALL,…)
default: dbsliceparm=(THREADED_APPS,2)
options sastrace=',,t,';
procedure controls – CPU count
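A minimal sketch of tracing a threaded read, assuming a libref x, the scott/tiger credentials used on the following slides, a hypothetical PATH= alias orcl, and a numeric column x in the table oratab:

```sas
/* Write threading trace messages to the SAS log */
options sastrace=',,t,' sastraceloc=saslog nostsuffix;

/* Libname-level control: threaded reads for thread-enabled
   procedures only, with at most two threads.
   path=orcl is an assumed Oracle connection alias. */
libname x oracle user=scott password=tiger path=orcl
        dbsliceparm=(threaded_apps,2);

/* PROC MEANS is one of the thread-enabled procedures listed earlier */
proc means data=x.oratab;
  var x;
run;

/* Switch tracing off again */
options sastrace=off;
```

The log should then show whether the read was split across threads; the exact messages depend on the SAS release.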
Options In Action - DBSLICEPARM
-dbsliceparm none
option dbsliceparm=
libname x oracle user=scott pass=tiger
dbsliceparm=(threaded_apps,2);
proc print data=x.oratab (dbsliceparm=(all,4)); run;
Options In Action - DBSLICE
libname x oracle user=scott pass=tiger;
proc print data=x.oratab (dbslice=("x<100" "x>=100")); run;
Options In Action – CPUCOUNT, THREADS
CPUCOUNT=
THREADS | NOTHREADS
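A short sketch of the two levels of control, using a generated WORK data set so that it runs without a database. The CPU count of 4 and the data set names are illustrative assumptions:

```sas
/* System-level controls: allow threading and state how many
   CPUs SAS should assume are available */
options threads cpucount=4;

/* Small sample data set (hypothetical) */
data work.sample;
  do x = 1 to 1000;
    y = ranuni(1);   /* uniform random value, seed 1 */
    output;
  end;
run;

/* Procedure-level control: THREADS or NOTHREADS on the PROC
   statement overrides the system option for this step */
proc sort data=work.sample out=work.sorted threads;
  by y;
run;
```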
Process
Libname controls
Procedure controls
Execution
Linear Scalability
Achieved Speedup
Scalability – SAS 9 Threaded speedup in PROC REG
Run on 12-way Unix Box
Scalability – SAS 9 Threaded speedup in PROC SORT
Run on 8-way Unix Box
Tests run in memory cache
What Does This Mean - access
393000 Rows
No Threads - baseline
Two Threads (DBSLICE) – 31%
Six Threads (DBSLICEPARM) – 54%
Run on 10-way Unix Box
Tests run in memory cache
Scaling Data
Data Volumes
Data ACCESS
Data Organization
Scaling using Oracle - Andrew
Scaling with
The Star Query
Use of Parallelism
Use of the Direct Path
Use of Specialist Indexes
Use of Analytical Functions
Use of Materialized Views
Use of The Oracle9i Optimizer
The Star Query
Fact
Product
Time
Geography
Customer
Star Queries
The star query is a very common data warehouse (DW) technique. It is highly optimized in Oracle and can be tuned to the type of query being run. In summary, the more that is known about the query composition, the higher the level of optimization possible.
Star Query Optimization
The optimization is a three-step process:
1. Apply query predicates to the dimension tables to generate lists of foreign keys into the fact table.
2. Query the fact table using a series of single-column bitmap indexes on the foreign keys.
3. Having resolved the query within the fact table, complete the query by joining back to the dimension tables where needed and rolling the query up.
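For illustration, a star query of the shape these steps apply to can be sent to Oracle through the SAS SQL pass-through facility. This is a sketch only: the tables sales_fact, time_dim, geography_dim and product_dim, their columns, and the connection alias orcl are hypothetical.

```sas
proc sql;
  /* path=orcl is an assumed Oracle connection alias */
  connect to oracle (user=scott password=tiger path=orcl);

  /* The inner query is passed to Oracle as written: predicates on
     three dimensions, joined to the fact table on its foreign keys,
     then rolled up */
  create table work.sales_summary as
  select * from connection to oracle
    ( select t.fiscal_year,
             g.region,
             sum(f.amount) as total_amount
        from sales_fact f, time_dim t, geography_dim g, product_dim p
       where f.time_id      = t.time_id
         and f.geography_id = g.geography_id
         and f.product_id   = p.product_id
         and t.fiscal_year  = 2002
         and g.region       = 'EAST'
         and p.category     = 'SOFTWARE'
       group by t.fiscal_year, g.region );

  disconnect from oracle;
quit;
```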
Star Queries
To enable star queries, the DBA should do the following:
1. Build single-column bitmap indexes on each foreign key in the fact table.
2. Build indexes on the dimension tables for query predicates.
3. Build indexes on the dimension tables to assist in the join-back and roll-up process.
4. Generate statistics for the schema.
5. Set the parameter STAR_TRANSFORMATION_ENABLED=TRUE.
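A sketch of steps 1, 2 and 5 issued from SAS through SQL pass-through. The table, column and index names and the connection alias orcl are hypothetical; statistics gathering (step 4) is not shown here.

```sas
proc sql;
  /* path=orcl is an assumed Oracle connection alias */
  connect to oracle (user=scott password=tiger path=orcl);

  /* Step 1: single-column bitmap index on each fact-table foreign key */
  execute (create bitmap index sales_fact_time_bix
             on sales_fact (time_id)) by oracle;
  execute (create bitmap index sales_fact_geo_bix
             on sales_fact (geography_id)) by oracle;
  execute (create bitmap index sales_fact_prod_bix
             on sales_fact (product_id)) by oracle;

  /* Step 2: index a dimension column that appears in query predicates */
  execute (create index time_dim_year_ix
             on time_dim (fiscal_year)) by oracle;

  /* Step 5: enable the star transformation for this session */
  execute (alter session set star_transformation_enabled = true) by oracle;

  disconnect from oracle;
quit;
```

ALTER SESSION affects only the pass-through connection that issues it, so a star query would need to run in the same connection, or the parameter would need to be set at the instance level by the DBA.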
Use of Parallelism
Use multiple CPUs to execute a single query as well as multiple concurrent queries
Execute Table scans, Index probes and scans in parallel
Execute Joins and Sorts in parallel
Execute DML in parallel
Parallelism can be configured manually or automatically
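As a sketch of the manual route, a default degree of parallelism can be set on a table and later scans of that table become candidates for parallel execution. The table sales_fact, the degree of 4 and the connection alias orcl are assumptions made for illustration:

```sas
proc sql;
  /* path=orcl is an assumed Oracle connection alias */
  connect to oracle (user=scott password=tiger path=orcl);

  /* Manual configuration: default degree of parallelism for the table */
  execute (alter table sales_fact parallel 4) by oracle;

  /* The scan and aggregation below can now be executed in parallel
     by Oracle, subject to instance settings and available resources */
  select * from connection to oracle
    ( select geography_id, sum(amount) as total_amount
        from sales_fact
       group by geography_id );

  disconnect from oracle;
quit;
```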
Use of Partitioning
Partitioning was originally designed to allow management of large database objects; however, partitioning data can also yield performance gains through:
• Partition pruning
• Join optimizations
Partitioning can be done by the following methods:
• Range, e.g. date or key ranges
• List, e.g. discrete values such as State
• Hash, to achieve equal-size partitions
Two types of partitioning can be applied
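A sketch of range partitioning and a query that benefits from partition pruning, sent through SQL pass-through. The table sales_by_month, its columns and the connection alias orcl are hypothetical:

```sas
proc sql;
  /* path=orcl is an assumed Oracle connection alias */
  connect to oracle (user=scott password=tiger path=orcl);

  /* Range partitioning on a date column */
  execute (
    create table sales_by_month (
      sale_id    number,
      sale_date  date,
      state      varchar2(2),
      amount     number(12,2)
    )
    partition by range (sale_date) (
      partition p2002   values less than (to_date('2003-01-01', 'YYYY-MM-DD')),
      partition p2003   values less than (to_date('2004-01-01', 'YYYY-MM-DD')),
      partition p_later values less than (maxvalue)
    )
  ) by oracle;

  /* The predicate on the partitioning key lets Oracle read only
     partition p2003 (partition pruning) */
  select * from connection to oracle
    ( select state, sum(amount) as total_amount
        from sales_by_month
       where sale_date >= to_date('2003-01-01', 'YYYY-MM-DD')
         and sale_date <  to_date('2004-01-01', 'YYYY-MM-DD')
       group by state );

  disconnect from oracle;
quit;
```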
Use of The Direct Path
Bypass the conventional transaction layer to insert and copy data within the database
SQL*Loader is currently used by SAS
Other options include:
• Insert with /*+ append */ hint
• Create Table as Select with NOLOGGING
These constructs can be used to transform vast amounts of data rapidly in parallel
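A sketch of both routes from SAS. The first step uses the BULKLOAD= data set option, which hands the load to SQL*Loader; the second runs a Create Table as Select inside Oracle. The table names, row counts and the connection alias orcl are assumptions, and the bulk load presumes SQL*Loader is installed on the SAS client:

```sas
/* path=orcl is an assumed Oracle connection alias */
libname x oracle user=scott password=tiger path=orcl;

/* Hypothetical rows to load */
data work.new_rows;
  do x = 1 to 100000;
    output;
  end;
run;

/* BULKLOAD=YES loads through SQL*Loader instead of row-by-row inserts */
data x.oratab_copy (bulkload=yes);
  set work.new_rows;
run;

proc sql;
  connect to oracle (user=scott password=tiger path=orcl);

  /* Create Table as Select with NOLOGGING, executed in parallel
     inside the database */
  execute (
    create table oratab_big nologging parallel 4
    as select * from oratab_copy where x >= 100
  ) by oracle;

  disconnect from oracle;
quit;
```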
Specialist Indexes
B-Tree Indexes
Bitmap Indexes, including bitmap join indexes
Function-based Indexes
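One example of each index type, issued through SQL pass-through. This is a sketch: sales_fact, customer_dim, their columns and the connection alias orcl are hypothetical, and the bitmap join index assumes a primary key or unique constraint on customer_dim.customer_id.

```sas
proc sql;
  /* path=orcl is an assumed Oracle connection alias */
  connect to oracle (user=scott password=tiger path=orcl);

  /* B-tree index (the default index type) */
  execute (create index customer_dim_name_ix
             on customer_dim (cust_name)) by oracle;

  /* Bitmap index on a low-cardinality fact-table column */
  execute (create bitmap index sales_fact_cust_bix
             on sales_fact (customer_id)) by oracle;

  /* Bitmap join index: indexes fact rows by a dimension attribute */
  execute (create bitmap index sales_fact_state_bjix
             on sales_fact (customer_dim.state)
             from sales_fact, customer_dim
             where sales_fact.customer_id = customer_dim.customer_id) by oracle;

  /* Function-based index to support case-insensitive lookups */
  execute (create index customer_dim_uname_ix
             on customer_dim (upper(cust_name))) by oracle;

  disconnect from oracle;
quit;
```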
Analytical Functions
Oracle has embraced the ANSI OLAP extensions to SQL
These permit faster response times on queries that would require multiple passes of the data with conventional SQL
This allows grouped results and functionality such as moving averages
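A sketch of a three-period moving average computed in a single pass by an analytic function and returned to SAS. The table sales_fact, its columns and the connection alias orcl are hypothetical:

```sas
proc sql;
  /* path=orcl is an assumed Oracle connection alias */
  connect to oracle (user=scott password=tiger path=orcl);

  /* Total per period, plus the average of the current and the
     two preceding period totals */
  create table work.moving_avg as
  select * from connection to oracle
    ( select time_id,
             sum(amount) as period_total,
             avg(sum(amount)) over
               (order by time_id
                rows between 2 preceding and current row) as moving_avg_3
        from sales_fact
       group by time_id );

  disconnect from oracle;
quit;
```

Without the analytic function, the same result would need a self-join or several passes over the data.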
Materialized Views
Materialized views allow automatic use of summary tables without a user having to rewrite the query
Well-designed materialized views are small in size and can increase performance by orders of magnitude.
Materialized views are in fact Oracle tables and can use all other features to improve performance
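A sketch of a summary materialized view with query rewrite enabled. The names sales_by_geo_mv and sales_fact and the connection alias orcl are hypothetical, and the session is assumed to have the privileges needed for query rewrite:

```sas
proc sql;
  /* path=orcl is an assumed Oracle connection alias */
  connect to oracle (user=scott password=tiger path=orcl);

  /* Summary table maintained by Oracle and eligible for query rewrite */
  execute (
    create materialized view sales_by_geo_mv
      build immediate
      refresh complete on demand
      enable query rewrite
    as
    select geography_id,
           sum(amount) as total_amount,
           count(*)    as row_count
      from sales_fact
     group by geography_id
  ) by oracle;

  /* Allow the optimizer to rewrite queries in this session */
  execute (alter session set query_rewrite_enabled = true) by oracle;

  /* Written against the fact table; the optimizer may answer it
     from the materialized view instead */
  select * from connection to oracle
    ( select geography_id, sum(amount) as total_amount
        from sales_fact
       group by geography_id );

  disconnect from oracle;
quit;
```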
Oracle9i Optimizer
Optimizer behavior will change when upgrading between Oracle releases
The Optimizer is tested with over 400,000 SQL Statements
• Where plans change between releases, the actual query is run to test for degradation
• Slower plans are corrected
It is still important to have good representative Statistics
DBMS_STATS package allows parallel generation and migration of schema statistics
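A sketch of gathering schema statistics in parallel from SAS. It assumes that the pass-through facility forwards the anonymous PL/SQL block to Oracle unchanged; the schema name SCOTT, the degree of 4 and the connection alias orcl are illustrative assumptions.

```sas
proc sql;
  /* path=orcl is an assumed Oracle connection alias */
  connect to oracle (user=scott password=tiger path=orcl);

  /* Gather table and index statistics for the schema,
     using four parallel processes */
  execute (
    begin
      dbms_stats.gather_schema_stats(
        ownname => 'SCOTT',
        cascade => true,
        degree  => 4);
    end;
  ) by oracle;

  disconnect from oracle;
quit;
```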
Oracle9i Optimizer
Some common Optimizer problems seen with Oracle9i
• Bad or incomplete statistics
• Init.ora parameters influencing optimizer
• SQL written for the rule-based optimizer (RBO)
Summary
Oracle and SAS provide techniques for scaling to larger databases by optimizing both query performance and fetch performance.
These techniques are simple to adopt and allow huge productivity improvements.
We have identified some core technologies here; however, this is only a partial picture of what SAS and Oracle can do together.
About the Speakers
Howard Plemmons            Andrew Holdsworth
Senior Software Manager    Director
SAS Institute Inc.         Oracle Corp.
SAS Circle                 500 Oracle Pkwy,
Cary, NC                   Redwood Shores, CA 94065
Phone: 919-531-7779        Phone: 650-506-2938
E-mail:
Other SUGI Papers/Presentations
• PC File Data Objects Directly from UNIX – 8:00am Tuesday
• SAS/ACCESS and use of Metadata – Rm 619 @ 2:30
• Lessons in Scalability – SAS Presents – 3:20 Tuesday
• Data Warehousing section - performance
Scaling SAS Data ACCESS to ORACLE RDBMS