www.DanHotka.com
Analytical SQL Tips and Techniques for Oracle10g
Dan HotkaAuthor/Speaker/Oracle Expert
www.DanHotka.com
Dan is a Training Consultant
! 2-day Hands-on Workshops– SQL Statement Tuning Tips and Techniques– Advanced SQL Tuning Tips and Techniques– Oracle/Unix Scripting Techniques– Business Intelligence Courses:
• Discoverer Desktop/Discoverer Admin• Intro to Oracle for Business Intelligence Users• Advanced Oracle for Business Intelligence Users
! Oracle Courses (using Trubix Course Materials)– Check my website for a current list of offerings
! 1-day Seminars: “A Close Look at Oracle”– Oracle Block and Index Internals– Variety of Tuning Tips– Oracle New Features
! Register for my quarterly Newsletter
www.DanHotka.com
Agenda
! Introduction to Analytical SQL! How it works
– Sample Database
– Syntax
! Rank Functions! Partitioning Functions ! Percentiles Function! Physical and logical offset Functions
www.DanHotka.com
Intro to Analytical SQL
! Introduced in Oracle9i! Used with GROUP BY! New Features:
– Composit3 columns– Concatenated groupings– New Functions:
• Grouping_ID• Group_ID
www.DanHotka.com
Intro to Analytical SQL
! Utilizes intermediate result sets (inline views)! This ppt features RANK and Partition
– Useful for a variety of reports including cross-tabular! Analytical SQL is applied to the result set
– After the WHERE clause, BEFORE the ORDER BY clause! Multiple Rows per group (differs from Aggregate
functions)! Creates an additional column in the result set
www.DanHotka.com
Intro to Analytical SQL
SQL> get query11 select sales_person, sum(sales_amt),2 RANK() OVER (order by sum(sales_amt) desc) as RANK3* from sales group by sales_person
SQL> /
SALES_PERS SUM(SALES_AMT) RANK ---------- -------------- ----------Matt 142500 1 Jeff 37000 2 Greg 27000 3 Kevin 10000 4 Stan 10000 4
www.DanHotka.com
Intro to Analytical SQL
SQL> get query21 select sales_person, sum(sales_amt),2 RANK() OVER (order by sum(sales_amt) desc) as RANK_ID3 from sales 4 where RANK_ID < 45* group by sales_person
SQL> /where RANK_ID < 4
*ERROR at line 4:ORA-00904: invalid column name
www.DanHotka.com
Intro to Analytical SQL
SQL> get query31 select * from 2 select sales_person, sum(sales_amt),3 RANK()OVER(order by sum(sales_amt) desc) as RANK_ID4 from sales 5 group by sales_person)6* where RANK_ID < 4
SQL> /
SALES_PERS SUM(SALES_AMT) RANK_ID ---------- -------------- ----------Greg 142500 1 Jeff 37000 2 Matt 27000 3
www.DanHotka.com
How it Works
! Basic Syntax– <ANALYTICAL FUNCTION>(<parameters>) OVER(<clause>)– Parentheses are required, even if no parameters are required.
! The OVER syntax identifies the function as an analytic function.! The OVER syntax can have as many as 3 subclasses but are not required.
! The 3 subclauses are:! 1. Partitioning sub clause, splits the result set into groups! 2. Ordering sub clause, performs sorting as required by the function
being used! 3. Windowing sub clause, defines the sliding window which is a result
set relative to the row currently being processed
www.DanHotka.com
How it works
The Database
My sample database consists of 2 tables and a total of 36 rows.
SQL> desc customersName Type------------------------------------ ----------------------------CUSTOMER_ID NUMBERCUSTOMER_NAME VARCHAR2(10)CUSTOMER_CITY VARCHAR2(10)
SQL> desc salesName Type----------------------------------- ----------------------------CUSTOMER_ID NUMBERSALES_PERSON VARCHAR2(10)SALES_AMT NUMBER(6)SALES_DATE DATE
www.DanHotka.com
Analytical SQL Categories
! Ranking – Top and bottom of category by region
! Windowing– Moving average of sales/items/etc
! Reporting Aggregates– After SQL is processed: # rows, Sum, Min, Max by group by
! Lag/Lead– Comparing one item (like year) to a previous/next item on same line
! Statistics– Linear and covariance functions
www.DanHotka.com
Rank Functions
SQL> get query4.sql1 select * from2 (select customer_name, sum(sales_amt),3 RANK() OVER (order by sum(sales_amt) desc) as RANK_ID4 from sales, customers5 where sales.customer_id = customers.customer_id6 group by customer_name)7* order by RANK_ID
SQL> /
CUSTOMER_N SUM(SALES_AMT) RANK_ID ---------- -------------- ----------Dan 105000 1 May 33000 2 Janet 30000 3 Tim 30000 3 Steve 18000 5 John 10500 6
Highest sale, Lowest Ranking
www.DanHotka.com
Rank Functions
SQL> get query51 select * from2 (select customer_name, sum(sales_amt),3 RANK() OVER (order by sum(sales_amt) desc) as RANK_ID4 from sales, customers5 where sales.customer_id = customers.customer_id6 group by customer_name)7* where RANK() OVER (order by sum(sales_amt)) < 5
SQL> /where RANK() OVER (order by sum(sales_amt)) < 5
*ERROR at line 7:ORA-30483: window functions are not allowed here
Only want top 5: Analytical only in SELECT & FROM clauses
www.DanHotka.com
Rank Functions
SQL> get query61 select * from2 (select customer_name, sum(sales_amt),3 RANK() OVER (order by sum(sales_amt) desc) as RANK_ID4 from sales, customers5 where sales.customer_id = customers.customer_id6 group by customer_name)7* where RANK_ID < 5
SQL> /
CUSTOMER_N SUM(SALES_AMT) RANK_ID ---------- -------------- ----------Dan 105000 1 Marlene 33000 2 Janet 30000 3 Tim 30000 3
www.DanHotka.com
Partitioning Example
! Partitioning allows grouping based on current row
! Allows for MIN & MAX per sales person, – Not the whole table!
www.DanHotka.com
Partitioning Example
SQL> get query71 select distinct sales_person, 2 sum(sales_amt) OVER(PARTITION by sales_person) as "Total Sales", 3 min(sales_amt) OVER(PARTITION by sales_person) as Cheapest,4 max(sales_amt) OVER(PARTITION by sales_person) as Expensive5* from sales
SQL> /
SALES_PERS Total Sales CHEAPEST EXPENSIVE ---------- ----------- ---------- ----------Greg 142500 2500 15000 Jeff 37000 1000 11000 Kevin 10000 10000 10000 Matt 27000 1000 11000 Stan 10000 10000 10000
www.DanHotka.com
Partitioning Example
SQL>start query8set pagesize 20compute sum of sales_amt on sales_person;compute sum of AVG_SALE on sales_person;compute sum of DIFFERENCE on sales_person;break on sales_person skip 2select sales_person, sales_amt,
round(avg(sales_amt) OVER()) as AVG_SALE,sales_amt - round(avg(sales_amt) OVER()) as Difference
from salesorder by sales_person/
SALES_PERS SALES_AMT AVG_SALE DIFFERENCE ---------- ---------- ---------- ----------Greg 10000 7550 2450
15000 7550 7450 10000 7550 2450
********** ---------- ---------- ----------sum 142500 113250 29250
www.DanHotka.com
Percentiles FunctionPercentiles FunctionPercentiles FunctionPercentiles Function
! Percentiles can be:– Aggregate function using Group By– Analytical function
! Percentile is a fractional value of the total– 50% is the median– 25% is lower quartile– 75% is upper quartile
! Percentile_Cont – returns value between 2 closest values
! Percentile_Disc– Returns the closest of the existing values
www.DanHotka.com
Percentiles FunctionPercentiles FunctionPercentiles FunctionPercentiles Function
! Aggregate FunctionSQL> get query9
1 select PERCENTILE_CONT(0.5)WITHIN GROUP(ORDER BY sales_amt)AS P_CONT2 ,PERCENTILE_DISC(0.5)WITHIN GROUP(ORDER BY sales_amt)AS P_DISC3* from sales
SQL> /
P_CONT P_DISC ---------- ----------
10000 10000
www.DanHotka.com
Percentiles FunctionPercentiles FunctionPercentiles FunctionPercentiles Function
! Analytical FunctionSQL> get query10
1 select sales_person2 ,PERCENTILE_CONT(0.5) WITHIN GROUP(ORDER BY sales_amt) AS P_CONT3 ,PERCENTILE_DISC(0.5) WITHIN GROUP(ORDER BY sales_amt) AS P_DISC4 from sales5* group by sales_person
SQL> /
SALES_PERS P_CONT P_DISC ---------- ---------- ----------Greg 10000 10000 Jeff 1000 1000 Kevin 10000 10000
www.DanHotka.com
Physical and logical offset Physical and logical offset Physical and logical offset Physical and logical offset FunctionsFunctionsFunctionsFunctions
! Sliding window relative to current row– Can be a physical offset using ROWS
• X number of rows on either side of the current row– Can be a logical offset using RANGE
• X number of different values on either side of the current row– Based on the current row and a number of rows or values
on either side
www.DanHotka.com
Physical and logical offset Physical and logical offset Physical and logical offset Physical and logical offset FunctionsFunctionsFunctionsFunctions
SQL> get query111 select sales_person, sales_amt,2 sum(sales_amt) OVER(ORDER BY sales_amt3 ROWS BETWEEN 5 PRECEDING and 5 FOLLOWING) as SLIDING_SALES4* from sales
SQL> /
SALES_PERS SALES_AMT SLIDING_SALES ---------- ---------- -------------Jeff 1000 18500 Jeff 1000 23500 Greg 2500 27500 Greg 2500 36500 Jeff 5000 63500 Jeff 5000 72500 Greg 10000 80000
www.DanHotka.com
Physical and logical offset Physical and logical offset Physical and logical offset Physical and logical offset FunctionsFunctionsFunctionsFunctions
SQL> get query121 select sales_person, sales_amt,2 sum(sales_amt) OVER(ORDER BY sales_amt3 RANGE BETWEEN 5 PRECEDING and 5 FOLLOWING) as SLIDING_SALES4* from sales
SQL> /
SALES_PERS SALES_AMT SLIDING_SALES ---------- ---------- -------------Greg 2500 7500 Greg 2500 7500 Greg 2500 7500 Matt 5000 15000 Jeff 5000 15000 Jeff 5000 15000 Greg 10000 120000
www.DanHotka.com
Physical and logical offset Physical and logical offset Physical and logical offset Physical and logical offset FunctionsFunctionsFunctionsFunctions
! The next 2 queries combines Physical and Logical offset in a single query– Unbounded Preceding is from the beginning of the result
set to the current row– Unbound Following is from the current row to the end of
the result set
www.DanHotka.com
Physical and logical offset Physical and logical offset Physical and logical offset Physical and logical offset FunctionsFunctionsFunctionsFunctions
SQL> get query141 select sales_person, sales_amt,2 sum(sales_amt) OVER(ORDER BY sales_amt3 ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW) as
ROWS_SALES,4 sum(sales_amt) OVER(ORDER BY sales_amt5 RANGE BETWEEN UNBOUNDED PRECEDING and CURRENT ROW) as
RANGE_SALES6* from sales
SQL> /
SALES_PERS SALES_AMT ROWS_SALES RANGE_SALES ---------- ---------- ---------- -----------Matt 1000 1000 6000 Jeff 1000 2000 6000 Jeff 1000 3000 6000 Jeff 1000 4000 6000 Jeff 1000 6000 6000 Greg 2500 13500 13500
www.DanHotka.com
Physical and logical offset Physical and logical offset Physical and logical offset Physical and logical offset FunctionsFunctionsFunctionsFunctions
Date Compare: 2 Month Sliding Window
SQL> get query151 select sales_person, sales_date, sales_amt,2 ROUND(AVG(sales_amt) OVER(ORDER BY sales_date3 RANGE BETWEEN INTERVAL '2' MONTH PRECEDING4 AND INTERVAL '2' MONTH FOLLOWING)) as MOVING_AVG5* from sales6 /
SALES_PERS SALES_DAT SALES_AMT MOVING_AVG ---------- --------- ---------- ----------Stan 01-FEB-02 10000 6900 Greg 01-FEB-02 10000 6900 Greg 01-FEB-02 10000 6900 Jeff 01-MAR-02 11000 7417 Jeff 01-MAR-02 11000 7417 Greg 01-APR-02 2500 7071 Jeff 01-APR-02 1000 7071
www.DanHotka.com
Statistical Functions
! Calculate Linear Regression– SLOPE - slope of determination of the regression line– INTERCEPT - intercept of determination of the regression
line– REGR_R2 - coefficient of determination of the regression
line– REGR_COUNT - number of items– REGR_AVGX - average salary– REGR_AVGY - average bonus
www.DanHotka.com
Statistical Functions
SQL> select year, Month, sum(sales), sum(profit),2 REGR_SLOPE(Profit SUM,Sales SUM) OVER(ORDER BY Profit SUM) Slope,
3 REGR_INTERCEPT(Profit SUM,Sales SUM) OVER(ORDER BY Profit SUM) Intercept,
4 REGR_R2(Profit SUM,Sales SUM) OVER(ORDER BY Profit SUM) Coefficient,
5 REGR_COUNT(Profit SUM,Sales SUM) OVER(ORDER BY Profit SUM) Count,
6 REGR_AVGX(Profit SUM,Sales SUM) OVER(ORDER BY Profit SUM) Average,
7 REGR_AVGY(Profit SUM,Sales SUM) OVER(ORDER BY Profit SUM) Average 2
8* from sales9 /
www.DanHotka.com
Additional Analytical Functions
! CORR *! COVAR_POP *! COVAR_SAMP *! CUME_DIST! DENSE_RANK! FIRST! FIRST_VALUE *! LEAD & LAG! LAST! LAST_VALUE *
! NTILE! RATIO_TO_REPORT! REGR_ (linear! regression) functions *! STDDEV *! STDDEV_POP *! STDDEV_SAMP *! VAR_POP *! VAR_SAMP *! VARIANCE *
www.DanHotka.com
Compare Across Time
SQL> select year, Month, sum(sales), LAG(sales SUM, 1) OVER(ORDER BY Year)
2* from sales group by year3 /
Compare Current Year to Previous Year
Year SALES SUM Previous YEAR
---------- --------- ----------
2001 6900
2002 12000 6900
2003 14500 12000
www.DanHotka.com
Compare Across Time
SQL> select year, Month, sum(sales), LEAD(sales SUM, 1) OVER(ORDER BY Year)
2* from sales group by year3 /
Compare Current Year with Next Year
Year SALES SUM Next YEAR
---------- --------- ----------
2001 6900 12000
2002 12000 14500
2003 14500
www.DanHotka.com
More Information
! Technet.oracle.com! www. Ixora .com.au! Evdbt.com – www. Sagelogix .com! www.jlcomp.demon.co.uk! www.tusc.com! Presentation code downloads
– www. DanHotka.com click on Downloads
www.DanHotka.com
Summary
! Data analysis has become easier– Combining speed of development and power of SQL
! Analytical SQL– Procedural Steps incorporated into SQL
! More information:– Oracle9i SQL Reference Manual
• Chapter 6
! Oracle SQL Reference Manual– Find ‘Analytical’
! Oracle Discoverer Desktop Users Guide– Appendix A
www.DanHotka.com
What have we learned?
! Introduction to Analytical SQL! How it works! Rank Functions! Partitioning Functions ! Percentiles Function! Physical and logical offset Functions
www.DanHotka.com
Dan is a Training Consultant
! 2-day Hands-on Workshops– SQL Statement Tuning Tips and Techniques– Advanced SQL Tuning Tips and Techniques– Oracle/Unix Scripting Techniques– Business Intelligence Courses:
• Discoverer Desktop/Discoverer Admin• Intro to Oracle for Business Intelligence Users• Advanced Oracle for Business Intelligence Users
! Oracle Courses (using Trubix Course Materials)– Check my website for a current list of offerings
! 1-day Seminars: “A Close Look at Oracle”– Oracle Block and Index Internals– Variety of Tuning Tips– Oracle New Features
! Register for my quarterly Newsletter