
Post on 02-Jan-2016


Advanced SQL for Decision Support

ISYS 650

Set Operators

• Union

• Intersect

• Difference

• Cartesian product

Union

• Set1={A, B, C}

• Set2={C, D, E}

• Union: Members in Set1 or in Set2
– Set1 ∪ Set2 = {A, B, C, D, E}

Intersect

• Members in Set1 and in Set2
– Set1 ∩ Set2 = {C}

Difference

• Set1 = {A, B, C}

• Set2 = {C, D, E}

• Set1 – Set2: Members in Set1 but not in Set2 = {A, B}

• Set2 – Set1: Members in Set2 but not in Set1 = {D, E}

• Set1 – Set2 ≠ Set2 – Set1

• Logical operator: NOT
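The three operators above can be tried directly with Python's built-in set type. This is only an illustrative sketch using the Set1 and Set2 values from the slides; the variable names are chosen here for the example.

```python
# Set1 and Set2 are the example sets from the slides.
set1 = {"A", "B", "C"}
set2 = {"C", "D", "E"}

union = set1 | set2          # members in Set1 or in Set2
intersection = set1 & set2   # members in Set1 and in Set2
diff_1_2 = set1 - set2       # members in Set1 but not in Set2
diff_2_1 = set2 - set1       # members in Set2 but not in Set1

print(sorted(union))         # ['A', 'B', 'C', 'D', 'E']
print(sorted(intersection))  # ['C']
print(sorted(diff_1_2))      # ['A', 'B']
print(sorted(diff_2_1))      # ['D', 'E']
print(diff_1_2 != diff_2_1)  # True: difference is not commutative
```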

Files as Sets

• Business students' file: BusSt

• Science students' file: SciSt
– BusSt ∪ SciSt
– BusSt ∩ SciSt
– BusSt – SciSt

• Spring 06 student file: S06St

• Fall 06 student file: F06St
– S06St – F06St
– F06St – S06St

Union Compatibility

• Two relations are union compatible if they have the same number of attributes and their corresponding attributes have the same type.

• The union, intersect and difference operators require the two relations to be union compatible.

Union Compatibility Examples

• File 1:
– SID – 9 characters
– Sname – 25 characters

• File 2:
– SSN – 9 characters
– Ename – 25 characters

• File 3:
– Ename – 25 characters
– EID – 9 characters

• File 1 and File 2 are union compatible. File 1 and File 3 are not, and File 2 and File 3 are not, because their attributes do not match in type position by position.
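The check described above can be sketched as a small function. This is a hypothetical helper written for illustration, assuming each file is described as an ordered list of (attribute name, type) pairs; the `CHAR(n)` type labels are an assumed notation for "n characters".

```python
# Each file is an ordered list of (attribute name, type) pairs.
file1 = [("SID", "CHAR(9)"), ("Sname", "CHAR(25)")]
file2 = [("SSN", "CHAR(9)"), ("Ename", "CHAR(25)")]
file3 = [("Ename", "CHAR(25)"), ("EID", "CHAR(9)")]

def union_compatible(rel_a, rel_b):
    """Same number of attributes, and matching types position by position.

    Attribute names are ignored; only the count and the types matter.
    """
    if len(rel_a) != len(rel_b):
        return False
    return all(type_a == type_b
               for (_, type_a), (_, type_b) in zip(rel_a, rel_b))

print(union_compatible(file1, file2))  # True
print(union_compatible(file1, file3))  # False
print(union_compatible(file2, file3))  # False
```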

Product

• Set1 = {a, b, c}

• Set2 = {X, Y, Z}

• Set1 X Set2 = {aX, aY, aZ, bX, bY, bZ, cX, cY, cZ}

• Faculty File:
    FID  Fname
    F1   Chao
    F2   Smith

• Student File:
    SID  Sname  FID
    S1   Peter  F1
    S2   Paul   F2
    S3   Smith  F1

• Faculty X Student:
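As a rough illustration, the product of the two example files can be computed with `itertools.product`. The sketch below uses the Faculty and Student rows from the slide; the variable names are chosen here for the example.

```python
from itertools import product

# Rows from the Faculty and Student files on the slide.
faculty = [("F1", "Chao"), ("F2", "Smith")]
student = [("S1", "Peter", "F1"), ("S2", "Paul", "F2"), ("S3", "Smith", "F1")]

# Every faculty row is paired with every student row: 2 x 3 = 6 rows.
faculty_x_student = [f + s for f, s in product(faculty, student)]

for row in faculty_x_student:
    print(row)
```

Note that the product pairs rows regardless of whether the two FID values agree; keeping only the matching pairs is what a join condition would add.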

SQL Set Operators

• Union compatible

• Union:
    (SELECT * FROM table1) UNION (SELECT * FROM table2);

• Intersect:
    (SELECT * FROM table1) INTERSECT (SELECT * FROM table2);

• Minus:
    (SELECT * FROM table1) MINUS (SELECT * FROM table2);
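The three operators can be tried on a small in-memory database. The sketch below is an assumption-laden stand-in rather than the course's setup: it uses Python's `sqlite3` module, a hypothetical one-column layout for table1 and table2, and SQLite's `EXCEPT` keyword in place of `MINUS` (SQLite also does not accept the parenthesized form, so the parentheses are dropped).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (item TEXT);
    CREATE TABLE table2 (item TEXT);
    INSERT INTO table1 VALUES ('A'), ('B'), ('C');
    INSERT INTO table2 VALUES ('C'), ('D'), ('E');
""")

def run(sql):
    """Return a single-column result as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

union_rows = run("SELECT * FROM table1 UNION SELECT * FROM table2")
intersect_rows = run("SELECT * FROM table1 INTERSECT SELECT * FROM table2")
# SQLite spells the difference operator EXCEPT; the slides use MINUS.
minus_rows = run("SELECT * FROM table1 EXCEPT SELECT * FROM table2")

print(union_rows)      # ['A', 'B', 'C', 'D', 'E']
print(intersect_rows)  # ['C']
print(minus_rows)      # ['A', 'B']
```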

Find students taking ISYS 263 and ACCT 101

• SELECT sid FROM registration
  WHERE cid='ISYS263'
  INTERSECT
  SELECT sid FROM registration
  WHERE cid='acct101';

• Note 1: This condition is always false, because a single row holds only one cid value:
– SELECT sid FROM registration
  WHERE cid='ISYS263' AND cid='acct101';

• Note 2: How to get the student name?
– SELECT sid,sname FROM student
  WHERE sid IN ( ….);
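The difference between the INTERSECT query and the always-false AND condition can be seen on a few hypothetical registration rows. This sketch runs both queries with Python's `sqlite3` module; the sample data is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registration (sid TEXT, cid TEXT);
    -- Hypothetical rows: only S1 takes both courses.
    INSERT INTO registration VALUES
        ('S1', 'ISYS263'), ('S1', 'acct101'),
        ('S2', 'ISYS263'), ('S3', 'acct101');
""")

# INTERSECT compares the two result sets, so S1 is found.
intersect_sids = [row[0] for row in conn.execute("""
    SELECT sid FROM registration WHERE cid = 'ISYS263'
    INTERSECT
    SELECT sid FROM registration WHERE cid = 'acct101'
""")]

# AND is evaluated row by row; no single row has both cid values.
and_sids = [row[0] for row in conn.execute("""
    SELECT sid FROM registration
    WHERE cid = 'ISYS263' AND cid = 'acct101'
""")]

print(intersect_sids)  # ['S1']
print(and_sids)        # []
```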

Use IN to do intersect

• SELECT sid,sname FROM student
  WHERE sid IN (SELECT sid FROM registration WHERE cid='ISYS263')
  AND sid IN (SELECT sid FROM registration WHERE cid='acct101');
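A minimal runnable version of the two-subquery form, again using Python's `sqlite3` module with invented sample rows. Unlike the INTERSECT query on registration alone, this form returns the student name as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (sid TEXT, sname TEXT);
    CREATE TABLE registration (sid TEXT, cid TEXT);
    -- Hypothetical rows: only S1 (Peter) takes both courses.
    INSERT INTO student VALUES ('S1', 'Peter'), ('S2', 'Paul'), ('S3', 'Smith');
    INSERT INTO registration VALUES
        ('S1', 'ISYS263'), ('S1', 'acct101'),
        ('S2', 'ISYS263'), ('S3', 'acct101');
""")

# A student qualifies only if the sid appears in both subquery results.
both_courses = conn.execute("""
    SELECT sid, sname FROM student
    WHERE sid IN (SELECT sid FROM registration WHERE cid = 'ISYS263')
      AND sid IN (SELECT sid FROM registration WHERE cid = 'acct101')
""").fetchall()

print(both_courses)  # [('S1', 'Peter')]
```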

Use NOT IN to do difference

• Q: Display the faculty's ID and name if the faculty does not advise any student.
– SELECT fid, fname FROM faculty
  WHERE fid NOT IN (SELECT DISTINCT fid FROM student);
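The NOT IN query can be checked on a small example. The sketch below uses Python's `sqlite3` module and invented rows (faculty F3 and student S4 are hypothetical). It also shows a standard SQL caveat worth knowing: if the subquery returns a NULL, NOT IN matches nothing, so filtering NULLs out of the subquery is a common precaution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE faculty (fid TEXT, fname TEXT);
    CREATE TABLE student (sid TEXT, sname TEXT, fid TEXT);
    -- F3 is a hypothetical faculty member with no advisees.
    INSERT INTO faculty VALUES ('F1', 'Chao'), ('F2', 'Smith'), ('F3', 'Lee');
    INSERT INTO student VALUES
        ('S1', 'Peter', 'F1'), ('S2', 'Paul', 'F2'), ('S3', 'Smith', 'F1');
""")

query = """
    SELECT fid, fname FROM faculty
    WHERE fid NOT IN (SELECT DISTINCT fid FROM student)
"""
no_advisees = conn.execute(query).fetchall()
print(no_advisees)  # [('F3', 'Lee')]

# Caveat: a student without an advisor (fid is NULL) makes NOT IN match nothing.
conn.execute("INSERT INTO student VALUES ('S4', 'Mary', NULL)")
with_null = conn.execute(query).fetchall()
print(with_null)  # []

# Excluding NULLs in the subquery restores the expected answer.
null_safe = conn.execute("""
    SELECT fid, fname FROM faculty
    WHERE fid NOT IN (SELECT fid FROM student WHERE fid IS NOT NULL)
""").fetchall()
print(null_safe)  # [('F3', 'Lee')]
```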

Cartesian Product

• SELECT fields FROM table1, table2;

• SELECT * FROM student, faculty;

• Table name alias:
– Ex: SELECT sid,sname,s.fid,fname FROM student s, faculty f;

• Take the product of a table with itself:
– SELECT * FROM emp e1, emp e2;
– SELECT * FROM student s1, student s2;

CROSS JOIN = Product

• SELECT * FROM student CROSS JOIN course;
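A quick way to convince yourself that the comma form and CROSS JOIN give the same product is to run both. This sketch uses Python's `sqlite3` module with the student and faculty rows from the earlier Product slide; it pairs student with faculty rather than course, since no course rows appear in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE faculty (fid TEXT, fname TEXT);
    CREATE TABLE student (sid TEXT, sname TEXT, fid TEXT);
    INSERT INTO faculty VALUES ('F1', 'Chao'), ('F2', 'Smith');
    INSERT INTO student VALUES
        ('S1', 'Peter', 'F1'), ('S2', 'Paul', 'F2'), ('S3', 'Smith', 'F1');
""")

comma_rows = conn.execute("SELECT * FROM student, faculty").fetchall()
cross_rows = conn.execute("SELECT * FROM student CROSS JOIN faculty").fetchall()

# Product of a table with itself: aliases keep the two copies apart.
self_rows = conn.execute("SELECT * FROM student s1, student s2").fetchall()

print(len(comma_rows))                           # 6 (3 students x 2 faculty)
print(sorted(comma_rows) == sorted(cross_rows))  # True
print(len(self_rows))                            # 9 (3 x 3)
```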

Access Demo

• How to do product with Access?

• Union?

• Access query wizards:
– Find duplicates query
– Find unmatched query

SQL GROUPING SETS

• GROUPING SETS
    SELECT CITY,RATING,COUNT(CID) FROM CUSTOMERS
    GROUP BY GROUPING SETS(CITY,RATING,(CITY,RATING),())
    ORDER BY CITY;

• Note: Compute the subtotals for every member in the GROUPING SETS. () indicates that an overall total is desired.

Results

CITY           RATING  COUNT(CID)
-------------  ------  ----------
CHICAGO        A       1
CHICAGO        B       2
CHICAGO                3
LOS ANGELES    A       1
LOS ANGELES    C       1
LOS ANGELES            2
SAN FRANCISCO  A       2
SAN FRANCISCO  B       1
SAN FRANCISCO          3
               A       4
                       8
               B       3
               C       1
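One way to see what GROUPING SETS computes is to write each grouping as its own GROUP BY query and stack them with UNION ALL. The sketch below does that with Python's `sqlite3` module, since SQLite has no GROUPING SETS clause. The eight customer rows are hypothetical, chosen only so that the counts match the results shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cid INTEGER, city TEXT, rating TEXT);
    -- Hypothetical rows chosen to reproduce the counts in the results.
    INSERT INTO customers VALUES
        (1, 'CHICAGO', 'A'), (2, 'CHICAGO', 'B'), (3, 'CHICAGO', 'B'),
        (4, 'LOS ANGELES', 'A'), (5, 'LOS ANGELES', 'C'),
        (6, 'SAN FRANCISCO', 'A'), (7, 'SAN FRANCISCO', 'A'),
        (8, 'SAN FRANCISCO', 'B');
""")

# One GROUP BY per grouping set; NULL marks a column that was not grouped on.
rows = conn.execute("""
    SELECT city, rating, COUNT(cid) FROM customers GROUP BY city, rating
    UNION ALL
    SELECT city, NULL, COUNT(cid) FROM customers GROUP BY city
    UNION ALL
    SELECT NULL, rating, COUNT(cid) FROM customers GROUP BY rating
    UNION ALL
    SELECT NULL, NULL, COUNT(cid) FROM customers
""").fetchall()

for row in rows:
    print(row)
```

The four branches correspond to the grouping sets (CITY, RATING), CITY, RATING and (), giving 6 + 3 + 3 + 1 = 13 rows.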

SQL CUBE

• Perform aggregations for all possible combinations of the columns indicated.
    SELECT CITY,RATING,COUNT(CID) FROM CUSTOMERS
    GROUP BY CUBE(CITY,RATING)
    ORDER BY CITY, RATING;

Results

CITY           RATING  COUNT(CID)
-------------  ------  ----------
CHICAGO        A       1
CHICAGO        B       2
CHICAGO                3
LOS ANGELES    A       1
LOS ANGELES    C       1
LOS ANGELES            2
SAN FRANCISCO  A       2
SAN FRANCISCO  B       1
SAN FRANCISCO          3
               A       4
               B       3
               C       1
                       8

SQL ROLLUP

• The ROLLUP extension causes cumulative subtotals to be calculated for the columns indicated. If multiple columns are indicated, subtotals are performed for each of the columns except the far-right column.
    SELECT CITY,RATING,COUNT(CID) FROM CUSTOMERS
    GROUP BY ROLLUP(CITY,RATING)
    ORDER BY CITY, RATING;

Results

CITY           RATING  COUNT(CID)
-------------  ------  ----------
CHICAGO        A       1
CHICAGO        B       2
CHICAGO                3
LOS ANGELES    A       1
LOS ANGELES    C       1
LOS ANGELES            2
SAN FRANCISCO  A       2
SAN FRANCISCO  B       1
SAN FRANCISCO          3
                       8
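CUBE and ROLLUP can both be read as shorthand for a list of grouping sets: CUBE aggregates over every subset of the listed columns, while ROLLUP aggregates over the prefixes of the list. The two helper functions below are hypothetical, written only to enumerate those sets; they do not run any SQL.

```python
from itertools import combinations

def cube_sets(columns):
    """All subsets of the columns, largest first: what CUBE aggregates over."""
    return [combo
            for size in range(len(columns), -1, -1)
            for combo in combinations(columns, size)]

def rollup_sets(columns):
    """Prefixes of the column list, longest first: what ROLLUP aggregates over."""
    return [tuple(columns[:size]) for size in range(len(columns), -1, -1)]

cube = cube_sets(["CITY", "RATING"])
rollup = rollup_sets(["CITY", "RATING"])

print(cube)    # [('CITY', 'RATING'), ('CITY',), ('RATING',), ()]
print(rollup)  # [('CITY', 'RATING'), ('CITY',), ()]
```

The missing ('RATING',) set in the ROLLUP list is why the ROLLUP results have no per-rating subtotal rows, while the CUBE results do.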

Views

• A database view is:
– a virtual or logical table based on a query.
– a stored query.

• CREATE VIEW viewname AS query;
– CREATE VIEW femalestudent AS
  SELECT * FROM student WHERE sex='f';

• CREATE OR REPLACE VIEW femalestudent AS SELECT * FROM student WHERE sex='f';
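Because a view is a stored query rather than a copy of the data, it reflects later changes to the base table. The sketch below illustrates this with Python's `sqlite3` module and invented student rows. SQLite supports plain CREATE VIEW but not the OR REPLACE form, so only the first statement is exercised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (sid TEXT, sname TEXT, sex TEXT);
    -- Hypothetical rows for illustration.
    INSERT INTO student VALUES
        ('S1', 'Peter', 'm'), ('S2', 'Paula', 'f'), ('S3', 'Smith', 'm');
    CREATE VIEW femalestudent AS
        SELECT * FROM student WHERE sex = 'f';
""")

before = conn.execute("SELECT sid FROM femalestudent ORDER BY sid").fetchall()

# The view stores the query, not the rows, so new base rows show up in it.
conn.execute("INSERT INTO student VALUES ('S4', 'Mary', 'f')")
after = conn.execute("SELECT sid FROM femalestudent ORDER BY sid").fetchall()

print(before)  # [('S2',)]
print(after)   # [('S2',), ('S4',)]
```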

READ ONLY Views

• CREATE VIEW viewname AS query
  WITH READ ONLY;

• Ex:
– CREATE VIEW readEmp
  AS (SELECT * FROM emp)
  WITH READ ONLY;

ROWNUM Field & Top n Analysis

• ROWNUM is an Oracle pseudocolumn that is available for every table and view; it numbers the rows in the order they are returned.

• Use ROWNUM to do Top n Analysis:
– Select the students with the best 3 GPAs.

• Create a view ordered by GPA (descending), then select from the view with ROWNUM <= 3.

• Or use an InLine view.

InLine View

• When a multiple-column subquery is used in the FROM clause of an outer query, it basically creates a temporary table that can be referenced by other clauses of the outer query. The temporary table is called an InLine view.

Select from InLine View

SELECT GPAGroup, COUNT(SID), AVG(gpa)
FROM (SELECT SID, GPA,
        CASE WHEN gpa < 2.0 THEN 'Poor'
             WHEN gpa < 3.0 THEN 'Good'
             ELSE 'Excellent'
        END AS GPAGroup
      FROM student)
GROUP BY GPAGroup;
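The inline-view query above can be run on a handful of made-up students. This sketch uses Python's `sqlite3` module; the five GPA values are hypothetical and picked so the averages are easy to check by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (sid TEXT, gpa REAL);
    -- Hypothetical GPAs: one Poor, two Good, two Excellent.
    INSERT INTO student VALUES
        ('S1', 1.5), ('S2', 2.5), ('S3', 2.75), ('S4', 3.5), ('S5', 4.0);
""")

# The subquery in FROM is the inline view; the outer query groups on
# the GPAGroup column that the inline view computes.
rows = conn.execute("""
    SELECT GPAGroup, COUNT(sid), AVG(gpa)
    FROM (SELECT sid, gpa,
                 CASE WHEN gpa < 2.0 THEN 'Poor'
                      WHEN gpa < 3.0 THEN 'Good'
                      ELSE 'Excellent'
                 END AS GPAGroup
          FROM student)
    GROUP BY GPAGroup
""").fetchall()

summary = {group: (count, avg) for group, count, avg in rows}
print(summary)
```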

Note: The alias of a calculated field cannot be used in the WHERE or GROUP BY clause of the same query.

SELECT eid, ename, salary*.1 AS Tax FROM Employee
WHERE Tax > 1000; ---- This will cause an error.
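One possible workaround for the Tax example is to compute the field inside an inline view, so that the outer query can filter on it by name. The sketch below uses Python's `sqlite3` module with invented employee rows. Database systems differ in how strictly they enforce the alias rule, so the inline-view form is shown as the portable option rather than as a demonstration of the error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (eid TEXT, ename TEXT, salary INTEGER);
    -- Hypothetical rows for illustration.
    INSERT INTO employee VALUES
        ('E1', 'Ann', 8000), ('E2', 'Bob', 15000), ('E3', 'Cy', 30000);
""")

# The inline view defines tax; the outer WHERE can then refer to it.
via_inline_view = [row[0] for row in conn.execute("""
    SELECT eid, ename, tax
    FROM (SELECT eid, ename, salary * 0.1 AS tax FROM employee)
    WHERE tax > 1000
    ORDER BY eid
""")]

# Alternative: repeat the expression itself in the WHERE clause.
via_expression = [row[0] for row in conn.execute("""
    SELECT eid, ename, salary * 0.1 AS tax FROM employee
    WHERE salary * 0.1 > 1000
    ORDER BY eid
""")]

print(via_inline_view)  # ['E2', 'E3']
print(via_expression)   # ['E2', 'E3']
```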

InLine View &Top (Last) n Analysis

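• Find the students with the top 3 GPAs.

• Can we do:
– SELECT * FROM student ORDER BY GPA DESC WHERE ROWNUM<=3; ?
– No! The WHERE clause must come before ORDER BY, and ROWNUM is assigned before the rows are sorted.

• SELECT * FROM
  (SELECT * FROM student ORDER BY GPA DESC)
  WHERE ROWNUM<=3;

• Note: The outer query uses the ROWNUM of the InLine view, whose rows are already sorted.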
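Why the inline view matters can be simulated in plain Python: numbering the rows before sorting picks an arbitrary three rows, while sorting first and numbering afterwards picks the true top three. This is only a model of the behavior, with hypothetical student rows, and not Oracle itself.

```python
# Hypothetical (sid, gpa) rows in the order the table happens to return them.
students = [("S1", 2.9), ("S2", 3.8), ("S3", 3.1), ("S4", 3.9), ("S5", 3.5)]

def by_gpa_desc(rows):
    """Sort rows by GPA, highest first (models ORDER BY GPA DESC)."""
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Without the inline view: rows are numbered as fetched, ROWNUM <= 3 keeps
# the first three fetched rows, and only those three are then sorted.
first_three = [row for rownum, row in enumerate(students, start=1) if rownum <= 3]
wrong_top3 = by_gpa_desc(first_three)

# With the inline view: rows are sorted first, then the sorted rows are
# numbered and ROWNUM <= 3 keeps the three highest GPAs.
sorted_view = by_gpa_desc(students)
top3 = [row for rownum, row in enumerate(sorted_view, start=1) if rownum <= 3]

print(wrong_top3)  # [('S2', 3.8), ('S3', 3.1), ('S1', 2.9)]
print(top3)        # [('S4', 3.9), ('S2', 3.8), ('S5', 3.5)]
```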

Using RANK() for Top N

SELECT first_name, last_name
FROM (SELECT first_name, last_name,
             RANK() OVER (ORDER BY salary DESC) sal_rank
      FROM employees)
WHERE sal_rank <= 10;

RANK calculates the rank of a value in a group of values.

The return type is NUMBER.
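RANK() gives tied values the same rank and then skips ahead, which affects how many rows a "top N" filter returns. The sketch below shows this with Python's `sqlite3` module, assuming the bundled SQLite is version 3.25 or later (the first release with window functions). The four employees are hypothetical, and the cutoff is 3 instead of 10 to keep the data small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (first_name TEXT, last_name TEXT, salary INTEGER);
    -- Hypothetical rows; Bob and Cy tie on salary.
    INSERT INTO employees VALUES
        ('Ann', 'Lee', 9000), ('Bob', 'Ray', 8000),
        ('Cy', 'Doe', 8000), ('Di', 'Fox', 7000);
""")

# Rank every employee by salary, highest first.
ranks = dict(conn.execute("""
    SELECT first_name, RANK() OVER (ORDER BY salary DESC) AS sal_rank
    FROM employees
"""))

# Top-N filter on the rank computed in the inline view.
top3_names = sorted(row[0] for row in conn.execute("""
    SELECT first_name, last_name
    FROM (SELECT first_name, last_name,
                 RANK() OVER (ORDER BY salary DESC) AS sal_rank
          FROM employees)
    WHERE sal_rank <= 3
"""))

print(ranks)       # {'Ann': 1, 'Bob': 2, 'Cy': 2, 'Di': 4}
print(top3_names)  # ['Ann', 'Bob', 'Cy']
```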

Pivot Table

Demo: # of customer by City and Rating

SELECT city, rating, COUNT(*) cnt
FROM customer
GROUP BY city, rating;

SELECT city,
       MAX(DECODE(Rating, 'A', cnt, 0)) A,
       MAX(DECODE(Rating, 'D', cnt, 0)) D,
       MAX(DECODE(Rating, 'Z', cnt, 0)) Z
FROM (SELECT city, rating, COUNT(*) cnt
      FROM customer
      GROUP BY city, rating)
GROUP BY city;
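The same pivot can be sketched with a CASE expression, which plays the role of DECODE in databases that lack it. The example below uses Python's `sqlite3` module. The customer rows are hypothetical, and the pivot columns use ratings A, B and C (an assumption made so that every column has data) rather than the A, D and Z of the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cid INTEGER, city TEXT, rating TEXT);
    -- Hypothetical rows for illustration.
    INSERT INTO customer VALUES
        (1, 'CHICAGO', 'A'), (2, 'CHICAGO', 'B'), (3, 'CHICAGO', 'B'),
        (4, 'LOS ANGELES', 'A'), (5, 'LOS ANGELES', 'C'),
        (6, 'SAN FRANCISCO', 'A'), (7, 'SAN FRANCISCO', 'A'),
        (8, 'SAN FRANCISCO', 'B');
""")

# Inner query: one row per (city, rating) with its count.
# Outer query: one row per city, one column per rating value.
pivot = conn.execute("""
    SELECT city,
           MAX(CASE rating WHEN 'A' THEN cnt ELSE 0 END) AS a,
           MAX(CASE rating WHEN 'B' THEN cnt ELSE 0 END) AS b,
           MAX(CASE rating WHEN 'C' THEN cnt ELSE 0 END) AS c
    FROM (SELECT city, rating, COUNT(*) AS cnt
          FROM customer
          GROUP BY city, rating)
    GROUP BY city
    ORDER BY city
""").fetchall()

for row in pivot:
    print(row)
# ('CHICAGO', 1, 2, 0)
# ('LOS ANGELES', 1, 0, 1)
# ('SAN FRANCISCO', 2, 1, 0)
```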

Indexes

• A field declared as PRIMARY KEY is indexed automatically.

• CREATE INDEX indexname
  ON tablename (column names separated by commas);
– Ex:
  CREATE INDEX fkFID
  ON student (fid);

• DROP INDEX indexname;
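Creating and dropping an index can be observed through the database catalog. The sketch below uses Python's `sqlite3` module and SQLite's `sqlite_master` catalog table, which is specific to SQLite (other systems expose index metadata differently). The student table layout is assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sid TEXT PRIMARY KEY, sname TEXT, fid TEXT)")

def index_names():
    """List the indexes SQLite has recorded for the student table."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'student'"
    )
    return [row[0] for row in rows]

pk_only = index_names()       # the automatic primary-key index

conn.execute("CREATE INDEX fkFID ON student (fid)")
after_create = index_names()  # now also contains fkFID

conn.execute("DROP INDEX fkFID")
after_drop = index_names()    # fkFID is gone again

print(pk_only)
print(after_create)
print(after_drop)
```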