Date post: | 14-Dec-2015 |
Upload: | angel-minney |
Characteristic Functions
Want:
Year  Code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
2001  e1    198    204    214    231
(from fin_data table in Sybase Sample Database)
Have:
Year  quarter  code  amount
2001  Q1       e1    198
2001  Q2       e1    204
2001  Q3       e1    214
2001  Q4       e1    231
Characteristic Functions
Want: Un-normalized
Year  Code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
2001  e1    198    204    214    231
Have: Normalized
Year  quarter  code  amount
2001  Q1       e1    198
2001  Q2       e1    204
2001  Q3       e1    214
2001  Q4       e1    231
Characteristic Functions
Pivot Table
Year  Code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
2001  e1    198    204    214    231
Similar to Excel Database Format
Year  quarter  code  amount
2001  Q1       e1    198
2001  Q2       e1    204
2001  Q3       e1    214
2001  Q4       e1    231
Possible Solution – Self Join
Start with just Q1 and Q2, code e1, Year 2001
Year code Q1Amt Q2Amt
2001 e1 198 204
Use a self join as in type 4 queries
Write the SQL query
Possible Solution – Self Join
Code
SELECT F1.year, F1.code, F1.amount AS Q1Amt, F2.amount AS Q2Amt
FROM fin_data F1, fin_data F2
WHERE F1.year = 2001          -- Only 2001
  AND F2.year = 2001
  AND F1.code = 'e1'          -- Only financial code e1
  AND F2.code = 'e1'
  AND F1.quarter = 'Q1'       -- Get Q1 amount
  AND F2.quarter = 'Q2'       -- Get Q2 amount
Year code Q1Amt Q2Amt
2001 e1 198 204
Possible Solution – Self Join
Expand to all years
Year code Q1Amt Q2Amt
1999 e1 101 93
2000 e1 153 149
2001 e1 198 204
Possible Solution – Self Join
Code
SELECT F1.year, F1.code, F1.amount AS Q1Amt, F2.amount AS Q2Amt
FROM fin_data F1, fin_data F2
WHERE F1.year = F2.year       -- Same year
  AND F1.code = 'e1'          -- Only financial code e1
  AND F2.code = 'e1'
  AND F1.quarter = 'Q1'       -- Get Q1 amount
  AND F2.quarter = 'Q2'       -- Get Q2 amount
Year code Q1Amt Q2Amt
1999 e1 101 93
2000 e1 153 149
2001 e1 198 204
Expand to all years
Possible Solution – Self Join
Expand to all four quarters
Year code Q1Amt Q2Amt Q3Amt Q4Amt
1999 e1 101 93 129 145
2000 e1 153 149 157 163
2001 e1 198 204 214 231
Possible Solution – Self Join
Code
All four quarters
SELECT F1.year, F1.code,
       F1.amount AS Q1Amt, F2.amount AS Q2Amt,
       F3.amount AS Q3Amt, F4.amount AS Q4Amt
FROM fin_data F1, fin_data F2, fin_data F3, fin_data F4
WHERE F1.year = F2.year       -- Same year
  AND F2.year = F3.year
  AND F3.year = F4.year
  AND F1.code = 'e1'          -- Only financial code e1
  AND F2.code = 'e1'
  AND F3.code = 'e1'
  AND F4.code = 'e1'
  AND F1.quarter = 'Q1'       -- One record for each quarter
  AND F2.quarter = 'Q2'
  AND F3.quarter = 'Q3'
  AND F4.quarter = 'Q4'
Year code Q1Amt Q2Amt Q3Amt Q4Amt
1999 e1 101 93 129 145
2000 e1 153 149 157 163
2001 e1 198 204 214 231
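To try the four-quarter self join without a Sybase installation, the sketch here loads the e1 rows shown in this section into an in-memory SQLite database through Python's sqlite3 module. SQLite, the column types, the added ORDER BY, and the helper name pivot_with_self_join are assumptions made for illustration; they are not part of the original slides.

```python
import sqlite3

# e1 amounts per year (Q1..Q4), copied from the result tables in this section.
E1_BY_YEAR = {
    "1999": (101, 93, 129, 145),
    "2000": (153, 149, 157, 163),
    "2001": (198, 204, 214, 231),
}

# Normalized rows in the shape of fin_data: (year, quarter, code, amount).
ROWS = [
    (year, f"Q{i}", "e1", amount)
    for year, amounts in E1_BY_YEAR.items()
    for i, amount in enumerate(amounts, start=1)
]

# The four-way self join from the slide; ORDER BY is added for stable output.
SELF_JOIN_SQL = """
SELECT F1.year, F1.code,
       F1.amount AS Q1Amt, F2.amount AS Q2Amt,
       F3.amount AS Q3Amt, F4.amount AS Q4Amt
FROM fin_data F1, fin_data F2, fin_data F3, fin_data F4
WHERE F1.year = F2.year
  AND F2.year = F3.year
  AND F3.year = F4.year
  AND F1.code = 'e1' AND F2.code = 'e1'
  AND F3.code = 'e1' AND F4.code = 'e1'
  AND F1.quarter = 'Q1' AND F2.quarter = 'Q2'
  AND F3.quarter = 'Q3' AND F4.quarter = 'Q4'
ORDER BY F1.year
"""


def pivot_with_self_join():
    """Load the sample rows and return one pivoted row per year."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE fin_data (year TEXT, quarter TEXT, code TEXT, amount INTEGER)"
    )
    con.executemany("INSERT INTO fin_data VALUES (?, ?, ?, ?)", ROWS)
    rows = con.execute(SELF_JOIN_SQL).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in pivot_with_self_join():
        print(row)
```

Each additional column needs another copy of fin_data in the FROM clause plus three more conditions, which is the coding burden the next slide points out.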
Possible Solution – Self Join
Problems
Coding: Suppose we wanted months instead of quarters…
Performance: Suppose fin_data had 100,000 records instead of 84…
We need a better solution!
Possible Solution – SubQueries
Start with just Q1 and Q2, code e1, Year 2001
Year code Q1Amt Q2Amt
2001 e1 198 204
Use a subquery as a field in the select clause
Write the SQL query
Possible Solution – SubQueries
Code
SELECT F1.year, F1.code, F1.amount AS Q1Amt,
       ( SELECT F2.amount
         FROM fin_data F2
         WHERE F2.quarter = 'Q2'
           AND F2.code = 'e1'
           AND F2.year = 2001 ) AS Q2Amt
FROM fin_data F1
WHERE F1.quarter = 'Q1'
  AND F1.code = 'e1'
  AND F1.year = 2001
Year code Q1Amt Q2Amt
2001 e1 198 204
Use a subquery as a field in the select clause
Possible Solution – SubQueries
Expand to all years
Year code Q1Amt Q2Amt
1999 e1 101 93
2000 e1 153 149
2001 e1 198 204
Possible Solution – SubQueries
Code
SELECT F1.year, F1.code, F1.amount AS Q1Amt,
       ( SELECT F2.amount
         FROM fin_data F2
         WHERE F2.quarter = 'Q2'
           AND F2.code = 'e1'
           AND F2.year = F1.year ) AS Q2Amt
FROM fin_data F1
WHERE F1.quarter = 'Q1'
  AND F1.code = 'e1'
-- AND F1.year = 2001
Year code Q1Amt Q2Amt
1999 e1 101 93
2000 e1 153 149
2001 e1 198 204
Expand to all years
This is now a correlated subquery
Possible Solution – SubQueries
Expand to all four quarters
Year code Q1Amt Q2Amt Q3Amt Q4Amt
1999 e1 101 93 129 145
2000 e1 153 149 157 163
2001 e1 198 204 214 231
Possible Solution – SubQueries
Code
All four quarters
SELECT F1.year, F1.code, F1.amount AS Q1Amt,
       ( SELECT F2.amount
         FROM fin_data F2
         WHERE F2.quarter = 'Q2'
           AND F2.code = 'e1'
           AND F2.year = F1.year ) AS Q2Amt,
       ( SELECT F2.amount
         FROM fin_data F2
         WHERE F2.quarter = 'Q3'
           AND F2.code = 'e1'
           AND F2.year = F1.year ) AS Q3Amt,
       ( SELECT F2.amount
         FROM fin_data F2
         WHERE F2.quarter = 'Q4'
           AND F2.code = 'e1'
           AND F2.year = F1.year ) AS Q4Amt
FROM fin_data F1
WHERE F1.quarter = 'Q1'
  AND F1.code = 'e1'
Year code Q1Amt Q2Amt Q3Amt Q4Amt
1999 e1 101 93 129 145
2000 e1 153 149 157 163
2001 e1 198 204 214 231
Possible Solution – SubQueries
Problems
Coding: Again, suppose we wanted months instead of quarters…
Performance: Our example requires the effective execution of 10 queries
• The F1 query is run once to return the year, code and Q1Amt columns
• The F2 queries are run 9 times to return the Q2, Q3, and Q4 amounts for 1999, 2000 and 2001
If the F1 query returned 100,000 rows, the F2 queries would run 300,000 times!
We still need a better solution!
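The count of 10 is simple arithmetic: one outer query plus three subqueries for each of the three rows it returns. A minimal sketch of that arithmetic follows, assuming the engine naively re-runs every correlated subquery once per outer row (a real optimizer may do better). The function name effective_queries is hypothetical.

```python
def effective_queries(outer_rows: int, subqueries_per_row: int) -> int:
    """Count executions when each correlated subquery runs once per outer row.

    The leading 1 is the single run of the outer (F1) query.
    """
    return 1 + outer_rows * subqueries_per_row


if __name__ == "__main__":
    # Three years returned by F1, three subqueries (Q2, Q3, Q4) per year.
    print(effective_queries(3, 3))        # 10
    # With 100,000 outer rows the subqueries alone account for 300,000 runs.
    print(effective_queries(100_000, 3))  # 300001
```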
Possible Solution – Temporary Table
Strategy
Create a table with fields for year, code, Q1Amt, Q2Amt, Q3Amt, and Q4Amt
Insert Query to add records and fill in year, code, Q1Amt
Update Query to add Q2Amt
Update Query to add Q3Amt
Update Query to add Q4Amt
Possible Solution – Temporary Table
Create a Table
Create a table with fields for year, code, Q1Amt, Q2Amt, Q3Amt, and Q4Amt
CREATE TABLE QAmt
(
year char(4),
code char(2),
Q1Amt numeric(9),
Q2Amt numeric(9),
Q3Amt numeric(9),
Q4Amt numeric(9)
);
Possible Solution – Temporary Table
DROP a Table
There will be times when you want to get rid of a table you have created. Perhaps you made an error when you created it. Or, you may simply be through using it. To get rid of a table you DROP it from the database.
DROP TABLE QAmt
Possible Solution – Temporary Table
INSERT Query
Insert Query to add records and fill in year, code, Q1Amt
INSERT INTO QAmt (year, code, Q1Amt)
SELECT year, code, amount
FROM fin_data
WHERE quarter = 'Q1'
  AND code = 'e1'
Possible Solution – Temporary Table
Delete Query
There may also be times when you want to keep a table, but get rid of all the records in the table… to empty it. If, for example, you find a problem with your INSERT query, you may want to empty the table before you run the corrected INSERT query. To empty a table you DELETE all the records FROM the table. To empty the QAmt table you:
DELETE FROM QAmt
Possible Solution – Temporary Table
After INSERT Query
Check the results:
SELECT *
FROM QAmt
Year  code  Q1Amt  Q2Amt   Q3Amt   Q4Amt
1999  e1    101    (NULL)  (NULL)  (NULL)
2000  e1    153    (NULL)  (NULL)  (NULL)
2001  e1    198    (NULL)  (NULL)  (NULL)
Possible Solution – Temporary Table
Update Query
Year  code  Q1Amt  Q2Amt  Q3Amt   Q4Amt
1999  e1    101    93     (NULL)  (NULL)
2000  e1    153    149    (NULL)  (NULL)
2001  e1    198    204    (NULL)  (NULL)
Update Query to add Q2Amt
UPDATE QAmt Q
SET Q2Amt =
( SELECT amount
FROM fin_data F
WHERE F.year = Q.year
AND F.code = Q.code
AND F.quarter = 'Q2' )
We do two more update queries
to fill in Q3Amt and Q4Amt
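The step-by-step flow (create, insert, check, update, check) can be sketched end to end against an in-memory SQLite database. This is an illustration under assumptions: SQLite stands in for Sybase, TEXT and INTEGER stand in for char and numeric, the correlated UPDATE references QAmt by its table name instead of the alias Q, and the helper name build_qamt_in_steps is hypothetical.

```python
import sqlite3

# e1 amounts per year (Q1..Q4), copied from the tables in this section.
E1_BY_YEAR = {
    "1999": (101, 93, 129, 145),
    "2000": (153, 149, 157, 163),
    "2001": (198, 204, 214, 231),
}
ROWS = [
    (year, f"Q{i}", "e1", amount)
    for year, amounts in E1_BY_YEAR.items()
    for i, amount in enumerate(amounts, start=1)
]


def build_qamt_in_steps():
    """Return QAmt contents after the INSERT and after all three UPDATEs."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE fin_data (year TEXT, quarter TEXT, code TEXT, amount INTEGER)"
    )
    con.executemany("INSERT INTO fin_data VALUES (?, ?, ?, ?)", ROWS)
    con.execute(
        "CREATE TABLE QAmt (year TEXT, code TEXT, Q1Amt INTEGER,"
        " Q2Amt INTEGER, Q3Amt INTEGER, Q4Amt INTEGER)"
    )

    # Step 1: add one record per year and fill in year, code, Q1Amt.
    con.execute(
        "INSERT INTO QAmt (year, code, Q1Amt)"
        " SELECT year, code, amount FROM fin_data"
        " WHERE quarter = 'Q1' AND code = 'e1'"
    )
    check = "SELECT * FROM QAmt ORDER BY year"
    after_insert = con.execute(check).fetchall()

    # Steps 2-4: one correlated UPDATE per remaining quarter.
    for quarter in ("Q2", "Q3", "Q4"):
        con.execute(
            f"UPDATE QAmt SET {quarter}Amt = ("
            " SELECT amount FROM fin_data F"
            " WHERE F.year = QAmt.year AND F.code = QAmt.code"
            " AND F.quarter = ?)",
            (quarter,),
        )
    after_updates = con.execute(check).fetchall()
    con.close()
    return after_insert, after_updates


if __name__ == "__main__":
    first, final = build_qamt_in_steps()
    print(first)   # Q2Amt..Q4Amt are still NULL (None in Python)
    print(final)
```

Checking the table between steps is what makes this approach feel simpler, even though each UPDATE still runs a correlated subquery per row.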
Possible Solution – Temporary Table
Is this really different?
At this point you may have noticed that the temporary table approach looks a lot like the subquery approach. In fact, it is virtually the same. It seems conceptually simpler because the temporary table allows us to break down the problem into separate and distinct subtasks. We can complete part of the solution, check the results, complete some more, etc. For this reason, you will see it used. It has some legitimate applications, but this isn’t one of them.
This point becomes even more evident if we use either the self-join approach or the subquery approach to populate the table in one step.
Possible Solution – Temporary Table
One Step Using Self Join
INSERT INTO QAmt
(year, code, Q1Amt, Q2Amt, Q3Amt, Q4Amt)
SELECT F1.year, F1.code,
F1.amount,
F2.amount,
F3.amount,
F4.amount
FROM
fin_data F1, fin_data F2,
fin_data F3, fin_data F4
WHERE F1.year = F2.year
AND F2.year = F3.year
AND F3.year = F4.year
AND F1.code = 'e1'
AND F2.code = 'e1'
AND F3.code = 'e1'
AND F4.code = 'e1'
AND F1.quarter = 'Q1'
AND F2.quarter = 'Q2'
AND F3.quarter = 'Q3'
AND F4.quarter = 'Q4'
Possible Solution – Temporary Table
One Step Using SubQuery
INSERT INTO QAmt
(year, code, Q1Amt, Q2Amt, Q3Amt, Q4Amt)
SELECT year, code, amount,
(SELECT amount FROM fin_data F WHERE F.year = Q.year AND F.code = Q.code AND F.quarter = 'Q2' ),
(SELECT amount FROM fin_data F WHERE F.year = Q.year AND F.code = Q.code AND F.quarter = 'Q3' ),
(SELECT amount FROM fin_data F WHERE F.year = Q.year AND F.code = Q.code AND F.quarter = 'Q4' )
FROM fin_data Q
WHERE quarter = 'Q1'
  AND code = 'e1'
Possible Solution – Temporary Table
Problems
We still need a better solution!
So… using a temporary table really isn’t a distinct approach at all. It can make the solution conceptually simpler by allowing us to divide the solution into discrete steps, but it does nothing to reduce overall coding complexity or to improve performance.
In addition, it introduces potential problems with regard to updates made between the time the table is created and the time it is used (currency). If you really do want a snapshot of the data at a particular time then this can be useful. In general, it’s undesirable.
You can overcome the currency problem by creating a view instead of a table but, so far, you’re still basically using either a self-join or a subquery method. If you want to create a View you’re better off using Characteristic Functions to do it!
Characteristic Functions
Strategy
Distribute amount to one of four new columns:
year code quarter amount Q1Amt Q2Amt Q3Amt Q4Amt
1999 e1 Q1 101 101 0 0 0
1999 e1 Q2 93 0 93 0 0
1999 e1 Q3 129 0 0 129 0
1999 e1 Q4 145 0 0 0 145
2000 e1 Q1 153 153 0 0 0
2000 e1 Q2 149 0 149 0 0
2000 e1 Q3 157 0 0 157 0
2000 e1 Q4 163 0 0 0 163
etc.
Characteristic Functions
Strategy
and GROUP BY year, summing the amounts in the new columns for each year.
year fin_code Q1Amt Q2Amt Q3Amt Q4Amt
1999 e1 101 93 129 145
etc.
year code quarter amount Q1Amt Q2Amt Q3Amt Q4Amt
1999 e1 Q1 101 101 0 0 0
1999 e1 Q2 93 0 93 0 0
1999 e1 Q3 129 0 0 129 0
1999 e1 Q4 145 0 0 0 145
etc.
Characteristic Functions
WHERE can be Bad
In the previous examples, the reason we needed four copies of the table or four different queries is that the determination of which rows to include was made in the WHERE clause.
WHERE quarter = ‘Q1’ allows us to put only the amounts for Q1 in the Q1Amt column. That’s good. But… it also removes the possibility of filling in the Q2Amt column in the same query. So Q2Amt must be filled in using a separate copy of the table or a separate subquery with a WHERE clause that allows us to see the amounts in the rows WHERE quarter = ‘Q2’
Characteristic Functions
Characteristic Functions allow us to control which rows get represented in a particular column without using a WHERE clause.
With Characteristic Functions the determination of whether amount is represented in Q1Amt or Q2Amt or Q3Amt or Q4Amt is made in the SELECT clause.
Characteristic Functions
1 or 0
A Characteristic Function is an expression that evaluates to:
• 1 if a field should be represented in a particular column
• 0 if it should not.
For example, a characteristic function to determine whether an amount should be represented in Q1Amt would return 1 if quarter is ‘Q1’, 0 if it isn’t. We might call this particular characteristic function CF1.
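Outside SQL, the same idea fits in a few lines of plain Python. The sketch multiplies each amount by a 1-or-0 indicator to spread it across four columns, then sums each column, mirroring the distribute-and-sum strategy described earlier. The names cf, distribute and pivot are hypothetical and chosen only for this illustration.

```python
QUARTERS = ("Q1", "Q2", "Q3", "Q4")

# (quarter, amount) rows for code e1 in 1999, from the strategy table.
ROWS_1999 = [("Q1", 101), ("Q2", 93), ("Q3", 129), ("Q4", 145)]


def cf(quarter: str, target: str) -> int:
    """Characteristic function: 1 if the row belongs in the target column, else 0."""
    return 1 if quarter == target else 0


def distribute(rows):
    """Append Q1Amt..Q4Amt to each row; exactly one of them is non-zero."""
    return [
        (quarter, amount) + tuple(cf(quarter, t) * amount for t in QUARTERS)
        for quarter, amount in rows
    ]


def pivot(rows):
    """Sum each distributed column, as GROUP BY year with SUM would."""
    spread = distribute(rows)
    return tuple(sum(row[i] for row in spread) for i in range(2, 6))


if __name__ == "__main__":
    for row in distribute(ROWS_1999):
        print(row)
    print(pivot(ROWS_1999))
```

No row is filtered out here, which is the point: the decision about which column an amount lands in is made per column, not per row.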
Characteristic Functions
Assume for a moment that we have created CF1. We could then use it to control whether amount is represented in Q1Amt by defining Q1Amt as CF1 * amount.
SELECT year, code, quarter, amount, CF1, CF1 * amount AS Q1Amt
FROM fin_data
WHERE code = 'e1'
Would yield:
year  code  quarter  amount  CF1  Q1Amt
1999  e1    Q1       101     1    101
1999  e1    Q2       93      0    0
1999  e1    Q3       129     0    0
1999  e1    Q4       145     0    0
2000  e1    Q1       153     1    153
2000  e1    Q2       149     0    0
etc.
Note: this obviously won’t run yet since we haven’t actually created CF1 at this point.
Characteristic Functions
We then create CF2 to return 1 if quarter contains ‘Q2’, 0 otherwise. We use CF2 in the same SELECT clause to control whether amount is represented in Q2Amt
SELECT year, code, quarter, amount,
       CF1, CF1 * amount AS Q1Amt,
       CF2, CF2 * amount AS Q2Amt
FROM fin_data
WHERE code = 'e1'
Would yield:
year  code  quarter  amount  CF1  Q1Amt  CF2  Q2Amt
1999  e1    Q1       101     1    101    0    0
1999  e1    Q2       93      0    0      1    93
1999  e1    Q3       129     0    0      0    0
1999  e1    Q4       145     0    0      0    0
2000  e1    Q1       153     1    153    0    0
2000  e1    Q2       149     0    0      1    149
etc.
Characteristic Functions
We then GROUP BY year and SUM the Q1Amt and Q2Amt for the year
SELECT year, MAX(code),
       SUM(CF1 * amount) AS Q1Amt,
       SUM(CF2 * amount) AS Q2Amt
FROM fin_data
WHERE code = 'e1'
GROUP BY year
Would yield:
year  code  Q1Amt  Q2Amt
1999  e1    101    93
2000  e1    153    149
2001  e1    198    204
Characteristic Functions
Implementing CF1
Recall that CF1 evaluates to:
• 1 if quarter is 'Q1',
• 0 otherwise.
Both Sybase and SQL Server contain a CASE expression that can be used to implement CF1 as
(CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END)
Characteristic Functions
CF1 and CF2
SELECT year, code, quarter, amount,
       (CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) AS CF1,
       (CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) AS CF2
FROM fin_data
WHERE code = 'e1'
Yields:
Year  code  quarter  amount  CF1  CF2
1999  e1    Q1       101     1    0
1999  e1    Q2       93      0    1
1999  e1    Q3       129     0    0
1999  e1    Q4       145     0    0
2000  e1    Q1       153     1    0
2000  e1    Q2       149     0    1
Characteristic Functions
Using CF1 & CF2
SELECT year, code, quarter, amount,
       (CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount AS Q1Amt,
       (CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) * amount AS Q2Amt
FROM fin_data
WHERE code = 'e1'
Yields:
Year  code  quarter  amount  Q1Amt  Q2Amt
1999  e1    Q1       101     101    0
1999  e1    Q2       93      0      93
1999  e1    Q3       129     0      0
1999  e1    Q4       145     0      0
2000  e1    Q1       153     153    0
2000  e1    Q2       149     0      149
Characteristic Functions
Group By Year
SELECT year, MAX(code) AS fin_code,
       SUM((CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount) AS Q1Amt,
       SUM((CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) * amount) AS Q2Amt
FROM fin_data
WHERE code = 'e1'
GROUP BY year
Yields:
year  fin_code  Q1Amt  Q2Amt
1999  e1        101    93
2000  e1        153    149
2001  e1        198    204
Characteristic Functions
All 4 quarters
SELECT year, MAX(code) AS fin_code,
       SUM((CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount) AS Q1Amt,
       SUM((CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) * amount) AS Q2Amt,
       SUM((CASE WHEN quarter = 'Q3' THEN 1 ELSE 0 END) * amount) AS Q3Amt,
       SUM((CASE WHEN quarter = 'Q4' THEN 1 ELSE 0 END) * amount) AS Q4Amt
FROM fin_data
WHERE code = 'e1'
GROUP BY year
Yields:
year  fin_code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
1999  e1        101    93     129    145
2000  e1        153    149    157    163
2001  e1        198    204    214    231
Characteristic Functions
Can you modify the query to show all codes, not just e1?
year  code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
1999  e1    101    93     129    145
1999  e2    403    459    609    632
1999  e3    1437   2033   2184   2145
1999  e4    623    784    856    1043
1999  e5    381    402    412    467
1999  r1    1023   2033   2998   3014
1999  r2    234    459    601    944
2000  e1    153    149    157    163
2000  e2    643    687    898    923
etc.
Characteristic Functions
Can you modify the query to show all codes, not just e1?
year  code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
1999  e1    101    93     129    145
1999  e2    403    459    609    632
1999  e3    1437   2033   2184   2145
1999  e4    623    784    856    1043
etc.
SELECT year, code,
       SUM((CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount) AS Q1Amt,
       SUM((CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) * amount) AS Q2Amt,
       SUM((CASE WHEN quarter = 'Q3' THEN 1 ELSE 0 END) * amount) AS Q3Amt,
       SUM((CASE WHEN quarter = 'Q4' THEN 1 ELSE 0 END) * amount) AS Q4Amt
FROM fin_data
GROUP BY year, code
ORDER BY year, code
Characteristic Functions
Result:
We create a “Pivot” with only a single pass through the table
and
without resorting to programming!
year  code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
1999  e1    101    93     129    145
1999  e2    403    459    609    632
1999  e3    1437   2033   2184   2145
1999  e4    623    784    856    1043
1999  e5    381    402    412    467
1999  r1    1023   2033   2998   3014
1999  r2    234    459    601    944
etc.
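The all-codes query can be tried on a small subset of the data shown here. The sketch assumes SQLite (through Python's sqlite3 module) as a stand-in for Sybase and loads only the e1 and e2 rows for 1999 and 2000; the helper name pivot_all_codes is hypothetical.

```python
import sqlite3

# Q1..Q4 amounts keyed by (year, code), copied from the result table.
WIDE = {
    ("1999", "e1"): (101, 93, 129, 145),
    ("1999", "e2"): (403, 459, 609, 632),
    ("2000", "e1"): (153, 149, 157, 163),
    ("2000", "e2"): (643, 687, 898, 923),
}

# Rebuild the normalized fin_data rows: (year, quarter, code, amount).
ROWS = [
    (year, f"Q{i}", code, amount)
    for (year, code), amounts in WIDE.items()
    for i, amount in enumerate(amounts, start=1)
]

# The characteristic-function pivot: no self join, no subquery.
PIVOT_SQL = """
SELECT year, code,
       SUM((CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount) AS Q1Amt,
       SUM((CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) * amount) AS Q2Amt,
       SUM((CASE WHEN quarter = 'Q3' THEN 1 ELSE 0 END) * amount) AS Q3Amt,
       SUM((CASE WHEN quarter = 'Q4' THEN 1 ELSE 0 END) * amount) AS Q4Amt
FROM fin_data
GROUP BY year, code
ORDER BY year, code
"""


def pivot_all_codes():
    """Load the sample rows and return one pivoted row per (year, code)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE fin_data (year TEXT, quarter TEXT, code TEXT, amount INTEGER)"
    )
    con.executemany("INSERT INTO fin_data VALUES (?, ?, ?, ?)", ROWS)
    rows = con.execute(PIVOT_SQL).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in pivot_all_codes():
        print(row)
```

Moving from quarters to months would mean adding one SUM(CASE ...) line per month, with no extra joins or subqueries.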