Date post: 14-Dec-2015
Upload: angel-minney
Characteristic Functions

Characteristic Functions

Want:

Year  Code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
2001  e1    198    204    214    231

(from fin_data table in Sybase Sample Database)

Have:

Year  quarter  code  amount
2001  Q1       e1    198
2001  Q2       e1    204
2001  Q3       e1    214
2001  Q4       e1    231

Characteristic Functions

Want: Un-normalized

Year  Code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
2001  e1    198    204    214    231

Have: Normalized

Year  quarter  code  amount
2001  Q1       e1    198
2001  Q2       e1    204
2001  Q3       e1    214
2001  Q4       e1    231

Characteristic Functions

Pivot Table

Year  Code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
2001  e1    198    204    214    231

Similar to Excel Database Format

Year  quarter  code  amount
2001  Q1       e1    198
2001  Q2       e1    204
2001  Q3       e1    214
2001  Q4       e1    231

Possible Solutions (without using Characteristic Functions)

Self Join as in type 4 queries

Possible Solution – Self Join

Start with just Q1 and Q2, code e1, year 2001

Year code Q1Amt Q2Amt

2001 e1 198 204

Use a self join as in type 4 queries

Write the SQL query

Possible Solution – Self Join

Code

SELECT F1.year, F1.code, F1.amount AS Q1Amt, F2.amount AS Q2Amt
FROM fin_data F1, fin_data F2
WHERE F1.year = 2001        -- Only 2001
AND F2.year = 2001
AND F1.code = 'e1'          -- Only financial code e1
AND F2.code = 'e1'
AND F1.quarter = 'Q1'       -- Get Q1 amount
AND F2.quarter = 'Q2'       -- Get Q2 amount

Year code Q1Amt Q2Amt

2001 e1 198 204

Possible Solution – Self Join

Expand to all years

Year  code  Q1Amt  Q2Amt
1999  e1    101    93
2000  e1    153    149
2001  e1    198    204

Possible Solution – Self Join

Code

SELECT F1.year, F1.code, F1.amount AS Q1Amt, F2.amount AS Q2Amt
FROM fin_data F1, fin_data F2
WHERE F1.year = F2.year     -- Same year
AND F1.code = 'e1'          -- Only financial code e1
AND F2.code = 'e1'
AND F1.quarter = 'Q1'       -- Get Q1 amount
AND F2.quarter = 'Q2'       -- Get Q2 amount

Year  code  Q1Amt  Q2Amt
1999  e1    101    93
2000  e1    153    149
2001  e1    198    204

Expand to all years

Possible Solution – Self Join

Expand to all four quarters

Year  code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
1999  e1    101    93     129    145
2000  e1    153    149    157    163
2001  e1    198    204    214    231

Possible Solution – Self Join

Code

All four quarters

SELECT F1.year, F1.code, F1.amount AS Q1Amt, F2.amount AS Q2Amt,
       F3.amount AS Q3Amt, F4.amount AS Q4Amt
FROM fin_data F1, fin_data F2, fin_data F3, fin_data F4
WHERE F1.year = F2.year     -- Same year
AND F2.year = F3.year
AND F3.year = F4.year
AND F1.code = 'e1'          -- Only financial code e1
AND F2.code = 'e1'
AND F3.code = 'e1'
AND F4.code = 'e1'
AND F1.quarter = 'Q1'       -- One record for each quarter
AND F2.quarter = 'Q2'
AND F3.quarter = 'Q3'
AND F4.quarter = 'Q4'

Year  code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
1999  e1    101    93     129    145
2000  e1    153    149    157    163
2001  e1    198    204    214    231

Possible Solution – Self Join

Problems

Coding: Suppose we wanted months instead of quarters…

Performance: Suppose fin_data had 100,000 records instead of 84…

We need a better solution!

Possible Solutions (without using Characteristic Functions)

SubQueries

Possible Solution – SubQueries

Start with just Q1 and Q2, code e1, year 2001

Year code Q1Amt Q2Amt

2001 e1 198 204

Use a subquery as a field in the select clause

Write the SQL query

Possible Solution – SubQueries

Code

SELECT F1.year, F1.code, F1.amount AS Q1Amt,
       ( SELECT F2.amount
         FROM fin_data F2
         WHERE F2.quarter = 'Q2'
         AND F2.code = 'e1'
         AND F2.year = 2001 ) AS Q2Amt
FROM fin_data F1
WHERE F1.quarter = 'Q1'
AND F1.code = 'e1'
AND F1.year = 2001

Year code Q1Amt Q2Amt

2001 e1 198 204

Use a subquery as a field in the select clause

Possible Solution – SubQueries

Expand to all years

Year  code  Q1Amt  Q2Amt
1999  e1    101    93
2000  e1    153    149
2001  e1    198    204

Possible Solution – SubQueries

Code

SELECT F1.year, F1.code, F1.amount AS Q1Amt,
       ( SELECT F2.amount
         FROM fin_data F2
         WHERE F2.quarter = 'Q2'
         AND F2.code = 'e1'
         AND F2.year = F1.year ) AS Q2Amt
FROM fin_data F1
WHERE F1.quarter = 'Q1'
AND F1.code = 'e1'
-- AND F1.year = 2001

Year  code  Q1Amt  Q2Amt
1999  e1    101    93
2000  e1    153    149
2001  e1    198    204

Expand to all years

This is now a correlated subquery

Possible Solution – SubQueries

Expand to all four quarters

Year  code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
1999  e1    101    93     129    145
2000  e1    153    149    157    163
2001  e1    198    204    214    231

Possible Solution – SubQueries

Code

All four quarters

SELECT F1.year, F1.code, F1.amount AS Q1Amt,
       ( SELECT F2.amount
         FROM fin_data F2
         WHERE F2.quarter = 'Q2'
         AND F2.code = 'e1'
         AND F2.year = F1.year ) AS Q2Amt,
       ( SELECT F2.amount
         FROM fin_data F2
         WHERE F2.quarter = 'Q3'
         AND F2.code = 'e1'
         AND F2.year = F1.year ) AS Q3Amt,
       ( SELECT F2.amount
         FROM fin_data F2
         WHERE F2.quarter = 'Q4'
         AND F2.code = 'e1'
         AND F2.year = F1.year ) AS Q4Amt
FROM fin_data F1
WHERE F1.quarter = 'Q1'
AND F1.code = 'e1'

Year  code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
1999  e1    101    93     129    145
2000  e1    153    149    157    163
2001  e1    198    204    214    231

Possible Solution – SubQueries

Problems

Coding: Again, suppose we wanted months instead of quarters…

Performance: Our example requires the effective execution of 10 queries:
• The F1 query is run once to return the year, code and Q1Amt columns.
• The F2 subqueries are run 9 times to return the Q2, Q3, and Q4 amounts for 1999, 2000 and 2001.

If the F1 query returned 100,000 rows, the F2 subqueries would run 300,000 times!
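The count of 10 is simple arithmetic: one outer query plus one subquery execution per returned row per extra column. A minimal Python sketch of that count follows. It assumes, as the slide does, that the database re-runs each correlated subquery for every outer row rather than rewriting the query as a join; the function name `effective_queries` is hypothetical.

```python
def effective_queries(outer_rows: int, subquery_columns: int) -> int:
    """Count query executions when each correlated subquery is re-run per outer row."""
    # One execution of the outer (F1) query, then one subquery execution
    # for each outer row and each subquery column (Q2Amt, Q3Amt, Q4Amt).
    return 1 + outer_rows * subquery_columns


# Three years (1999, 2000, 2001) and three subquery columns, as in the example.
example_total = effective_queries(outer_rows=3, subquery_columns=3)
print(example_total)  # 10
```

With 100,000 outer rows the same formula gives 300,000 subquery runs plus the single outer query.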

We still need a better solution!

Possible Solution – Temporary Table

Strategy

Create a table with fields for year, code, Q1Amt, Q2Amt, Q3Amt, and Q4Amt

Insert Query to add records and fill in year, code, Q1Amt

Update Query to add Q2Amt

Update Query to add Q3Amt

Update Query to add Q4Amt

Possible Solution – Temporary Table

Create a Table

Create a table with fields for year, code, Q1Amt, Q2Amt, Q3Amt, and Q4Amt

CREATE TABLE QAmt

(

year char(4),

code char(2),

Q1Amt numeric(9),

Q2Amt numeric(9),

Q3Amt numeric(9),

Q4Amt numeric(9)

);

Possible Solution – Temporary Table

DROP a Table

There will be times when you want to get rid of a table you have created. Perhaps you made an error when you created it. Or, you may simply be through using it. To get rid of a table you DROP it from the database.

DROP TABLE QAmt

Possible Solution – Temporary Table

INSERT Query

Insert Query to add records and fill in year, code, Q1Amt

INSERT INTO QAmt (year, code, Q1Amt)
SELECT year, code, amount
FROM fin_data
WHERE quarter = 'Q1'
AND code = 'e1'

Possible Solution – Temporary Table

Delete Query

There may also be times when you want to keep a table, but get rid of all the records in the table… to empty it. If, for example, you find a problem with your INSERT query, you may want to empty the table before you run the corrected INSERT query. To empty a table you DELETE all the records FROM the table. To empty the QAmt table you:

DELETE FROM QAmt

Possible Solution – Temporary Table

After INSERT Query

Check the results:

SELECT *

FROM QAmt

Year  code  Q1Amt  Q2Amt   Q3Amt   Q4Amt
1999  e1    101    (NULL)  (NULL)  (NULL)
2000  e1    153    (NULL)  (NULL)  (NULL)
2001  e1    198    (NULL)  (NULL)  (NULL)

Possible Solution – Temporary Table

Update Query

Year  code  Q1Amt  Q2Amt  Q3Amt   Q4Amt
1999  e1    101    93     (NULL)  (NULL)
2000  e1    153    149    (NULL)  (NULL)
2001  e1    198    204    (NULL)  (NULL)

Update Query to add Q2Amt

UPDATE QAmt Q

SET Q2Amt =

( SELECT amount

FROM fin_data F

WHERE F.year = Q.year

AND F.code = Q.code

AND F.quarter = 'Q2' )

We do two more update queries

to fill in Q3Amt and Q4Amt

Possible Solution – Temporary Table

Is this really different?

At this point you may have noticed that the temporary table approach looks a lot like the subquery approach. In fact, it is virtually the same. It seems conceptually simpler because the temporary table allows us to break the problem down into separate and distinct subtasks. We can complete part of the solution, check the results, complete some more, and so on. For this reason, you will see it used. It has some legitimate applications, but this isn’t one of them.

This point becomes even more evident if we use either the self-join approach or the subquery approach to populate the table in one step.

Possible Solution – Temporary Table

One Step Using Self Join

INSERT INTO QAmt

(year, code, Q1Amt, Q2Amt, Q3Amt, Q4Amt)

SELECT F1.year, F1.code,

F1.amount,

F2.amount,

F3.amount,

F4.amount

FROM

fin_data F1, fin_data F2,

fin_data F3, fin_data F4

WHERE F1.year = F2.year

AND F2.year = F3.year

AND F3.year = F4.year

AND F1.code = 'e1'

AND F2.code = 'e1'

AND F3.code = 'e1'

AND F4.code = 'e1'

AND F1.quarter = 'Q1'

AND F2.quarter = 'Q2'

AND F3.quarter = 'Q3'

AND F4.quarter = 'Q4'

Possible Solution – Temporary Table

One Step Using SubQuery

INSERT INTO QAmt

(year, code, Q1Amt, Q2Amt, Q3Amt, Q4Amt)

SELECT year, code, amount,

(SELECT amount FROM fin_data F WHERE F.year = Q.year AND F.code = Q.code AND F.quarter = 'Q2' ),

(SELECT amount FROM fin_data F WHERE F.year = Q.year AND F.code = Q.code AND F.quarter = 'Q3' ),

(SELECT amount FROM fin_data F WHERE F.year = Q.year AND F.code = Q.code AND F.quarter = 'Q4' )

FROM fin_data Q
WHERE quarter = 'Q1'
AND code = 'e1'

Possible Solution – Temporary Table

Problems

We still need a better solution!

So… using a temporary table really isn’t a distinct approach at all. It can make the solution conceptually simpler by allowing us to divide the solution into discrete steps, but it does nothing to reduce overall coding complexity or to improve performance.

In addition, it introduces potential problems with regard to updates made between the time the table is created and the time it is used (currency). If you really do want a snapshot of the data at a particular time, then this can be useful. In general, it is undesirable.

You can overcome the currency problem by creating a view instead of a table but, so far, you’re still basically using either a self-join or a subquery method. If you want to create a View you’re better off using Characteristic Functions to do it!
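The currency problem can be seen in a small experiment. The sketch below is an illustration under stated assumptions, not part of the original deck: it uses SQLite (through Python's standard `sqlite3` module) as a stand-in for Sybase, restricts the pivot to Q1 and Q2 of 2001, and the view name `QAmtView` and the corrected amount 210 are made up for the demonstration.

```python
import sqlite3

# SQLite in memory stands in for the Sybase sample database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fin_data (year INTEGER, quarter TEXT, code TEXT, amount INTEGER);
    INSERT INTO fin_data VALUES
        (2001, 'Q1', 'e1', 198),
        (2001, 'Q2', 'e1', 204);

    -- Snapshot table: filled once by a self join, never refreshed.
    CREATE TABLE QAmt (year INTEGER, code TEXT, Q1Amt INTEGER, Q2Amt INTEGER);
    INSERT INTO QAmt (year, code, Q1Amt, Q2Amt)
    SELECT F1.year, F1.code, F1.amount, F2.amount
    FROM fin_data F1, fin_data F2
    WHERE F1.year = F2.year
      AND F1.code = 'e1' AND F2.code = 'e1'
      AND F1.quarter = 'Q1' AND F2.quarter = 'Q2';

    -- View (hypothetical name): the same self join, re-evaluated on every read.
    CREATE VIEW QAmtView AS
    SELECT F1.year, F1.code, F1.amount AS Q1Amt, F2.amount AS Q2Amt
    FROM fin_data F1, fin_data F2
    WHERE F1.year = F2.year
      AND F1.code = 'e1' AND F2.code = 'e1'
      AND F1.quarter = 'Q1' AND F2.quarter = 'Q2';
""")

# A later correction to the source data (210 is an invented value).
conn.execute(
    "UPDATE fin_data SET amount = 210 "
    "WHERE year = 2001 AND quarter = 'Q2' AND code = 'e1'"
)

table_rows = conn.execute("SELECT * FROM QAmt").fetchall()
view_rows = conn.execute("SELECT * FROM QAmtView").fetchall()
print(table_rows)  # [(2001, 'e1', 198, 204)]  stale snapshot
print(view_rows)   # [(2001, 'e1', 198, 210)]  current data
```

The table keeps the value it was loaded with, while the view reflects the update, which is the trade-off described above.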

Better Solution

Characteristic Functions

Characteristic Functions

Strategy

Distribute amount to one of four new columns:

year code quarter amount Q1Amt Q2Amt Q3Amt Q4Amt

1999 e1 Q1 101 101 0 0 0

1999 e1 Q2 93 0 93 0 0

1999 e1 Q3 129 0 0 129 0

1999 e1 Q4 145 0 0 0 145

2000 e1 Q1 153 153 0 0 0

2000 e1 Q2 149 0 149 0 0

2000 e1 Q3 157 0 0 157 0

2000 e1 Q4 163 0 0 0 163

etc.

Characteristic Functions

Strategy

and GROUP BY year, summing the amounts in the new columns for each year.

year fin_code Q1Amt Q2Amt Q3Amt Q4Amt

1999 e1 101 93 129 145

etc.

year code quarter amount Q1Amt Q2Amt Q3Amt Q4Amt

1999 e1 Q1 101 101 0 0 0

1999 e1 Q2 93 0 93 0 0

1999 e1 Q3 129 0 0 129 0

1999 e1 Q4 145 0 0 0 145

etc.
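The two-step strategy (distribute, then group and sum) can be sketched outside the database. The following plain-Python illustration uses the e1 rows for 1999 and 2000 shown on the slides; the variable names are hypothetical and the code only mirrors the idea, not how a database engine executes it.

```python
from collections import defaultdict

# (year, code, quarter, amount) rows for code e1, copied from the slides.
fin_data = [
    (1999, "e1", "Q1", 101), (1999, "e1", "Q2", 93),
    (1999, "e1", "Q3", 129), (1999, "e1", "Q4", 145),
    (2000, "e1", "Q1", 153), (2000, "e1", "Q2", 149),
    (2000, "e1", "Q3", 157), (2000, "e1", "Q4", 163),
]
QUARTERS = ("Q1", "Q2", "Q3", "Q4")

# Step 1: distribute each amount into one of four columns, 0 elsewhere.
distributed = [
    (year, code, tuple(amount if quarter == q else 0 for q in QUARTERS))
    for year, code, quarter, amount in fin_data
]

# Step 2: group by (year, code) and sum each of the four columns.
totals = defaultdict(lambda: [0, 0, 0, 0])
for year, code, columns in distributed:
    for i, value in enumerate(columns):
        totals[(year, code)][i] += value

for (year, code), sums in sorted(totals.items()):
    print(year, code, *sums)
# 1999 e1 101 93 129 145
# 2000 e1 153 149 157 163
```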

Characteristic Functions

WHERE can be Bad

In the previous examples, the reason we needed four copies of the table or four different queries is that the determination of which rows to include was made in the WHERE clause.

WHERE quarter = ‘Q1’ allows us to put only the amounts for Q1 in the Q1Amt column. That’s good. But… it also removes the possibility of filling in the Q2Amt column in the same query. So Q2Amt must be filled in using a separate copy of the table or a separate subquery with a WHERE clause that allows us to see the amounts in the rows WHERE quarter = ‘Q2’

Characteristic Functions

Characteristic Functions allow us to control which rows get represented in a particular column without using a WHERE clause.

With Characteristic Functions the determination of whether amount is represented in Q1Amt or Q2Amt or Q3Amt or Q4Amt is made in the SELECT clause.

Characteristic Functions

1 or 0

A Characteristic Function is an expression that evaluates to:
• 1 if a field should be represented in a particular column
• 0 if it should not.

For example, a characteristic function to determine whether an amount should be represented in Q1Amt would return 1 if quarter is ‘Q1’, 0 if it isn’t. We might call this particular characteristic function CF1.

Characteristic Functions

Assume for a moment that we have created CF1. We could then use it to control whether amount is represented in Q1Amt by defining Q1Amt as CF1 * amount.

SELECT year, code, quarter, amount, CF1, CF1 * amount AS Q1Amt
FROM fin_data
WHERE code = 'e1'

Would yield:

year  code  quarter  amount  CF1  Q1Amt
1999  e1    Q1       101     1    101
1999  e1    Q2       93      0    0
1999  e1    Q3       129     0    0
1999  e1    Q4       145     0    0
2000  e1    Q1       153     1    153
2000  e1    Q2       149     0    0
etc.

Note: this obviously won’t run yet since we haven’t actually created CF1 at this point.

Characteristic Functions

We then create CF2 to return 1 if quarter is 'Q2', 0 otherwise. We use CF2 in the same SELECT clause to control whether amount is represented in Q2Amt.

SELECT year, code, quarter, amount,
CF1, CF1 * amount AS Q1Amt,
CF2, CF2 * amount AS Q2Amt
FROM fin_data
WHERE code = 'e1'

Would yield:

year  code  quarter  amount  CF1  Q1Amt  CF2  Q2Amt
1999  e1    Q1       101     1    101    0    0
1999  e1    Q2       93      0    0      1    93
1999  e1    Q3       129     0    0      0    0
1999  e1    Q4       145     0    0      0    0
2000  e1    Q1       153     1    153    0    0
2000  e1    Q2       149     0    0      1    149
etc.

Characteristic Functions

We then GROUP BY year and SUM the Q1Amt and Q2Amt for the year

SELECT year, MAX(code),
SUM(CF1 * amount) AS Q1Amt,
SUM(CF2 * amount) AS Q2Amt
FROM fin_data
WHERE code = 'e1'
GROUP BY year

Would yield:

year  code  Q1Amt  Q2Amt
1999  e1    101    93
2000  e1    153    149
2001  e1    198    204

Characteristic Functions

Implementing CF1

Recall that CF1 evaluates to:
• 1 if quarter is 'Q1',
• 0 otherwise.

Both Sybase and SQL Server provide a CASE expression that can be used to implement CF1 as

(CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END)
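To see the CASE expression behave as a characteristic function, the sketch below runs it against the four 1999 rows for code e1. It is an illustration only: SQLite (via Python's standard `sqlite3` module) is assumed as a stand-in for Sybase, since it accepts the same CASE syntax.

```python
import sqlite3

# SQLite in memory stands in for the Sybase sample database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fin_data (year INTEGER, quarter TEXT, code TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO fin_data VALUES (?, ?, ?, ?)",
    [(1999, "Q1", "e1", 101), (1999, "Q2", "e1", 93),
     (1999, "Q3", "e1", 129), (1999, "Q4", "e1", 145)],
)

# CF1 is 1 only on the Q1 row, so CF1 * amount keeps the Q1 amount
# and zeroes out every other quarter.
cf1_rows = conn.execute("""
    SELECT quarter, amount,
           (CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) AS CF1,
           (CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount AS Q1Amt
    FROM fin_data
    WHERE code = 'e1'
    ORDER BY quarter
""").fetchall()

for row in cf1_rows:
    print(row)
# ('Q1', 101, 1, 101)
# ('Q2', 93, 0, 0)
# ('Q3', 129, 0, 0)
# ('Q4', 145, 0, 0)
```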

Characteristic Functions

CF1 and CF2

SELECT year, code, quarter, amount,
(CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) AS CF1,
(CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) AS CF2
FROM fin_data
WHERE code = 'e1'

Yields:

Year  code  quarter  amount  CF1  CF2
1999  e1    Q1       101     1    0
1999  e1    Q2       93      0    1
1999  e1    Q3       129     0    0
1999  e1    Q4       145     0    0
2000  e1    Q1       153     1    0
2000  e1    Q2       149     0    1

Characteristic Functions

Using CF1 & CF2

SELECT year, code, quarter, amount,
(CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount AS Q1Amt,
(CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) * amount AS Q2Amt
FROM fin_data
WHERE code = 'e1'

Yields:

Year  code  quarter  amount  Q1Amt  Q2Amt
1999  e1    Q1       101     101    0
1999  e1    Q2       93      0      93
1999  e1    Q3       129     0      0
1999  e1    Q4       145     0      0
2000  e1    Q1       153     153    0
2000  e1    Q2       149     0      149

Characteristic Functions

Group By Year

SELECT year, MAX(code) AS fin_code,
SUM((CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount) AS Q1Amt,
SUM((CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) * amount) AS Q2Amt
FROM fin_data
WHERE code = 'e1'
GROUP BY year

Yields:

year  fin_code  Q1Amt  Q2Amt
1999  e1        101    93
2000  e1        153    149
2001  e1        198    204

Characteristic Functions

All 4 quarters

SELECT year, MAX(code) AS fin_code,
SUM((CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount) AS Q1Amt,
SUM((CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) * amount) AS Q2Amt,
SUM((CASE WHEN quarter = 'Q3' THEN 1 ELSE 0 END) * amount) AS Q3Amt,
SUM((CASE WHEN quarter = 'Q4' THEN 1 ELSE 0 END) * amount) AS Q4Amt
FROM fin_data
WHERE code = 'e1'
GROUP BY year

Yields:

year  fin_code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
1999  e1        101    93     129    145
2000  e1        153    149    157    163
2001  e1        198    204    214    231

Characteristic Functions

Can you modify the query to show all codes, not just e1?

year  code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
1999  e1    101    93     129    145
1999  e2    403    459    609    632
1999  e3    1437   2033   2184   2145
1999  e4    623    784    856    1043
1999  e5    381    402    412    467
1999  r1    1023   2033   2998   3014
1999  r2    234    459    601    944
2000  e1    153    149    157    163
2000  e2    643    687    898    923
etc.

Characteristic Functions

Can you modify the query to show all codes, not just e1?

year  code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
1999  e1    101    93     129    145
1999  e2    403    459    609    632
1999  e3    1437   2033   2184   2145
1999  e4    623    784    856    1043
etc.

SELECT year, code,
SUM((CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount) AS Q1Amt,
SUM((CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) * amount) AS Q2Amt,
SUM((CASE WHEN quarter = 'Q3' THEN 1 ELSE 0 END) * amount) AS Q3Amt,
SUM((CASE WHEN quarter = 'Q4' THEN 1 ELSE 0 END) * amount) AS Q4Amt
FROM fin_data
GROUP BY year, code
ORDER BY year, code
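The all-codes query can be tried end to end on a small sample. This sketch loads only the 1999 rows for codes e1 and e2 (amounts taken from the result table on the slide) into an in-memory SQLite database, which is assumed here as a stand-in for Sybase, and runs the query unchanged.

```python
import sqlite3

# SQLite in memory stands in for the Sybase sample database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fin_data (year INTEGER, quarter TEXT, code TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO fin_data VALUES (?, ?, ?, ?)",
    [(1999, "Q1", "e1", 101), (1999, "Q2", "e1", 93),
     (1999, "Q3", "e1", 129), (1999, "Q4", "e1", 145),
     (1999, "Q1", "e2", 403), (1999, "Q2", "e2", 459),
     (1999, "Q3", "e2", 609), (1999, "Q4", "e2", 632)],
)

# One pass over fin_data: each SUM adds up only the rows whose
# characteristic function is 1 for that quarter.
pivot = conn.execute("""
    SELECT year, code,
           SUM((CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount) AS Q1Amt,
           SUM((CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) * amount) AS Q2Amt,
           SUM((CASE WHEN quarter = 'Q3' THEN 1 ELSE 0 END) * amount) AS Q3Amt,
           SUM((CASE WHEN quarter = 'Q4' THEN 1 ELSE 0 END) * amount) AS Q4Amt
    FROM fin_data
    GROUP BY year, code
    ORDER BY year, code
""").fetchall()

for row in pivot:
    print(row)
# (1999, 'e1', 101, 93, 129, 145)
# (1999, 'e2', 403, 459, 609, 632)
```

Because code is now part of the GROUP BY, it can be selected directly and the MAX(code) wrapper used in the single-code version is no longer needed.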

Characteristic Functions

Result:

We create a “Pivot” with only a single pass through the table

and

without resorting to programming!

year  code  Q1Amt  Q2Amt  Q3Amt  Q4Amt
1999  e1    101    93     129    145
1999  e2    403    459    609    632
1999  e3    1437   2033   2184   2145
1999  e4    623    784    856    1043
1999  e5    381    402    412    467
1999  r1    1023   2033   2998   3014
1999  r2    234    459    601    944
etc.

Characteristic Functions

Add Slide

Timing for Join

Timing for SubQuery

Timing for Temporary Table

Timing for Characteristic Function
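One way such timings could be collected is sketched below. Everything in it is an assumption for illustration: SQLite replaces Sybase, the rows are synthetic (200 invented years, 5 codes, 4 quarters, arbitrary amounts), only the self join and the characteristic-function query are compared, and the helper `timed` is hypothetical. Timings measured this way say nothing definite about Sybase; the sketch only checks that both queries return the same pivot and prints how long each took.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fin_data (year INTEGER, quarter TEXT, code TEXT, amount INTEGER)"
)
# Synthetic rows for timing only: 200 made-up years x 5 codes x 4 quarters.
conn.executemany(
    "INSERT INTO fin_data VALUES (?, ?, ?, ?)",
    [(year, f"Q{q}", f"e{c}", year + 10 * q + c)
     for year in range(1800, 2000)
     for c in range(1, 6)
     for q in range(1, 5)],
)

SELF_JOIN = """
    SELECT F1.year, F1.code, F1.amount, F2.amount, F3.amount, F4.amount
    FROM fin_data F1, fin_data F2, fin_data F3, fin_data F4
    WHERE F1.year = F2.year AND F2.year = F3.year AND F3.year = F4.year
      AND F1.code = 'e1' AND F2.code = 'e1'
      AND F3.code = 'e1' AND F4.code = 'e1'
      AND F1.quarter = 'Q1' AND F2.quarter = 'Q2'
      AND F3.quarter = 'Q3' AND F4.quarter = 'Q4'
    ORDER BY F1.year
"""

CHARACTERISTIC = """
    SELECT year, MAX(code),
           SUM((CASE WHEN quarter = 'Q1' THEN 1 ELSE 0 END) * amount),
           SUM((CASE WHEN quarter = 'Q2' THEN 1 ELSE 0 END) * amount),
           SUM((CASE WHEN quarter = 'Q3' THEN 1 ELSE 0 END) * amount),
           SUM((CASE WHEN quarter = 'Q4' THEN 1 ELSE 0 END) * amount)
    FROM fin_data
    WHERE code = 'e1'
    GROUP BY year
    ORDER BY year
"""


def timed(sql):
    """Run a query and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return rows, time.perf_counter() - start


join_rows, join_seconds = timed(SELF_JOIN)
cf_rows, cf_seconds = timed(CHARACTERISTIC)
print(f"self join: {join_seconds:.4f}s")
print(f"characteristic functions: {cf_seconds:.4f}s")
```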

