Section 5 - Grouping Data

Date post: 05-Jan-2016
Upload: gus
Transcript
Section 5 - Grouping Data

The GROUP BY clause allows the grouping of data

Aggregate functions are most often used with the GROUP BY clause

GROUP BY divides a table into sets of rows; aggregate functions then return one summary value for each set.

GROUP BY Syntax

SELECT select_list
FROM table_list
[WHERE conditions]
GROUP BY group_by_list;

Example

SELECT pub_id, COUNT(title)
FROM titles
GROUP BY pub_id;

All items in the Select list that are not in the Group By list must generate a single value for each group
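A minimal sketch of the example above, run through Python's built-in sqlite3 module. SQLite, the in-memory table, and the three sample rows are assumptions made for illustration; they are not the course database. An ORDER BY is added only to make the printed order predictable.

```python
import sqlite3

# Hypothetical sample rows; the real titles table has more columns and data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title TEXT, pub_id TEXT)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [("Book A", "0736"), ("Book B", "0736"), ("Book C", "0877")],
)

# One output row per pub_id; COUNT(title) yields a single value for each group.
counts = conn.execute(
    "SELECT pub_id, COUNT(title) FROM titles GROUP BY pub_id ORDER BY pub_id"
).fetchall()
print(counts)
```

Here pub_id is in the GROUP BY list and COUNT(title) is an aggregate, so every select-list item produces one value per group.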

Groups within Groups

You may nest Groups within other groups by separating the columns with commas

Example:

SELECT pub_id, type, COUNT(type)
FROM titles
GROUP BY pub_id, type;
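The nested grouping can be tried with a small sketch. The SQLite database and the four sample rows below are hypothetical stand-ins for the course data, and the ORDER BY is added only for a stable print order.

```python
import sqlite3

# Hypothetical sample rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (pub_id TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [
        ("0736", "business"),
        ("0736", "business"),
        ("0736", "psychology"),
        ("0877", "cooking"),
    ],
)

# Each distinct (pub_id, type) pair becomes its own group.
nested = conn.execute(
    "SELECT pub_id, type, COUNT(type) FROM titles "
    "GROUP BY pub_id, type ORDER BY pub_id, type"
).fetchall()
for row in nested:
    print(row)
```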

Restrictions

Again: Each item in the SELECT list must produce a single value for each group

Wrong (type is neither in the GROUP BY list nor inside an aggregate):

SELECT pub_id, type, COUNT(type)
FROM titles
GROUP BY pub_id;

More Restrictions

You can NOT use aggregate expressions such as SUM(price) in the GROUP BY clause

Wrong:

SELECT pub_id, SUM(price)
FROM titles
GROUP BY pub_id, SUM(price);
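As a quick check, the statement marked Wrong above can be sent to a database to see it rejected. This sketch assumes SQLite through Python's sqlite3 module and a hypothetical one-row table; other products report the problem with different wording.

```python
import sqlite3

# Hypothetical table and row, used only to let the statement be parsed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (pub_id TEXT, price REAL)")
conn.execute("INSERT INTO titles VALUES ('0736', 19.99)")

try:
    conn.execute(
        "SELECT pub_id, SUM(price) FROM titles GROUP BY pub_id, SUM(price)"
    )
    error = None
except sqlite3.OperationalError as exc:
    # SQLite refuses to group by an aggregate result.
    error = str(exc)

print(error)
```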

No Column Numbers

Unlike the ORDER BY clause, you cannot use the select list position number in the GROUP BY clause. Some database products accept it as an extension, but it is not portable.

Wrong:

SELECT pub_id, SUM(price)
FROM titles
GROUP BY 1;

Multiple Summaries

To see the summary values for a publisher and for the type of books within that publisher you will need two SELECT statements

SELECT pub_id, SUM(price)
FROM titles
GROUP BY pub_id;

SELECT pub_id, type, SUM(price)
FROM titles
GROUP BY pub_id, type;
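The two summary levels can be compared side by side in a short sketch. It assumes SQLite via Python's sqlite3 module and four made-up rows; ORDER BY clauses are added for a stable print order.

```python
import sqlite3

# Hypothetical sample rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (pub_id TEXT, type TEXT, price REAL)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [
        ("0736", "business", 10.0),
        ("0736", "business", 20.0),
        ("0736", "psychology", 5.0),
        ("0877", "cooking", 12.0),
    ],
)

# First statement: one total per publisher.
per_pub = conn.execute(
    "SELECT pub_id, SUM(price) FROM titles GROUP BY pub_id ORDER BY pub_id"
).fetchall()

# Second statement: one total per type within each publisher.
per_pub_type = conn.execute(
    "SELECT pub_id, type, SUM(price) FROM titles "
    "GROUP BY pub_id, type ORDER BY pub_id, type"
).fetchall()

print(per_pub)
print(per_pub_type)
```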

Exercise

Display a list of the authors and the state they live in. Sort the list by the author’s last name within state

Discussion

SELECT au_lname, au_fname, state
FROM authors
ORDER BY state, au_lname;

We don't need a Group By for this statement because no summary information was asked for

Exercise

Display a list of states and the number of authors that are from each state. Also, show how many different cities are in each state. Sort in state order.

Discussion

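SELECT state, count(*), count(distinct city)
FROM authors
GROUP BY state
ORDER BY state;

This gets us the number of authors per state and the number of distinct cities in each state. The sort uses state only, because au_lname is not a grouping column and has no single value per group. If we didn't use the DISTINCT keyword, count(city) would count every author row that has a city, so a city with several authors would be counted several times.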
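A small sketch of the effect of DISTINCT inside COUNT, assuming SQLite via Python's sqlite3 module. The four author rows are invented for illustration; a plain COUNT(city) column is added to the select list for comparison.

```python
import sqlite3

# Hypothetical sample rows; two CA authors share the same city.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (au_lname TEXT, city TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO authors VALUES (?, ?, ?)",
    [
        ("White", "Oakland", "CA"),
        ("Green", "Oakland", "CA"),
        ("Carson", "Berkeley", "CA"),
        ("Smith", "Lawrence", "KS"),
    ],
)

# Columns: state, authors, distinct cities, non-NULL city values.
per_state = conn.execute(
    "SELECT state, COUNT(*), COUNT(DISTINCT city), COUNT(city) "
    "FROM authors GROUP BY state ORDER BY state"
).fetchall()
for row in per_state:
    print(row)
```

With this data CA has three authors but only two distinct cities, while COUNT(city) without DISTINCT simply repeats the author count.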

NULLs and GROUPS

NULLs never equal another NULL

BUT... GROUP BY will create a separate group for the NULLs

Think of it as a Group of Unknowns

Example

The Type column contains NULLs

SELECT type, COUNT(*)
FROM titles
GROUP BY type;

The NULL group returns a count of 1. If we used COUNT(type) instead of COUNT(*), we'd get back a zero for that group instead. Why?

Discussion

COUNT(*) counts whole rows, and there is 1 row in the NULL type group

COUNT(type) counts only the non-NULL values of the type column, and there are zero non-NULL values in the NULL group.
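The difference between the two counts can be seen in a short sketch. It assumes SQLite via Python's sqlite3 module and three invented rows, one of which has a NULL type (None in Python).

```python
import sqlite3

# Hypothetical sample rows; the third row has a NULL type.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [("Book A", "business"), ("Book B", "business"), ("Book C", None)],
)

# Map each type to (COUNT(*), COUNT(type)).
counts_by_type = {
    row[0]: (row[1], row[2])
    for row in conn.execute(
        "SELECT type, COUNT(*), COUNT(type) FROM titles GROUP BY type"
    )
}
print(counts_by_type)
```

The NULL group (key None) holds one row, so COUNT(*) is 1, but it holds no non-NULL type values, so COUNT(type) is 0.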

More NULLs

More than one NULL in a column?

SELECT advance, COUNT(*)
FROM titles
GROUP BY advance;

Two books have a NULL advance and they are grouped together. [Note: an advance of zero forms a different group]
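A sketch of how several NULLs fall into one group while zero stays separate. SQLite via Python's sqlite3 module and the five sample advances are assumptions for illustration.

```python
import sqlite3

# Hypothetical sample rows: two NULL advances, one zero, two of 5000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title TEXT, advance INTEGER)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [
        ("Book A", None),
        ("Book B", None),
        ("Book C", 0),
        ("Book D", 5000),
        ("Book E", 5000),
    ],
)

# Map each advance value to the number of rows in its group.
rows_per_advance = dict(
    conn.execute("SELECT advance, COUNT(*) FROM titles GROUP BY advance").fetchall()
)
print(rows_per_advance)
```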

GROUP BY with WHERE

The WHERE clause allows grouping of a subset of rows.

The WHERE clause acts first to find the rows you want

Then the GROUP BY clause divides the rows into groups

SELECT type, AVG(price)
FROM titles
WHERE advance > 5000
GROUP BY type;

No WHERE

Same statement, no WHERE

SELECT type, AVG(price)
FROM titles
GROUP BY type;

NULL group returned

[In the previous example, the WHERE clause eliminated the NULLs]
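The two statements, with and without the WHERE clause, can be compared in one sketch. It assumes SQLite via Python's sqlite3 module and four invented rows; the row with a NULL type also has a NULL advance, which is why the WHERE clause filters it out.

```python
import sqlite3

# Hypothetical sample rows; the last row is all NULLs apart from its title.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE titles (title TEXT, type TEXT, price REAL, advance INTEGER)"
)
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?, ?)",
    [
        ("Book A", "business", 20.0, 8000),
        ("Book B", "business", 10.0, 6000),
        ("Book C", "cooking", 12.0, 4000),
        ("Book D", None, None, None),
    ],
)

# WHERE runs first, so only rows with advance > 5000 reach GROUP BY.
with_where = dict(
    conn.execute(
        "SELECT type, AVG(price) FROM titles WHERE advance > 5000 GROUP BY type"
    ).fetchall()
)

# Without WHERE every row is grouped, including the NULL type row.
without_where = dict(
    conn.execute("SELECT type, AVG(price) FROM titles GROUP BY type").fetchall()
)

print(with_where)
print(without_where)
```

A comparison such as advance > 5000 is never true for a NULL advance, so that row is dropped before any groups are formed.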

ORDER the GROUPS

GROUP BY puts rows into sets, but doesn't put them in order.

SELECT type, AVG(price)
FROM titles
WHERE advance > 5000
GROUP BY type
ORDER BY 2;
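A sketch of sorting the groups by the second select-list column, the average price. SQLite via Python's sqlite3 module and the three sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical sample rows, all with an advance above 5000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (type TEXT, price REAL, advance INTEGER)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [
        ("business", 20.0, 8000),
        ("cooking", 12.0, 7000),
        ("psychology", 15.0, 6000),
    ],
)

# ORDER BY 2 sorts the finished groups by AVG(price), lowest first.
ordered = conn.execute(
    "SELECT type, AVG(price) FROM titles "
    "WHERE advance > 5000 GROUP BY type ORDER BY 2"
).fetchall()
print(ordered)
```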

Exercise

Show the average position in which an author appears on a book, counting only books where the author's royalty share is less than 100%. Also show the number of books written by the author. Identify each author by social security number. Sort by number of books, with the most books first, and by social security number within each book count.

Discussion

SELECT au_id, AVG(au_ord), COUNT(title_id)
FROM titleauthors
WHERE royaltyshare < 1.0
GROUP BY au_id
ORDER BY 3 DESC, au_id;

HAVING Clause

HAVING is like a WHERE clause for a GROUP

WHERE limits rows

HAVING limits GROUPs

HAVING Syntax

SELECT select_list
FROM table_list
[WHERE conditions]
GROUP BY group_list
[HAVING conditions];

HAVING Aggregates

The WHERE conditions apply before Aggregates are calculated

Then the HAVING conditions apply after Aggregates are calculated

HAVING vs. WHERE

WHERE comes after the FROM

HAVING comes after the GROUP BY

WHERE conditions cannot include Aggregates

HAVING conditions almost always include Aggregates

Example

SELECT type, count(*)
FROM titles
GROUP BY type
HAVING COUNT(*) > 1;

NOTE: Cannot use WHERE instead of HAVING since WHERE does not allow Aggregates
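A sketch of the HAVING example, followed by the same condition placed in a WHERE clause to show it being rejected. SQLite via Python's sqlite3 module and the six sample rows are assumptions; other products word the error differently.

```python
import sqlite3

# Hypothetical sample rows: business x2, cooking x1, psychology x3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [
        ("Book A", "business"),
        ("Book B", "business"),
        ("Book C", "cooking"),
        ("Book D", "psychology"),
        ("Book E", "psychology"),
        ("Book F", "psychology"),
    ],
)

# HAVING filters whole groups after COUNT(*) has been calculated.
big_groups = conn.execute(
    "SELECT type, COUNT(*) FROM titles "
    "GROUP BY type HAVING COUNT(*) > 1 ORDER BY type"
).fetchall()
print(big_groups)

# The same test in WHERE fails, because WHERE runs before aggregation.
try:
    conn.execute(
        "SELECT type, COUNT(*) FROM titles WHERE COUNT(*) > 1 GROUP BY type"
    )
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)
print(where_error)
```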

HAVING without Aggregates

Applies to grouping columns

SELECT type
FROM titles
GROUP BY type
HAVING type LIKE 'p%';

You could have used the WHERE clause to find types that begin with 'p' as well
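A sketch comparing the HAVING form with the equivalent WHERE form. SQLite via Python's sqlite3 module and the four sample rows are assumptions for illustration; ORDER BY is added for a stable print order.

```python
import sqlite3

# Hypothetical sample rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (title TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?)",
    [
        ("Book A", "popular_comp"),
        ("Book B", "psychology"),
        ("Book C", "psychology"),
        ("Book D", "business"),
    ],
)

# HAVING on a grouping column: groups are formed, then filtered.
via_having = conn.execute(
    "SELECT type FROM titles GROUP BY type HAVING type LIKE 'p%' ORDER BY type"
).fetchall()

# WHERE on the same column: rows are filtered, then grouped.
via_where = conn.execute(
    "SELECT type FROM titles WHERE type LIKE 'p%' GROUP BY type ORDER BY type"
).fetchall()

print(via_having)
print(via_where)
```

Both forms return the same groups here; the WHERE form discards unwanted rows before any grouping work is done.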

Exercise

List the editor positions that have at least three editors

Answer

SELECT ed_pos, count(*)
FROM editors
GROUP BY ed_pos
HAVING count(*) >= 3;

HAVING Conditions

You may use more than one condition on a HAVING clause

SELECT pub_id, SUM(advance), AVG(price)
FROM titles
GROUP BY pub_id
HAVING SUM(advance) > 15000
AND AVG(price) < 20
AND pub_id > '0800';

Exercise

For each publisher, list the publisher id, the average advance, and the total number of books, counting only books priced at more than $10.00. Include a publisher only if the total price of those books is more than eighty dollars and there is more than one such book. Sort by pub_id and book count.

Discussion

SELECT pub_id, AVG(advance), COUNT(*)
FROM titles
WHERE price > 10
GROUP BY pub_id
HAVING SUM(price) > 80
AND COUNT(*) > 1
ORDER BY 1, 3;

Discussion

The WHERE clause first eliminates all books that do not cost more than $10

Then the GROUP BY forms the pub_id groups

Then the HAVING clause eliminates any group whose total price ( sum(price) ) is not greater than $80 and any pub_id group that does not contain more than one of the remaining books.
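The three steps above can be traced with a small sketch. SQLite via Python's sqlite3 module and the seven sample rows are assumptions chosen so that each publisher is affected by a different step.

```python
import sqlite3

# Hypothetical sample rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (pub_id TEXT, price REAL, advance INTEGER)")
conn.executemany(
    "INSERT INTO titles VALUES (?, ?, ?)",
    [
        ("0736", 50.0, 1000),  # kept by WHERE
        ("0736", 45.0, 3000),  # kept by WHERE
        ("0736", 5.0, 9000),   # dropped by WHERE (price <= 10)
        ("0877", 90.0, 4000),  # kept by WHERE
        ("0877", 8.0, 500),    # dropped by WHERE (price <= 10)
        ("1389", 20.0, 2000),  # kept by WHERE
        ("1389", 30.0, 4000),  # kept by WHERE
    ],
)

summary = conn.execute(
    "SELECT pub_id, AVG(advance), COUNT(*) FROM titles "
    "WHERE price > 10 "
    "GROUP BY pub_id "
    "HAVING SUM(price) > 80 AND COUNT(*) > 1 "
    "ORDER BY 1, 3"
).fetchall()
print(summary)
```

With this data, publisher 0877 is removed by HAVING because only one of its books survives the WHERE clause, and publisher 1389 is removed because its remaining prices total 50. Publisher 0736 keeps two books totalling 95, and its average advance is taken over those two rows only.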

Section 5 - Last Slide

Please complete Assignment 4

