10.10 Grouping with WITH ROLLUP
The GROUP BY clause has many features to group data and calculate aggregated data, such as the total number of penalties or the sum of all penalties. However, all statements return results in which all data is on the same level of aggregation. But what if we want to see data belonging to different aggregation levels within one statement? Imagine that with one statement we want to see the total penalty amount for each player, followed by the total penalty amount for all players. The forms of the GROUP BY clauses discussed so far do not make this possible. To achieve the desired result, more than two groupings within one GROUP BY clause are required. By adding the specification WITH ROLLUP to the GROUP BY clause, it becomes possible.
Example 10.23: For each player, find the sum of all his or her penalties, plus the sum of all penalties.
Use the UNION operator as a way to combine these two groupings into one statement.
SELECT PLAYERNO, SUM(AMOUNT)
FROM PENALTIES
GROUP BY PLAYERNO
UNION
SELECT NULL, SUM(AMOUNT)
FROM PENALTIES
The result is:
PLAYERNO SUM(AMOUNT)
-------- -----------
6 100.00
8 25.00
27 175.00
44 130.00
104 50.00
? 480.00
Explanation: The rows of this intermediate result in which the PLAYERNO column is filled form the result of the first select block. The rows in which PLAYERNO is equal to null make up the result of the second select block. The first five rows contain data on the aggregation level of the player numbers, and the last row contains data on the aggregation level of all rows.
The specification WITH ROLLUP has been introduced to simplify this kind of statement. WITH ROLLUP can be used to ask for multiple groupings with one GROUP BY clause. Using this approach, the previous statement would then be
SELECT PLAYERNO, SUM(AMOUNT)
FROM PENALTIES
GROUP BY PLAYERNO WITH ROLLUP
Explanation: The result of this statement is the same as the previous one. The specification WITH ROLLUP indicates that after the result has been grouped on [PLAYERNO], another grouping is neededin this case, on all rows.
To further define this concept, imagine that the expressions E1, E2, E3, and E4 are specified in a GROUP BY clause. The grouping performed is [E1, E2, E3, E4]. When we add the specification WITH ROLLUP to this GROUP BY clause, an entire set of groupings is performed: [E1, E2, E3, E4], [E1, E2, E3], [E1, E2], [E1], and finally []. The specification [] means that all rows are grouped into one group. The specified grouping is seen as the highest aggregation level that is asked and also indicates that all higher aggregation levels must be calculated again. To aggregate upward is called rollup. So the result of this statement contains data on five different levels of aggregation.
If an expression occurs in the SELECT clause in which the result of a certain grouping is not grouped, the null value is placed in the result.
Example 10.24: For each combination of sex-town, get the number of players, total number of players per sex, and total number of players in the entire table.
SELECT SEX, TOWN, COUNT(*)
FROM PLAYERS
GROUP BY SEX, TOWN WITH ROLLUP
The result is:
SEX TOWN COUNT(*)
--- --------- --------
M Stratford 7
M Inglewood 1
M Douglas 1
M ? 9
F Midhurst 1
F Inglewood 1
F Plymouth 1
F Eltham 2
F ? 5
? ? 14
Explanation: This result has three levels of aggregation. Rows 1, 2, 3, 5, 6, 7, and 8 form the lowest level and have been added because of the grouping [SEX, TOWN]; rows 4 and 9 have been added because of the grouping [SEX]; and the last row forms the highest level of aggregation and has been added because of the grouping []. It contains the total number of players.
Exercise 10.20: For each team, get the number of matches played and also the total number of matches.
Exercise 10.21: Group the matches by the name of the player and division of the team, and execute a ROLLUP. Then for each group, get the name of the player, division of the team, and total number of sets won.
10.11 Answers
10.1
SELECT JOINED
FROM PLAYERS
GROUP BY JOINED
10.2
SELECT JOINED, COUNT(*)
FROM PLAYERS
GROUP BY JOINED
10.3
SELECT PLAYERNO, AVG(AMOUNT), COUNT(*)
FROM PENALTIES
GROUP BY PLAYERNO
10.4
SELECT TEAMNO, COUNT(*), SUM(WON)
FROM MATCHES
WHERE TEAMNO IN
(SELECT TEAMNO
FROM TEAMS
WHERE DIVISION = 'first')
GROUP BY TEAMNO
10.5
SELECT WON, LOST, COUNT(*)
FROM MATCHES
WHERE WON > LOST
GROUP BY WON, LOST
ORDER BY WON, LOST
10.6
SELECT P.TOWN, T.DIVISION, SUM(WON)
FROM (MATCHES AS M INNER JOIN PLAYERS AS P
ON M.PLAYERNO = P.PLAYERNO)
INNER JOIN TEAMS AS T
ON M.TEAMNO = T.TEAMNO
ROUP BY P.TOWN, T.DIVISION
ORDER BY P.TOWN
10.7
SELECT NAME, INITIALS, COUNT(*)
FROM PLAYERS AS P INNER JOIN PENALTIES AS PEN
ON P.PLAYERNO = PEN.PLAYERNO
WHERE P.TOWN = 'Inglewood'
GROUP BY P.PLAYERNO, NAME, INITIALS
10.8
SELECT T.TEAMNO, DIVISION, SUM(WON)
FROM TEAMS AS T, MATCHES AS M
WHERE T.TEAMNO = M.TEAMNO
GROUP BY T.TEAMNO, DIVISION
10.9
SELECT LENGTH(RTRIM(NAME)), COUNT(*)
FROM PLAYERS
GROUP BY LENGTH(RTRIM(NAME))
10.10
SELECT ABS(WON - LOST), COUNT(*)
FROM MATCHES
GROUP BY ABS(WON LOST)
10.11
SELECT YEAR(BEGIN_DATE), MONTH(BEGIN_DATE), COUNT(*)
FROM COMMITTEE_MEMBERS
GROUP BY YEAR(BEGIN_DATE), MONTH(BEGIN_DATE)
ORDER BY YEAR(BEGIN_DATE), MONTH(BEGIN_DATE)
10.12
Although this column appears in the SELECT clause, the result of the DIVISION column has not been grouped.
The NAME column cannot appear like this in the SELECT clause because the result has not been grouped on the full NAME column.
The PLAYERNO column appears in the SELECT clause, although the result has not been grouped; furthermore, the column does not appear as a parameter of an aggregation function.
10.13
Superfluous
Not superfluous
Superfluous
10.14
SELECT AVG(NUMBERS)
FROM (SELECT COUNT(*) AS NUMBERS
FROM PLAYERS
GROUP BY TOWN) AS TOWNS
10.15
SELECT TEAMS.TEAMNO, DIVISION, NUMBER_PLAYERS
FROM TEAMS LEFT OUTER JOIN
(SELECT TEAMNO, COUNT(*) AS NUMBER_PLAYERS
FROM MATCHES
GROUP BY TEAMNO) AS M
ON (TEAMS.TEAMNO = M.TEAMNO)
10.16
SELECT PLAYERS.PLAYERNO, NAME, SUM_AMOUNT,
NUMBER_TEAMS
FROM (PLAYERS LEFT OUTER JOIN
(SELECT PLAYERNO, SUM(AMOUNT) AS SUM_AMOUNT
FROM PENALTIES
GROUP BY PLAYERNO) AS TOTALS
ON (PLAYERS.PLAYERNO = TOTALS.PLAYERNO))
LEFT OUTER JOIN
(SELECT PLAYERNO, COUNT(*) AS NUMBER_TEAMS
FROM TEAMS
WHERE DIVISION = 'first'
GROUP BY PLAYERNO) AS NUMBERS
ON (PLAYERS.PLAYERNO = NUMBERS.PLAYERNO)
10.17
SELECT TEAMNO, COUNT(DISTINCT PLAYERNO)
FROM MATCHES
WHERE TEAMNO IN
(SELECT TEAMNO
FROM PLAYERS AS P INNER JOIN TEAMS AS T
ON P.PLAYERNO = T.PLAYERNO
AND TOWN = 'Stratford')
AND WON > LOST
GROUP BY TEAMNO
10.18
SELECT PLAYERNO, NAME, JOINED - AVERAGE
FROM PLAYERS,
(SELECT AVG(JOINED) AS AVERAGE
FROM PLAYERS) AS T
10.19
SELECT PLAYERNO, NAME, JOINED AVERAGE
FROM PLAYERS,
(SELECT TOWN, AVG(JOINED) AS AVERAGE
FROM PLAYERS
GROUP BY TOWN) AS TOWNS
WHERE PLAYERS.TOWN = TOWNS.TOWN
10.20
SELECT TEAMNO, COUNT(*)
FROM MATCHES
GROUP BY TEAMNO WITH ROLLUP
10.21
SELECT P.NAME, T.DIVISION, SUM(WON)
FROM (MATCHES AS M INNER JOIN PLAYERS AS P
ON M.PLAYERNO = P.PLAYERNO)
INNER JOIN TEAMS AS T
ON M.TEAMNO = T.TEAMNO
GROUP BY P.NAME, T.DIVISION WITH ROLLUP
SQL for MySQL Developers: A Comprehensive Tutorial and Reference
By Rick F. van der Lans
Published by Addison Wesley Professional
ISBN-10: 0-13-149735-9
Published: Apr 20, 2007
Dimensions 7x9-1/4
Pages: 1032
Buy this book