MySQL Multi-Aggregated Rows in Crosstab Queries

MySQL’s crosstabs contain aggregate functions on two or more fields, presented in a tabular format. In a multi-aggregate crosstab query, two different functions can be applied to the same field or the same function can be applied to multiple fields on the same (row or column) axis. Rob Gravelle shows you how to apply two different functions to the same field in order to create grouping levels in the row axis.

Today’s topic of discussion is crosstabs, which contain multiple aggregate functions in the row axis of a tabular resultset. Recall from the the All About the Crosstab Query article that an aggregate function is one that summarizes a group of related data in some way. Examples of aggregate functions include COUNT, SUM, AVG, MIN, and MAX. In a multi-aggregate crosstab query, two different functions can be applied to the same field or the same function can be applied to two or more fields. Today we’ll create a query that applies two different functions to the same field in order to create grouping levels in the row axis.

Recap of Crosstab Basics

In the Tips for Simplifying Crosstab Query Statements article, we took the complex SQL statement that we created in the All About the Crosstab Query article and simplified it to a more streamlined version:

SELECT CASE WHEN Month_Num IS NULL 
            THEN 'TOTAL' 
            ELSE Month 
       END      AS 'Month',
  REGION_1 AS 'REGION 1',  
  REGION_2 AS 'REGION 2',
  REGION_3 AS 'REGION 3',
  REGION_4 AS 'REGION 4',
  REGION_5 AS 'REGION 5',       
  TOTAL
FROM  (SELECT MONTH(CREATION_DATE)                          AS Month_Num,
     MONTHNAME(CREATION_DATE)                               AS Month,
     COUNT(CASE WHEN REGION_CODE ='01' THEN FEE_NUMBER END) AS REGION_1,
     COUNT(CASE WHEN REGION_CODE ='02' THEN FEE_NUMBER END) AS REGION_2,
     COUNT(CASE WHEN REGION_CODE ='03' THEN FEE_NUMBER END) AS REGION_3,
     COUNT(CASE WHEN REGION_CODE ='04' THEN FEE_NUMBER END) AS REGION_4,
     COUNT(CASE WHEN REGION_CODE ='05' THEN FEE_NUMBER END) AS REGION_5,
     COUNT(*)                                               AS TOTAL 
  FROM   TA_CASES
  WHERE  YEAR(CREATION_DATE)=1998
  GROUP BY Month_Num WITH ROLLUP) AS CA;

The subquery fetched all of the fields that we needed, plus the month number, for sorting.  We then selected from it by placing the code after the FROM of a second query. Performing a query in a two pass process in this way is called Pre (or Partial)-Aggregation. The first pass creates a derived table or resultset that performs most of the aggregation work, while the next pass does some formatting and any additional calculations that may be required.  The neccessitating of pre-aggregation here was the result of two factors: The ROLLUP GROUP BY modifier inserted a Null row into the resultset, which was difficult to replace with the “TOTAL” row header because of its late evaluation in the query process.  A second challenge was presented by the grouping on the output of date functions because grouping on the MONTHNAME() sorted the rows in alphabetical order, rather than chronological. 

The above query produced the following desired crosstab, including chronological row sorting and totals:

Month

REGION 1

REGION 2

REGION 3

REGION 4

REGION 5

TOTAL

April

13

33

76

2

47

171

May

17

55

209

1

143

425

June

8

63

221

1

127

420

July

13

104

240

6

123

486

August

18

121

274

9

111

533

September

25

160

239

2

88

514

October

9

88

295

2

127

521

November

2

86

292

2

120

502

December

1

128

232

6

155

522

TOTAL

106

838

2078

31

1041

4094

Multi-Aggregate Pivots

This query was not terribly complex as it only hit one table and pivoted between two fields: the Month and Region.  That’s called a Single Pivot.  If we wanted to breakdown the time periods further into other time periods, we would now be looking at a two-to-one pivot (eg: months and weeks per region).  On the column side, regions could likewise be broken down into specific cities, giving us a one-to-two pivot.  Adding the new fields to both the columns and rows would produce a true many-to-many multi-aggregate pivot.  The following chart illustrates the inherent complexety of multi-aggregate pivots:

A

B

C1

C2

Total

column header 1

D1

D2

Total

D1

D2

Total

column header 2

E1

E2

Total

E1

E2

Total

E1

E2

Total

E1

E2

Total

column header 3

A1

B1

B2

Total

row sub total

A2

B1

B2

Total

row sub total

Total

row total

row
header
1

row
header
2

col
sub
total

col
sub
total

col
sub
total

col
sub
total

col
sub
total

col
sub
total

col
total

Reporting on Two Row Fields

We’re going to create a Multi-Aggregate row crosstab by adding the Year to the row data, making it the new A row in the above diagram, thus displacing the months to the B field. That will allow us to report on multiple years, by calling the YEAR() function on the CREATION_DATE, much like we did for displaying the months. We’ll insert it as the first field in the SELECT list (new code appears in Red):

...
FROM  (SELECT YEAR(CREATION_DATE)              AS Year,
              MONTH(CREATION_DATE)             AS Month_Num,
              MONTHNAME(CREATION_DATE)         AS Month,
		 ...

We can also easily include totals for each year by adding it to the GROUP BY clause:

       GROUP BY Year, Month_Num WITH ROLLUP) AS CA;

In addition to needing a column heading for the years, we also need to alter the months CASE statement, because rows which display the yearly totals will contain a NULL Month_num value:

SELECT CASE WHEN Year IS NULL
            THEN 'GRAND TOTAL' 
            ELSE Year 
       END      AS 'Year',
       CASE WHEN Month_Num IS NULL     
            THEN CASE WHEN Year IS NULL
                      THEN ''                                         
                      ELSE CONCAT(Year, ' TOTAL')                      
                 END
            ELSE Month 
       END      AS 'Month',  
	 ...

Here then is the full SQL statement to include the Year rows.

SELECT CASE WHEN Year IS NULL
            THEN 'GRAND TOTAL' 
            ELSE Year 
       END      AS 'Year',
       CASE WHEN Month_Num IS NULL     
            THEN CASE WHEN Year IS NULL
                      THEN ''                                         
                      ELSE CONCAT(Year, ' TOTAL')                      
                 END
            ELSE Month 
       END      AS 'Month',  
       REGION_1 AS 'REGION 1',  
       REGION_2 AS 'REGION 2',
       REGION_3 AS 'REGION 3',
       REGION_4 AS 'REGION 4',
       REGION_5 AS 'REGION 5',       
       TOTAL
FROM  (SELECT YEAR(CREATION_DATE)                                    AS Year,
              MONTH(CREATION_DATE)                                   AS Month_Num,
              CONVERT(MONTHNAME(CREATION_DATE) USING latin1)         AS Month,
              COUNT(CASE WHEN REGION_CODE ='01' THEN FEE_NUMBER END) AS REGION_1,
              COUNT(CASE WHEN REGION_CODE ='02' THEN FEE_NUMBER END) AS REGION_2,
              COUNT(CASE WHEN REGION_CODE ='03' THEN FEE_NUMBER END) AS REGION_3,
              COUNT(CASE WHEN REGION_CODE ='04' THEN FEE_NUMBER END) AS REGION_4,
              COUNT(CASE WHEN REGION_CODE ='05' THEN FEE_NUMBER END) AS REGION_5,
              COUNT(*)                                               AS TOTAL 
       FROM   TA_CASES
       WHERE  YEAR(CREATION_DATE)>2003
       GROUP BY Year, Month_Num WITH ROLLUP) AS CA;

The revised SQL code produces the following result, which includes the YEAR headers in the first column, and the TOTAL summary row for each year:

Year

Month

REGION 1

REGION 2

REGION 3

REGION 4

REGION 5

TOTAL

2004

January

8

41

156

1

42

248

2004

February

1

38

140

0

29

212

2004

March

0

44

115

0

50

209

2004

April

4

45

119

0

42

210

2004

May

1

57

151

0

84

294

2004

June

2

63

142

0

48

259

2004

July

6

47

110

0

33

199

2004

August

10

38

150

1

53

256

2004

September

3

45

146

7

34

235

2004

October

2

55

112

0

34

204

2004

November

4

36

110

1

43

195

2004

December

1

30

165

0

31

227

2004

2004 TOTAL

42

539

1616

10

523

2748

2005

January

3

34

107

2

34

182

2005

February

0

24

103

0

15

144

2005

March

1

30

101

0

24

159

2005

April

3

27

149

0

27

212

2005

May

1

36

110

0

27

177

2005

June

3

52

113

0

28

196

2005

July

1

38

131

3

22

195

2005

August

4

51

146

1

36

238

2005

September

9

52

149

0

49

259

2005

October

10

29

100

0

47

186

2005

November

11

11

162

0

18

204

2005

December

4

66

142

0

19

231

2005

2005 TOTAL

50

450

1513

6

346

2383

2006

January

2

68

132

0

30

235

2006

February

1

43

94

0

24

162

2006

March

3

30

134

0

29

196

2006

April

1

47

129

0

21

199

2006

May

11

52

124

0

31

220

2006

June

9

49

126

0

30

214

2006

July

4

43

125

0

42

217

2006

August

3

50

132

0

51

241

2006

September

6

56

149

0

45

262

2006

October

5

31

121

2

25

184

2006

November

3

42

146

3

63

258

2006

December

0

47

115

0

23

194

2006

2006 TOTAL

48

558

1527

5

414

2582

2007

January

2

37

152

1

35

229

2007

February

1

35

86

2

31

156

2007

March

5

78

132

0

51

268

2007

April

2

41

125

1

41

211

2007

May

0

50

122

0

33

206

2007

June

1

63

107

0

38

209

2007

July

5

41

65

1

31

150

2007

August

0

63

110

2

43

219

2007

September

2

35

134

0

55

227

2007

October

2

39

120

2

52

225

2007

November

10

22

141

0

36

215

2007

December

51

26

60

0

17

156

2007

2007 TOTAL

81

530

1354

9

463

2471

2008

January

1

52

154

1

49

277

2008

February

0

12

84

1

34

136

2008

March

0

30

85

1

27

148

2008

April

3

22

73

0

40

168

2008

May

0

48

90

2

29

170

2008

June

10

63

122

0

24

225

2008

July

19

38

148

3

28

238

2008

August

54

50

105

0

20

230

2008

September

42

34

143

2

44

268

2008

October

73

41

112

1

26

253

2008

November

11

15

101

0

20

152

2008

December

285

62

160

2

29

541

2008

2008 TOTAL

498

467

1377

13

370

2806

2009

January

686

27

123

3

42

882

2009

February

2

9

87

0

22

120

2009

March

4

19

106

0

25

154

2009

April

0

10

95

0

12

117

2009

May

4

31

93

0

21

151

2009

June

63

71

94

0

21

251

2009

July

1

42

92

6

21

165

2009

August

3

53

116

3

19

195

2009

September

1

12

25

0

11

49

2009

December

0

0

1

0

10

17

2009

2009 TOTAL

764

274

832

12

204

2101

GRAND TOTAL

 

1483

2818

8219

55

2320

15091

Adding row fields is not all that difficult because queries naturally group data by rows. Therefore, all that is required is to add the new row to the SELECT and GROUP BY field lists and include a column for the row headers, as we did above. In the next installment, we will take on the more imposing challenge of adding a new column. It’s not as straighforward as might initially appear. 

Additional Resources

MySQL

» See All Articles by Columnist Rob Gravelle

Robert Gravelle
Robert Gravelle
Rob Gravelle resides in Ottawa, Canada, and has been an IT guru for over 20 years. In that time, Rob has built systems for intelligence-related organizations such as Canada Border Services and various commercial businesses. In his spare time, Rob has become an accomplished music artist with several CDs and digital releases to his credit.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles