MySQL Reporting requirements sometimes require both unknown column and row values, necessitating a more powerful means of generating crosstabs. Today’s article presents Prepared Statements, which dynamically generate the SQL and assign it to a variable so that we can tailor the output based on the number of data values.
During the past
several weeks, we’ve been learning how to create crosstabs in MySQL. We’ve now
covered the basics of fetching the row and column data, as well as how to
overcome some of the challenges inherent to outputting data as columns. As we
saw, organizing data into columns can be an arduous task due to the SQL
language’s natural tendency of appending data to rows. We can transpose row
data to columns easily enough, but the number of possible data values in the horizontal
axis need to be known before hand.
Unfortunately, there
will be times that your reporting requirements will require both unknown column
and row values, or have a tendency to change often enough to invalidate
previous code. In such instances, you need a more powerful means of generating
crosstabs. Today’s article presents just such a mechanism: Prepared Statements. By dynamically
generating the SQL and assigning it to a variable, we can tailor the output based
on the number of data values, thus unburdening us from having to anticipate
changes.
How Static Is the Data Really?
Way back in the Tips
for Simplifying Crosstab Query Statements article, we were introduced to a
crosstab report that displayed the number of cases by Region broken down by month,
and later, by year as well. No sooner was the article published than someone
asked the question "What happens when a new Region is introduced?"
The answer is simple: an extra column must be added to the SELECT field list.
This is straightforward enough to do and can be expected to occur very rarely,
as Regions are what you would call a static data set. Other than
Regions, other geographic entities, including continents, countries, counties,
provinces, states, and cities can also be considered to be static. Having said
that, even fixed data sets such as time frames can vary enormously. Now I’m
not referring to the elasticity of time-space, as discovered by Einstein, but rather,
how the start and end points of a SELECT query depend on reporting needs. With
regards to our own crosstab query, going from a single year to multiple ones
necessitated many changes to the SQL.
At the other end of
the spectrum is variable data, which can change drastically from one report to
the other. Imagine reporting on Starbucks coffee houses in the nineties boom period!
Since you could expect the number of shops to increase on an almost daily
basis, you’d definitely need a more flexible approach!
Steps in Converting the Query into a Prepared Statement
Going from an SQL
statement to a Prepared Statement will be done in two steps:
- First,
we’ll rewrite the query to generate the Prepared Statement whose output will
vary according to the number of columns. - Second,
we’ll insert the SQL generating Prepared Statement into a stored proc, so that
we can create the Prepared Statement and execute it in one fell swoop.
Dynamically Generating the SQL Statement
In order to
dynamically generate an SQL string, we’ll be using the CONCAT() and
GROUP_CONCAT() string functions.
The CONCAT() function
accepts a variable number of string parameters, and returns another string,
which is comprised of all the input parameters joined together in the order that
they were passed in. The following code would concatenate a name in last
name (comma) first name format:
SELECT CONCAT(last_name, ", ", first_name) AS NAME FROM CLIENTS; Produces: NAME _____________ Jones, Paul McDonald, Jim Miller, Bruce Portman, Bess
The GROUP_CONCAT() function returns a
string result with the concatenated non-NULL values from a group. Here, it’s
used to aggregate all the
rows from the TA_CASES
table
and return the collection of SELECT list expressions that makes up the horizontal axis of
the crosstab. The following query returns a string value that replaces the SQL statement
of our previous crosstab query:
SELECT concat( "SELECT CASE WHEN Month_Num IS NULL", "n", " THEN 'TOTAL'", "n", " ELSE Month", "n", " END AS 'Month',", "n", group_concat( DISTINCT concat(" REGION_", REGION_CODE, " AS 'REGION ", REGION_CODE, "',", "n" ) order by REGION_CODE separator '' ), " TOTAL", "n", "FROM ( SELECT MONTH(CREATION_DATE)ttttttttAS Month_Num,", "n", "ttMONTHNAME(CREATION_DATE)tttttttttAS 'Month',", "n", group_concat( DISTINCT concat("ttttCOUNT(CASE WHEN REGION_CODE ='", REGION_CODE, "' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_", REGION_CODE, "',", "n" ) order by REGION_CODE separator '' ), " COUNT(*)tttttttttttttAS 'TOTAL'", "n", " FROM TA_CASES", "n", " WHERE YEAR(CREATION_DATE)=", YEAR(CREATION_DATE), "n", " GROUP BY Month_Num WITH ROLLUP) AS CA;" ) statement FROM TA_CASES WHERE YEAR(CREATION_DATE)=1998;
Here is the resulting SQL code as created by our dynamic
SQL generator:
SELECT CASE WHEN Month_Num IS NULL THEN 'TOTAL' ELSE Month END AS 'Month', REGION_01 AS 'REGION 01', REGION_02 AS 'REGION 02', REGION_03 AS 'REGION 03', REGION_04 AS 'REGION 04', REGION_05 AS 'REGION 05', TOTAL FROM (SELECT MONTH(CREATION_DATE) MONTHNAME(CREATION_DATE) COUNT(CASE WHEN REGION_CODE ='01' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_01', COUNT(CASE WHEN REGION_CODE ='02' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_02', COUNT(CASE WHEN REGION_CODE ='03' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_03', COUNT(CASE WHEN REGION_CODE ='04' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_04', COUNT(CASE WHEN REGION_CODE ='05' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_05', COUNT(*) FROM TA_CASES WHERE YEAR(CREATION_DATE)=1998 GROUP BY Month_Num WITH ROLLUP) AS CA;
Moving the Prepared Statement into a Stored Procedure
Placing our code in a
stored proc will make running the query a lot easier as it can generate the
statement and then execute it to retrieve the results. Here is the code for
the stored proc:
CREATE PROCEDURE `p_case_counts_per_region_by_month`() LANGUAGE SQL NOT DETERMINISTIC CONTAINS SQL SQL SECURITY DEFINER BEGIN SELECT concat( "SELECT CASE WHEN Month_Num IS NULL", "n", " THEN 'TOTAL'", "n", " ELSE Month", "n", " END AS 'Month',", "n", group_concat( DISTINCT concat(" REGION_", REGION_CODE, " AS 'REGION ", REGION_CODE, "',", "n" ) order by REGION_CODE separator '' ), " TOTAL", "n", "FROM ( SELECT MONTH(CREATION_DATE)ttttttttAS Month_Num,", "n", "ttMONTHNAME(CREATION_DATE)tttttttttAS 'Month',", "n", group_concat( DISTINCT concat("ttttCOUNT(CASE WHEN REGION_CODE ='", REGION_CODE, "' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_", REGION_CODE, "',", "n" ) order by REGION_CODE separator '' ), " COUNT(*)tttttttttttttAS 'TOTAL'", "n", " FROM TA_CASES", "n", " WHERE YEAR(CREATION_DATE)=", YEAR(CREATION_DATE), "n", " GROUP BY Month_Num WITH ROLLUP) AS CA;" ) statement into @case_counts_per_region_by_month FROM TA_CASES WHERE YEAR(CREATION_DATE)=1998; prepare case_counts_per_region_by_month from @case_counts_per_region_by_month; execute case_counts_per_region_by_month; deallocate prepare case_counts_per_region_by_month; END
Inside the
procedure, we generate the SQL for the query as we did above, but within a proc
we can save it to a variable using the SELECT INTO syntax. A Prepared Statement
is then utilized to execute the generated code.
A SELECT INTO can
only be used where the SQL returns exactly one row; yet another reason that
generating the SQL statement as a string works so well!
A Prepared Statement
is a combination of three separate SQL statements:
-
PREPARE
prepares a statement for execution. -
EXECUTE
executes a prepared statement. -
DEALLOCATE
PREPARE releases a prepared statement.
Once the proc has
been created, all we need to do is call it by entering the following command
line:
mysql> call p_case_counts_per_region_by_month;
Here is the record
set that is returned by our proc:
Month |
REGION 01 |
REGION 02 |
REGION 03 |
REGION 04 |
REGION 05 |
TOTAL |
April |
13 |
33 |
76 |
2 |
47 |
171 |
May |
17 |
55 |
209 |
1 |
143 |
425 |
June |
8 |
63 |
221 |
1 |
127 |
420 |
July |
13 |
104 |
240 |
6 |
123 |
486 |
August |
18 |
121 |
274 |
9 |
111 |
533 |
September |
25 |
160 |
239 |
2 |
88 |
514 |
October |
9 |
88 |
295 |
2 |
127 |
521 |
November |
2 |
86 |
292 |
2 |
120 |
502 |
December |
1 |
128 |
232 |
6 |
155 |
522 |
TOTAL |
106 |
838 |
2078 |
31 |
1041 |
4094 |
In the last
installment of the Crosstab series, we’ll make the proc more generic by moving
the columns and table(s) names that we’re reporting on to input parameters.