Rob Gravelle demonstrates a solution for passing MySQL dynamic field names to the group_concat function using a combination of stored proc input parameters, string concatenation, and prepared statements.
Although there is no
direct support for dynamic declaration of column and table names in MySQL
stored procedures, by utilizing user variables and prepared statements we can
achieve the same result. In the MySQL
Input Parameters Add Flexibility to Crosstab Stored Procedures article, we
created six input parameters in order to pass columns and even aggregate
functions to our proc. So far, weve added every field except for the
REGION_CODE. That one will take some doing, so we will make it our prime focus
today.
The SortColumnField Parameter
The SortColumnField
parameter will replace the hard-coded REGION_CODE in the proc and
will set the column sorting, much like the SortRowField
sets the row sorting (the new parameter appears in Red):
CREATE PROCEDURE `p_case_counts_per_region_by_month` (IN AggregateFunction VARCHAR(64),
IN ColumnField VARCHAR(64),
IN RowField VARCHAR(64),
IN ColumnFieldAlias VARCHAR(64),
IN RowFieldAlias VARCHAR(64),
IN SortRowField VARCHAR(64),
IN SortColumnField VARCHAR(64))
LANGUAGE SQL
NOT DETERMINISTIC
CONTAINS SQL
SQL SECURITY DEFINER
BEGIN
SELECT concat(
"SELECT CASE WHEN sort_row_field IS NULL", "\n",
" THEN 'TOTAL'", "\n",
" ELSE row_field", "\n",
" END AS '", RowFieldAlias, "',", "\n",
group_concat(DISTINCT concat(" agg_fld_", ColumnField,
" AS '", ColumnFieldAlias, ColumnField, "',", "\n"
)
order by ColumnField
separator ''
),
" TOTAL", "\n",
"FROM ( SELECT ", SortRowField, "\t\t\t\t\t\t\t\tAS sort_row_field," , "\n",
"\t\t", RowField, " \t\t\t\t\t\tAS row_field,", "\n",
group_concat(
DISTINCT concat("\t\t\t\t", AggregateFunction, "(CASE WHEN ", SortColumnField, " ='", SortColumnField,
"' THEN ", ColumnField, " ELSE NULL END) AS 'agg_fld_",
SortColumnField, "',", "\n"
)
order by SortColumnField
separator ''
),
" ", AggregateFunction, "(", ColumnField, ")\t\t\t\t\tAS 'TOTAL'", "\n",
" FROM TA_CASES", "\n",
" WHERE YEAR(CREATION_DATE)=", YEAR(CREATION_DATE), "\n",
" GROUP BY SortRowField WITH ROLLUP) AS CA;"
) statement
into @sql_stmt
FROM TA_CASES
WHERE YEAR(CREATION_DATE)=1998;
prepare case_counts_per_region_by_month
from @case_counts_per_region_by_month;
execute case_counts_per_region_by_month;
deallocate prepare case_counts_per_region_by_month;
END
Lets call
the updated proc with the extra parameter, passing in the REGION_CODE field:
mysql>call p_case_counts_per_region_by_month("COUNT",
--> "FEE_NUMBER",
--> "MONTHNAME(CREATION_DATE)",
--> "REGION",
--> "Month",
--> "MONTH(CREATION_DATE)",
--> "REGION_CODE");
For some
reason, the proc does not work as expected, and throws the following error!
mysql>SQL Error: Unknown column 'AggFld_REGION_CODE' in 'field list'
A closer
look at the dynamically generated SQL reveals the source of the error. The
last REGION_CODE parameter is being interpreted as a string literal rather
than a field name! Recall that, while most of the code is part of a concat()
function, the group_concat() is accessing the field directly from the TA_CASES
table. The following code snippet highlights direct references to the
REGION_CODE field within the group_concat() functions:
group_concat( DISTINCT concat(" REGION_", REGION_CODE,
" AS 'REGION ", REGION_CODE, "',", "\n"
)
order by REGION_CODE
separator ''
),
...
group_concat(
DISTINCT concat("\t\t\t\tCOUNT(CASE WHEN REGION_CODE ='", REGION_CODE,
"' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_",
REGION_CODE, "',", "\n"
)
order by REGION_CODE
separator ''
),
...
FROM TA_CASES
WHERE YEAR(CREATION_DATE)=1998;
A closer
look at the generated SQL statement confirms the above assessment. While the
case statements are coming out exactly as expected, the outer SELECT field
values contain literal strings instead of the REGION_CODE fields value:
SELECT CASE WHEN SortRowField IS NULL
THEN 'TOTAL'
ELSE RowField
END AS 'Month',
AggFld_REGION_CODE AS 'REGION REGION_CODE',
TOTAL
FROM ( SELECT MONTH(CREATION_DATE) AS SortRowField,
MONTHNAME(CREATION_DATE) AS RowField,
COUNT(CASE WHEN REGION_CODE ='01' THEN FEE_NUMBER ELSE NULL END) AS 'AggFld_01',
COUNT(CASE WHEN REGION_CODE ='02' THEN FEE_NUMBER ELSE NULL END) AS 'AggFld_02',
COUNT(CASE WHEN REGION_CODE ='03' THEN FEE_NUMBER ELSE NULL END) AS 'AggFld_03',
COUNT(CASE WHEN REGION_CODE ='04' THEN FEE_NUMBER ELSE NULL END) AS 'AggFld_04',
COUNT(CASE WHEN REGION_CODE ='05' THEN FEE_NUMBER ELSE NULL END) AS 'AggFld_05',
COUNT(FEE_NUMBER )
FROM TA_CASES
WHERE YEAR(CREATION_DATE)=1998
GROUP BY SortRowField WITH ROLLUP) AS CA;
Proposed Solution: Generate the Outer SQL Statement
We can get
around this issue in the same way that we did with the other fields, by creating
a prepared statement and then executing it to retrieve the query results. The
challenge here is to insert the existing dynamically generated strings into the
new one while preserving the syntax.
The Escape Character
Inserting
a string within another takes a solid grasp of character escaping. In computer languages,
an escape character is used to invoke an alternative interpretation on the subsequent
character(s) in a character sequence. Usually it is used to toggle a
characters default behavior between being interpreted as just a character, or
one with special meaning or function. MySQL uses the backslash character (\).
Weve already used the escape character for newlines (\n) and tabs (\t). Here
is the complete list of escape characters in MySQL:
|
\0
|
An
ASCII NUL (0x00) character.
|
|
\'
|
A
single quote (') character.
|
|
\"
|
A
double quote (") character.
|
|
\b
|
A
backspace character.
|
|
\n
|
A
newline (linefeed) character.
|
|
\r
|
A
carriage return character.
|
|
\t
|
A
tab character.
|
|
\Z
|
ASCII
26 (Control-Z). See note following the table.
|
|
\\
|
A
backslash (\) character.
|
|
\%
|
A
% character. See note following the table.
|
|
\_
|
A
_ character. See note following the table.
|
For all other
escape sequences, the backslash is ignored. That is, the escaped character is
interpreted as if it was not escaped. For example, \x becomes just x.
Writing Quote Characters within a String
Strings that
contain quotes require extra consideration, as there are several ways to
include quote characters within a string:
-
A ' inside a string
quoted with ' may be written as ''.
-
A " inside a
string quoted with " may be written as "".
-
Precede the quote character by
an escape character (\).
-
A ' inside a string
quoted with " needs no special treatment and need not be doubled
or escaped. In the same way, "
inside a string quoted with ' needs no special treatment.
The
following SELECT statements demonstrate how quoting and escaping work:
mysql> SELECT 'hello', '"hello"', '""hello""', 'hel''lo', '\'hello';
+-------+---------+-----------+--------+--------+
| hello | "hello" | ""hello"" | hel'lo | 'hello |
+-------+---------+-----------+--------+--------+
mysql> SELECT "hello", "'hello'", "''hello''", "hel""lo", "\"hello";
+-------+---------+-----------+--------+--------+
| hello | 'hello' | ''hello'' | hel"lo | "hello |
+-------+---------+-----------+--------+--------+
For consistency, I am keeping
with the backslash character for all character escaping.
Rather than manually
searching for instances of each field or quotes, it may be easier to use a
Search and Replace function such as those found in most text editors:
To get you
started, heres the code for the inner and outer SELECT strings, along with the
first instance of string within a string syntax to assign the SQL code into the
@sql_stmt variable:
SELECT concat(
"SELECT concat(", "\n",
" \"SELECT CASE WHEN sort_row_alias IS NULL\", \"\\n\",", "\n",
...
") statement", "\n",
"into @sql_stmt", "\n",
"FROM TA_CASES", "\n",
"WHERE YEAR(CREATION_DATE)=1998;"
) statement
into @sql_stmt;
Be sure to
build up the SQL string slowly and test the code often so that you catch syntax
errors as they come up. If you have access to a text editor that supports
syntax highlighting, such as TextPad, I
would suggest that you make use of that feature. It will make your life a
whole lot easier!
Fetching
the data will now require an additional step because the first execution of the
prepared statement will only get us to the original outer SQL statement. It
must also be executed to arrive at the final SQL:
CREATE PROCEDURE `p_case_counts_per_region_by_month` (IN AggregateFunction VARCHAR(64),
IN ColumnField VARCHAR(64),
IN RowField VARCHAR(64),
IN ColumnFieldAlias VARCHAR(64),
IN RowFieldAlias VARCHAR(64),
IN SortRowField VARCHAR(64),
IN SortColumnField VARCHAR(64))
LANGUAGE SQL
NOT DETERMINISTIC
CONTAINS SQL
SQL SECURITY DEFINER
BEGIN
SELECT concat(
"SELECT concat(", "\n",
" \"SELECT CASE WHEN sort_row_alias IS NULL\", \"\\n\",", "\n",
" \" THEN 'TOTAL'\", \"\\n\",", "\n",
" \" ELSE row_name\", \"\\n\",", "\n",
" \" END AS '", RowFieldAlias, "',\", \"\\n\",", "\n",
" group_concat( DISTINCT concat(\t\" AggFld_\", ", SortColumnField, ",", "\n",
"\t\t\t\t\t\t\t\t\t\" AS '", ColumnFieldAlias, " \", ", SortColumnField, ", \"',\", \"\\n\"", "\n",
" )", "\n",
" order by ", SortColumnField, "\n",
" separator ''", "\n",
" ),", "\n",
" \" TOTAL\", \"\\n\",", "\n",
" \"FROM ( SELECT ", SortRowField, "\t\t\t\t\t\t\t\tAS sort_row_alias,\", \"\\n\",", "\n",
" \" ", RowField, "\t\t\t\t\t\t\t\tAS 'row_name',\", \"\\n\",", "\n",
" group_concat(", "\n",
" DISTINCT concat(\"\t\t\t\t", AggregateFunction, "(CASE WHEN ", SortColumnField, " ='\", ", SortColumnField, ",", "\n",
" \"' THEN ", ColumnField, " ELSE NULL END) AS 'AggFld_\",", "\n"
" ", SortColumnField, ", \"',\", \"\\n\"", "\n",
" )", "\n"
" order by ", SortColumnField, "\n",
" separator ''", "\n",
" ),", "\n",
" \" ", AggregateFunction, "(", ColumnField, ")\t\t\t\t\tAS 'TOTAL'\", \"\\n\",", "\n",
" \" FROM TA_CASES\", \"\\n\",", "\n",
" \" WHERE YEAR(CREATION_DATE)=\", YEAR(CREATION_DATE), \"\\n\",", "\n",
" \" GROUP BY row_name WITH ROLLUP) AS CA;\"", "\n",
") statement", "\n",
"into @sql_stmt", "\n",
"FROM TA_CASES", "\n",
"WHERE YEAR(CREATION_DATE)=1998;"
) statement
into @sql_stmt;
prepare case_counts_per_region_by_month_stmt
from @sql_stmt;
execute case_counts_per_region_by_month_stmt;
deallocate prepare case_counts_per_region_by_month_stmt;
prepare case_counts_per_region_by_month_stmt
from @sql_stmt;
execute case_counts_per_region_by_month_stmt;
deallocate prepare case_counts_per_region_by_month_stmt;
END
To Format or Not to Format?
You may
find it easier to omit the newline and tab formatting codes, as they are not
necessary for the SQL to run. They are there for our benefit; so that we can
easily read the generated SQL should we run into any trouble. We can remove the
last prepared statement and select the @sql_stmt variable to verify that
the SQL matches what we had previously:
SELECT concat(
"SELECT concat(", "\n",
" \"SELECT CASE WHEN sort_row_alias IS NULL\", \"\\n\",", "\n",
...
"into @sql_stmt", "\n",
"FROM ", Tables, "\n",
"WHERE ", Criteria, ";"
) statement
into @sql_stmt;
--replace the prepared statement code with the following line to
--examine the generated SQL code:
select @sql_stmt;
Including the Table and Criteria Parameters
To make
the proc truly dynamic, we have to add input parameters for the table(s) and
select criteria. This would allow us to include multiple tables and specify complex
selection criteria:
CREATE PROCEDURE `p_case_counts_per_region_by_month` (IN AggregateFunction VARCHAR(64),
IN ColumnField VARCHAR(64),
IN RowField VARCHAR(64),
IN ColumnFieldAlias VARCHAR(64),
IN RowFieldAlias VARCHAR(64),
IN SortRowField VARCHAR(64),
IN SortColumnField VARCHAR(64),
IN Tables VARCHAR(64),
IN Criteria VARCHAR(64))
LANGUAGE SQL
NOT DETERMINISTIC
CONTAINS SQL
SQL SECURITY DEFINER
BEGIN
SELECT concat(
"SELECT concat(", "\n",
" \"SELECT CASE WHEN sort_row_alias IS NULL\", \"\\n\",", "\n",
" \" THEN 'TOTAL'\", \"\\n\",", "\n",
" \" ELSE row_name\", \"\\n\",", "\n",
" \" END AS '", RowFieldAlias, "',\", \"\\n\",", "\n",
" group_concat( DISTINCT concat(\t\" AggFld_\", ", SortColumnField, ",", "\n",
"\t\t\t\t\t\t\t\t\t\" AS '", ColumnFieldAlias, " \", ", SortColumnField, ", \"',\", \"\\n\"", "\n",
" )", "\n",
" order by ", SortColumnField, "\n",
" separator ''", "\n",
" ),", "\n",
" \" TOTAL\", \"\\n\",", "\n",
" \"FROM ( SELECT ", SortRowField, "\t\t\t\t\t\t\t\tAS sort_row_alias,\", \"\\n\",", "\n",
" \" ", RowField, "\t\t\t\t\t\t\t\tAS 'row_name',\", \"\\n\",", "\n",
" group_concat(", "\n",
" DISTINCT concat(\"\t\t\t\t", AggregateFunction, "(CASE WHEN ", SortColumnField, " ='\", ", SortColumnField, ",", "\n",
" \"' THEN ", ColumnField, " ELSE NULL END) AS 'AggFld_\",", "\n"
" ", SortColumnField, ", \"',\", \"\\n\"", "\n",
" )", "\n"
" order by ", SortColumnField, "\n",
" separator ''", "\n",
" ),", "\n",
" \" ", AggregateFunction, "(", ColumnField, ")\t\t\t\t\tAS 'TOTAL'\", \"\\n\",", "\n",
" \" FROM ", Tables, "\", \"\\n\",", "\n",
" \" WHERE ", Criteria, "\", \"\\n\",", "\n",
" \" GROUP BY row_name WITH ROLLUP) AS CA;\"", "\n",
") statement", "\n",
"into @sql_stmt", "\n",
"FROM ", Tables, "\n",
"WHERE ", Criteria, ";"
) statement
into @sql_stmt;
prepare case_counts_per_region_by_month_stmt
from @sql_stmt;
execute case_counts_per_region_by_month_stmt;
deallocate prepare case_counts_per_region_by_month_stmt;
prepare case_counts_per_region_by_month_stmt
from @sql_stmt;
execute case_counts_per_region_by_month_stmt;
deallocate prepare case_counts_per_region_by_month_stmt;
END
There are
other solutions than the one that is presented here, but well leave those for
another day. Having accomplished our goal for this series, which was to
explore how Crosstab Queries could be generated within MySQL, we will be moving
on to something fresh next time.
Additional Reasources
simple-talk
Roland Bouman's blog
»
See All Articles by Columnist
Rob Gravelle