Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MySQL

Posted Mar 26, 2010

MySQL Prepared Statements to Generate Crosstab SQL

By Rob Gravelle

MySQL Reporting requirements sometimes require both unknown column and row values, necessitating a more powerful means of generating crosstabs. Today's article presents Prepared Statements, which dynamically generate the SQL and assign it to a variable so that we can tailor the output based on the number of data values.

During the past several weeks, we've been learning how to create crosstabs in MySQL. We've now covered the basics of fetching the row and column data, as well as how to overcome some of the challenges inherent to outputting data as columns. As we saw, organizing data into columns can be an arduous task due to the SQL language's natural tendency of appending data to rows. We can transpose row data to columns easily enough, but the number of possible data values in the horizontal axis need to be known before hand.

Unfortunately, there will be times that your reporting requirements will require both unknown column and row values, or have a tendency to change often enough to invalidate previous code. In such instances, you need a more powerful means of generating crosstabs. Today's article presents just such a mechanism: Prepared Statements. By dynamically generating the SQL and assigning it to a variable, we can tailor the output based on the number of data values, thus unburdening us from having to anticipate changes.

How Static Is the Data Really?

Way back in the Tips for Simplifying Crosstab Query Statements article, we were introduced to a crosstab report that displayed the number of cases by Region broken down by month, and later, by year as well. No sooner was the article published than someone asked the question "What happens when a new Region is introduced?" The answer is simple: an extra column must be added to the SELECT field list. This is straightforward enough to do and can be expected to occur very rarely, as Regions are what you would call a static data set. Other than Regions, other geographic entities, including continents, countries, counties, provinces, states, and cities can also be considered to be static. Having said that, even fixed data sets such as time frames can vary enormously. Now I'm not referring to the elasticity of time-space, as discovered by Einstein, but rather, how the start and end points of a SELECT query depend on reporting needs. With regards to our own crosstab query, going from a single year to multiple ones necessitated many changes to the SQL.

At the other end of the spectrum is variable data, which can change drastically from one report to the other. Imagine reporting on Starbucks coffee houses in the nineties boom period! Since you could expect the number of shops to increase on an almost daily basis, you’d definitely need a more flexible approach!

Steps in Converting the Query into a Prepared Statement

Going from an SQL statement to a Prepared Statement will be done in two steps:

  • First, we'll rewrite the query to generate the Prepared Statement whose output will vary according to the number of columns.
  • Second, we'll insert the SQL generating Prepared Statement into a stored proc, so that we can create the Prepared Statement and execute it in one fell swoop.

Dynamically Generating the SQL Statement

In order to dynamically generate an SQL string, we’ll be using the CONCAT() and GROUP_CONCAT() string functions.

The CONCAT() function accepts a variable number of string parameters, and returns another string, which is comprised of all the input parameters joined together in the order that they were passed in. The following code would concatenate a name in last name (comma) first name format:

SELECT CONCAT(last_name, ", ", first_name) AS NAME
FROM   CLIENTS;
Produces:
NAME
_____________
Jones, Paul
McDonald, Jim
Miller, Bruce
Portman, Bess

The GROUP_CONCAT() function returns a string result with the concatenated non-NULL values from a group. Here, it's used to aggregate all the rows from the TA_CASES table and return the collection of SELECT list expressions that makes up the horizontal axis of the crosstab. The following query returns a string value that replaces the SQL statement of our previous crosstab query:

SELECT concat(
    "SELECT CASE WHEN Month_Num IS NULL", "\n", 
    "            THEN 'TOTAL'", "\n", 
    "            ELSE Month", "\n", 
    "       END        AS 'Month',", "\n",
    group_concat( DISTINCT concat("       REGION_", REGION_CODE, 
                                  "  AS 'REGION ", REGION_CODE, "',", "\n"
                           )
                  order by REGION_CODE
                  separator '' 
                ),
    "       TOTAL", "\n",
    "FROM  (     SELECT   MONTH(CREATION_DATE)\t\t\t\t\t\t\t\tAS Month_Num,", "\n",
    "\t\tMONTHNAME(CREATION_DATE)\t\t\t\t\t\t\t\t\tAS 'Month',", "\n",  
    group_concat( 
        DISTINCT concat("\t\t\t\tCOUNT(CASE WHEN REGION_CODE ='", REGION_CODE, 
                        "' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_", 
                        REGION_CODE, "',", "\n"
                 )
        order by REGION_CODE
        separator '' 
    ),
    "            COUNT(*)\t\t\t\t\t\t\t\t\t\t\t\t\tAS 'TOTAL'", "\n",
    "            FROM  TA_CASES", "\n",
    "            WHERE YEAR(CREATION_DATE)=", YEAR(CREATION_DATE), "\n",
    "            GROUP BY Month_Num WITH ROLLUP) AS CA;"
) statement
FROM TA_CASES
WHERE YEAR(CREATION_DATE)=1998;

Here is the resulting SQL code as created by our dynamic SQL generator:

SELECT CASE WHEN Month_Num IS NULL
            THEN 'TOTAL'
            ELSE Month
       END        AS 'Month',
       REGION_01  AS 'REGION 01',
       REGION_02  AS 'REGION 02',
       REGION_03  AS 'REGION 03',
       REGION_04  AS 'REGION 04',
       REGION_05  AS 'REGION 05',
       TOTAL
FROM  (SELECT    MONTH(CREATION_DATE)
         MONTHNAME(CREATION_DATE)  
                 COUNT(CASE WHEN REGION_CODE ='01' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_01',
                 COUNT(CASE WHEN REGION_CODE ='02' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_02',
                 COUNT(CASE WHEN REGION_CODE ='03' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_03',
                 COUNT(CASE WHEN REGION_CODE ='04' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_04',
         COUNT(CASE WHEN REGION_CODE ='05' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_05',
                 COUNT(*) 
       FROM  TA_CASES
       WHERE YEAR(CREATION_DATE)=1998
       GROUP BY Month_Num WITH ROLLUP) AS CA;

Moving the Prepared Statement into a Stored Procedure

Placing our code in a stored proc will make running the query a lot easier as it can generate the statement and then execute it to retrieve the results. Here is the code for the stored proc:

CREATE PROCEDURE `p_case_counts_per_region_by_month`() 
LANGUAGE SQL 
NOT DETERMINISTIC 
CONTAINS SQL 
SQL SECURITY DEFINER 
BEGIN  
  SELECT concat(
    "SELECT CASE WHEN Month_Num IS NULL", "\n", 
    "            THEN 'TOTAL'", "\n", 
    "            ELSE Month", "\n", 
    "       END        AS 'Month',", "\n",
    group_concat( DISTINCT concat("       REGION_", REGION_CODE, 
                                  "  AS 'REGION ", REGION_CODE, "',", "\n"
                           )
                  order by REGION_CODE
                  separator '' 
                ),
    "       TOTAL", "\n",
    "FROM  (     SELECT   MONTH(CREATION_DATE)\t\t\t\t\t\t\t\tAS Month_Num,", "\n",
    "\t\tMONTHNAME(CREATION_DATE)\t\t\t\t\t\t\t\t\tAS 'Month',", "\n",  
    group_concat( 
        DISTINCT concat("\t\t\t\tCOUNT(CASE WHEN REGION_CODE ='", REGION_CODE, 
                        "' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_", 
                        REGION_CODE, "',", "\n"
                 )
        order by REGION_CODE
        separator '' 
    ),
    "            COUNT(*)\t\t\t\t\t\t\t\t\t\t\t\t\tAS 'TOTAL'", "\n",
    "            FROM  TA_CASES", "\n",
    "            WHERE YEAR(CREATION_DATE)=", YEAR(CREATION_DATE), "\n",
    "            GROUP BY Month_Num WITH ROLLUP) AS CA;"
  ) statement
  into @case_counts_per_region_by_month
  FROM TA_CASES
  WHERE YEAR(CREATION_DATE)=1998;

  prepare case_counts_per_region_by_month   
  from @case_counts_per_region_by_month;    
  execute case_counts_per_region_by_month;   
  deallocate prepare case_counts_per_region_by_month; 
END

Inside the procedure, we generate the SQL for the query as we did above, but within a proc we can save it to a variable using the SELECT INTO syntax. A Prepared Statement is then utilized to execute the generated code.

A SELECT INTO can only be used where the SQL returns exactly one row; yet another reason that generating the SQL statement as a string works so well!

A Prepared Statement is a combination of three separate SQL statements:

  • PREPARE prepares a statement for execution.
  • EXECUTE executes a prepared statement.
  • DEALLOCATE PREPARE releases a prepared statement.

Once the proc has been created, all we need to do is call it by entering the following command line:

mysql> call p_case_counts_per_region_by_month;

Here is the record set that is returned by our proc:

Month

REGION 01

REGION 02

REGION 03

REGION 04

REGION 05

TOTAL

April

13

33

76

2

47

171

May

17

55

209

1

143

425

June

8

63

221

1

127

420

July

13

104

240

6

123

486

August

18

121

274

9

111

533

September

25

160

239

2

88

514

October

9

88

295

2

127

521

November

2

86

292

2

120

502

December

1

128

232

6

155

522

TOTAL

106

838

2078

31

1041

4094

In the last installment of the Crosstab series, we'll make the proc more generic by moving the columns and table(s) names that we're reporting on to input parameters.

» See All Articles by Columnist Rob Gravelle



MySQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MySQL Forum
Topic By Replies Updated
MySQL in high availability and transction secure banking application klamor 1 August 28th, 10:24 AM
MySQL rollback UAL225 1 August 28th, 10:15 AM
Browsing a DB file that uses MySql finleytech 1 July 26th, 10:51 AM
php cookie won't pass variable to next page ITdevGirl 0 June 22nd, 12:13 PM


















Thanks for your registration, follow us on our social networks to keep up-to-date