Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Tips Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Tips
» Database Forum
» Slideshows
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MySQL

Posted December 8, 2016

Why You Should be Using the MySQL GROUP_CONCAT Function

By Rob Gravelle

You know when you’ve got some good news to share and you just want to blurt it out?  I feel like that right now, but I’m going to refrain from spilling the beans just yet.  This is going to benefit from some examples and perhaps even a diagram or two.  For those reasons, it would be advisable to save all the juicy details for the main body of the article.  Suffice to say, it is all about MySQL’s GROUP_CONCAT function and, more importantly, how it just might change how you work with query result sets forever - especially if your database feeds an application.

The Source Database

Like many times before, I’ll be going back to the old faithful Sakila sample database.  A former member of the MySQL AB documentation team named Mike Hillyer created the Sakila database specifically for the purpose of providing a standard schema for use in books, tutorials, and articles just like the one you’re reading.

The database contains a number of related tables themed around the film industry and covers everything from actors and film studios to video rental stores.  The full schema structure can be viewed on the MySQL Dev site.

Please refer to my Generating Reports on MySQL Data tutorial for instructions on downloading and installing the Sakila database.

Listing Group Information the Old Way

The GROUP BY clause is a great tool for generating data about related data.  However, it is not so good for listing details about the grouped data.  To illustrate, consider a scenario where we own a movie rental store and would like to reward our customers who have rented numerous horror movies.  For the sake of argument, let’s say that we want to see which movies each customer rented.  One way to do that is to move the GROUP BY SELECT statement into a subquery that returns the IDs of the customers who fit the bill.  We can then limit the outer query’s results to those customers whose IDs are part of the inner result set. 

Here is the SQL that would do the job:

SELECT CONCAT(CU.last_name, ', ', CU.first_name) AS customer,
        A.phone, 
        F.title, 
        date(R.rental_date) AS rental_date
 FROM sakila.rental R 
     LEFT JOIN sakila.inventory I ON R.inventory_id = I.inventory_id 
     LEFT JOIN sakila.film F ON I.film_id = F.film_id 
     LEFT JOIN sakila.film_category FC on F.film_id = FC.film_id
     LEFT JOIN sakila.category C ON FC.category_id = C.category_id 
     LEFT JOIN sakila.customer CU ON R.customer_id = CU.customer_id
     LEFT JOIN sakila.address A ON CU.address_id = A.address_id
 WHERE CU.customer_id in 
       (SELECT CU.customer_id
        FROM rental R
        LEFT JOIN sakila.customer CU ON R.customer_id = CU.customer_id
        LEFT JOIN sakila.inventory I ON R.inventory_id = I.inventory_id 
        LEFT JOIN sakila.film F ON I.film_id = F.film_id 
        LEFT JOIN sakila.film_category FC on F.film_id = FC.film_id
        LEFT JOIN sakila.category C ON FC.category_id = C.category_id 
        WHERE C.name = "Horror"
        GROUP BY CU.customer_id
        HAVING COUNT(CU.customer_id) >= 3)
 AND C.name = "Horror"
 ORDER BY customer, title, rental_date DESC;

And here are the first three customers returned, along with their rentals and rental dates:

customer         phone         title                 rental_date
 ----------------------------------------------------------------
 ADAM, NATHANIEL  111177206479  ANALYZE HOOSIERS      2005-08-19
 ADAM, NATHANIEL  111177206479  FREDDY STORM          2005-08-22
 ADAM, NATHANIEL  111177206479  STRANGERS GRAFFITI    2005-08-23
 ANDREW, JOSE     961370847344  EGYPT TENENBAUMS      2005-07-31
 ANDREW, JOSE     961370847344  FIDELITY DEVIL        2005-05-30
 ANDREW, JOSE     961370847344  HIGH ENCINO           2005-07-07
 ANDREW, JOSE     961370847344  LOLA AGENT            2005-08-02
 AQUINO, OSCAR    474047727727  AFFAIR PREJUDICE      2005-07-28
 AQUINO, OSCAR    474047727727  DRUMS DYNAMITE        2005-06-20
 AQUINO, OSCAR    474047727727  EGYPT TENENBAUMS      2005-07-28
 AQUINO, OSCAR    474047727727  STREETCAR INTENTIONS  2005-08-01
 etc...

Works like a charm, even though the inner and outer SQL WHERE clauses are repetitive.  Putting that concern aside, perhaps a bigger issue is that an application that reads the result set has to track the customer names so that it knows when it’s time to move on to the next one.  I’ve done it many times and it always feels a bit kludgy.

Listing Group Information using the GROUP_CONCAT Function

The GROUP_CONCAT function is not new, even though it may be to you.  It concatenates all of the non-NULL values from a group and returns them within a comma-delimited string.  When utilized in conjunction with the GROUP BY clause, it permits us to include the group data in one row.

Observe the following rewrite of the last example, using the GROUP_CONCAT function:

SELECT CONCAT(CU.last_name, ', ', CU.first_name) AS customer,
        A.phone, 
        date(R.rental_date) AS rental_date,
        GROUP_CONCAT(F.title) AS titles,
        COUNT(*) AS rentals_count
 FROM sakila.rental R 
       LEFT JOIN sakila.inventory I ON R.inventory_id = I.inventory_id 
       LEFT JOIN sakila.film F ON I.film_id = F.film_id 
       LEFT JOIN sakila.film_category FC on F.film_id = FC.film_id 
       LEFT JOIN sakila.category C ON FC.category_id = C.category_id 
       LEFT JOIN sakila.customer CU ON R.customer_id = CU.customer_id
       LEFT JOIN sakila.address A ON CU.address_id = A.address_id
 WHERE C.name = "Horror" 
 GROUP BY R.customer_id
 HAVING rentals_count >= 3
 ORDER BY customer, title, rental_date DESC;

You can already see that the redundancy problem is solved because we no longer have to filter down our results.

The movie rentals are listed in the titles column below:

customer         phone         rental_date  titles                                                                 rentals_count                
 --------------------------------------------------------------------------------------------------------------------------------
 ADAM, NATHANIEL  111177206479  2005-08-22   FREDDY STORM,ANALYZE HOOSIERS,STRANGERS GRAFFITI                       3
 ANDREW, JOSE     961370847344  2005-07-31   EGYPT TENENBAUMS,LOLA AGENT,FIDELITY DEVIL,HIGH ENCINO                 4
 AQUINO, OSCAR    474047727727  2005-07-28   EGYPT TENENBAUMS,AFFAIR PREJUDICE,STREETCAR INTENTIONS,DRUMS DYNAMITE  4
 ARTIS, CARL      20064292617   2005-08-18   BOWFINGER GABLES,RULES HUMAN,YENTL IDAHO,FIDELITY DEVIL                4
 BARBEE, CLAYTON  380077794770  2005-05-26   BEHAVIOR RUNAWAY,LOVE SUICIDES,SWARM GOLD                              3
 etc...

The second problem is solved by displaying all of the group data in a single row.  From an application perspective, this offers the advantage of being able to access all of the group data in one operation.  In fact, it’s a fairly simple process, using a string splitting function, which most languages implement.  For example, PHP’s version is called “explode”.   You just pass in the delimiter and string; it returns the data as an array.  Here’s how we would retrieve the titles using the above query:

//fetch the result set
 $res=$mysqli--->query($select_statement);
 //iterate over each row
 while ($row = $res->fetch_array(MYSQLI_ASSOC)) {
   //this statement splits the titles string on
   //the comma separator into an array
   $titles_array = explode(',', $row['titles']);
   //work with the titles array...
 }

Another benefit to using the GROUP_CONCAT function is that the string value may be employed directly as part of an IN statement:

$res_films = $mysqli->query("SELECT * FROM sakila.film WHERE title = IN ($titles_array)");
 // work with $res_films...

Conclusion

Don’t want to use commas as delimiters?  Would you like to sort the items?  The GROUP_CONCAT function has options for both.  We won’t be getting into those topics today, but you can read up on the full syntax in the docs.

See all articles by Rob Gravelle



MySQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MySQL Forum
Topic By Replies Updated
MySQL in high availability and transction secure banking application klamor 1 August 28th, 10:24 AM
MySQL rollback UAL225 1 August 28th, 10:15 AM
Browsing a DB file that uses MySql finleytech 1 July 26th, 10:51 AM
php cookie won't pass variable to next page ITdevGirl 0 June 22nd, 12:13 PM