Just SQL Part V - Counting with SQL
September 1, 2005
How many do I have? Do I have at least one? What is the greatest? What is the least? Give me the top 5 rows. These are just some of the types of questions that are often asked when trying to evaluate table data.
Many of the problems I see in answering reporting questions, and everyday DBA work, revolves around being able to sum, count, group, and select rows of information based on these selective functions. Most practitioners can count or sum values just fine but when they must answer questions on how the counting or summing relates to other information they do not quite know where to start. So how do we translate these types of questions into valid SQL statements that can answer many of these counting questions? This article will explore some of the more common questions I have come across in my days and give a clear pattern that can be used to develop your own SQL statements. Some are quite simplistic others are more abstract. Nevertheless, these statements should get you started in your quest to related summed information from your database.
We will again go back to our DOG_ORIGIN table, shown in Table 1, for these examples.
On the surface, the simplest of all counting questions is to determine how many of something there is in a table. In the DOG_ORIGIN table, we might want to ask, "How many dogs are considered 'Big'." To do this we can simply put together the following SQL statement. This will look through the table DOG_ORIGIN and count the number of rows that have the BREED_SIZE of 'Big'. You could put another predicate together and ask how many 'Big' dogs are in the USA, very simple. Most can develop this query; it is after creating this output that confusion sets in.
SQL> SELECT count(*) FROM dog_origin WHERE breed_size = 'Big'; COUNT(*) ---------- 6
To take the previous simple counting example a bit further, we might want to ask for a summing of the number of big dogs by country. This is done by the following SQL. Simply add the country column to the select list and then the 'GROUP BY' clause to show counts of dogs grouped by country. Not too difficult, but we are now beginning to see a relationship develop across the different countries and the number of dogs.
SQL> SELECT country, count(*) FROM dog_origin WHERE breed_size = 'Big' GROUP BY country; COUNTRY COUNT(*) ------------------------------ ---------- Germany 3 Switzerland 2 USA 1
Now let us say we want to know those countries that have big dogs but also have more than one. This could then be done by the following SQL. Everything stays the same as in our previous example but we must add the 'HAVING' keyword. This relates the count you have grouped against the country column and evaluates it to having more than the static number 1. Often the 'HAVING' clause actually will be evaluated to a dynamic variable as will be seen in our next example.
SQL> SELECT country, count(*) FROM dog_origin WHERE breed_size = 'Big' GROUP BY country HAVING count(*) > 1; COUNTRY COUNT(*) ------------------------------ ---------- Germany 3 Switzerland 2
Another question might be: What country has the most big dogs? This is a problem of ranking the number of dogs per country and then picking off the country that ranks number one. There are actually a few different ways to accomplish this somewhat confusing question. Both solutions use the concept of a sub-select statement. This is basically where we have a select statement embedded inside another select statement.
1. The first solution extends the previous solution and relates the 'HAVING count(*)' clause to a 'MAX(count(*))' variable in a sub-select statement. The sub-select statement uses the MAX function with a GROUP BY clause that returns the maximum number of dogs found across the countries. I would encourage you to create the DOG_ORIGIN table with data and run this query separately. You will find out that it returns the value of 3, which is the number of big dogs in Germany.
SQL> SELECT a.country, count(*) FROM dog_origin a WHERE a.breed_size = 'Big' GROUP BY a.country HAVING count(*) = (SELECT MAX(count(*)) FROM dog_origin b where b.breed_size = 'Big' GROUP BY b.country)
2. If we are talking about Oracle, and I usually do, there is a function called 'RANK' that can be used to tackle this type of question. Here we also have implemented a sub-select statement that looks very similar to the ones in above examples. The only difference is the new column 'rank' has been introduced that uses the RANK statement. The rank function basically assigns a ranking to the variable count(*) which is our grouping of big dogs by country. Be careful here, as I have also included a descending order to the ORDER BY clause in the rank column. If I did not, then since Oracle defaults to ascending order, USA would have shown up as the top ranked owner of big dogs, which is what we do not want.
SQL> SELECT country, count_dogs FROM (SELECT country, COUNT(*) count_dogs, RANK() OVER (ORDER BY COUNT(*) DESC) rank FROM dog_origin WHERE breed_size = 'Big' GROUP BY country ) WHERE rank = 1 COUNTRY COUNT_DOGS ------------------------------ ---------- Germany 3
For our last trick here, I often see requests on how to turn a table completely around and display the actual BREED_SIZE into a column and have those counts represented as rows in a table. Other than changing your database model to handle this more effectively, here is a simple method in Oracle that you can use to accomplish this. Please note that even though this works, you must know the possible values in the BREED_SIZE column that you want to turn into columns.
SQL> SELECT country, SUM(DECODE(breed_size,'Big' ,1,0)) Big, SUM(DECODE(breed_size,'Medium',1,0)) Medium, SUM(DECODE(breed_size,'Small' ,1,0)) Small FROM dog_origin GROUP BY country COUNTRY BIG MEDIUM SMALL ------------ ---------- ---------- ---------- Australia 0 1 1 Germany 3 0 0 Switzerland 2 1 0 USA 1 2 0
Pulling the count() of something from tables in Oracle or any other database is not the issue. This is quite easy and is probably one of the very first statements we learn to execute. The problem comes when we need to relate, group, or select in some fancy way this counting to something else in our database. These SQL statements should give you a good base to launch off and begin relating those valuable counts.