Alright, you can get your data back out now. Let’s look at some ways to make your DBMS do some extra work for you by sorting, grouping and narrowing down choices.
Some more about WHERE
In Part 4 you learned about using WHERE to filter your data. What if you have a much more complex set of criteria that you would like to apply to filter? Well, you’re in luck. SQL gives you quite a lot of flexibility in defining how you filter your data.
What if I have more than one criteria? You can use AND and OR. Both AND and OR do exactly what you might think, they allow you to string as many conditions together as you like. Here’s and example of both:
SELECT *
FROM table
WHERE first_name='John' AND
(last_name='Lennon' OR last_name='Mellencamp');
So, how does that work? It simply states that we are looking for anyone with the first name of John and the last name of Lennon or Mellencamp. They must have the first name of John but can have either the last name of Lennon or Mellencamp.
The parentheses not only help you to visually see how things are grouped together but they also let the DBMS know what to do. Imagine if you moved the parentheses around to look something like this:
SELECT *
FROM table
WHERE (first_name='John' AND last_name='Lennon')
OR last_name='Mellencamp';
Now what will the DBMS look for? In this case the DBMS will send back anyone named John Lennon or anyone with the last name of Mellencamp. Herman Mellencamp would be a valid name using this criteria but would not be valid in the first example.
How about this scenario. You want to find the data on John Mellencamp but you can’t remember how to spell his last name. (Just play along …) You do remember that the last 4 characters are “camp”, though. How are you going to find him?
Well, SQL has included several wildcards to help you out. Different wildcards tell the DBMS to do different things. And just to make things more confusing, different DBMS’s use different characters for their wildcards, so be sure to check your DBMS before you start throwing wildcards around.
In our scenario above this is how the wildcard would work:
SELECT *
FROM table
WHERE last_name LIKE '%camp';
Hey! What’s that LIKE doing there? The LIKE lets the DBMS know that we won’t be doing a strict comparison like = or < or > but we will be using wildcards in our comparison.
So, can I use LIKE and = in the same WHERE clause? You sure can just as long as you don’t try to combine them into a single comparison like last_name = LIKE ‘%camp’.
Something like this would be valid, though:
WHERE first_name='John' AND
last_name LIKE '%camp'
Now, back to the wildcard itself. The wildcard tells the DBMS to look for anyone that has “camp” as the last 4 characters in their last name. The % says everything before the last 4 characters is irrelevant. So, what if you moved that % after the “p”?
WHERE last_name LIKE 'camp%'
It will return all last names beginning with “camp” which probably will be none since we never capitalized the “c”. Remember, searching for a string (set of alphanumeric characters) is case sensitive. If you are ever unsure of the case of something you may want to convert everything to upper or lower case before you make your comparisons. I’ll show you how to do this later in this series.
Alright, how about something like this now:
WHERE last_name LIKE '%camp%'
This will search for anyone with the characters “camp” anywhere in the last name. It will return all of these examples: “camp”, “campbell” and “mocamp”.
As you can see, it is very important where you place your wildcards. It is also important that you don’t overuse the wildcards. Wildcards can easily get out of control if you overuse them.
There are other wildcards as well like the brackets [] and the underscore _. These wildcards are a bit more complex and beyond the scope of this lesson but we may come back to them later.
What about Sorting?
What if I want to sort my data? Do I have to write some customized sorting script to sort the data after I get it out?
Thankfully, no. SQL has already done that work for you. Here’s an example of how you would retrieve and sort all of the last names in our “contacts” table that we created way back in Parts 1 and 2:
SELECT *
FROM table
ORDER BY last_name;
How’s that for easy? The ORDER BY clause tells the DBMS to perform a sort on the data using the “last_name” column. Now the data that you get out will be pre-sorted by last name from A to Z.
So, what if I wanted it to be sorted in reverse order from Z to A? That’s easy too:
SELECT *
FROM table
ORDER BY last_name DESC;
In this case the DESC is used to tell the DBMS to sort by last name in descending order. By default a DBMS will sort in ascending order and you don’t really need to specify it. If you are like me, though, and you want to make it abundantly clear to anyone reading your code what is going on you can use ASC in your ORDER BY clause like this:
ORDER BY last_name ASC
What if I want to sort by more than one column? Can it be done?
Yep. To search by more than one column you list the columns you want to be sorted in the order that you want them sorted. If you wanted to sort by last name and then first name, for example, it would look something like this:
ORDER BY last_name, first_name ASC
Sorting in SQL is straight forward and very very very handy. You will probably find yourself using ORDER BY frequently.
What about Grouping?
Imagine this scenario, you are running a gift catalog business for corporations and you want to know how many items have been sold to each corporate client you have. To accomplish this task you will need to learn about COUNT and GROUP BY.
Let’s take a look at COUNT first. COUNT keeps a running total of the number of records retrieved. As a general rule, you will usually put any count totals into a temporary column. Here’s an example of how all of that might work:
SELECT COUNT(*) AS sales_total
FROM sales;
So, what you end up with here is the total number of all of the rows on the “sales” table which will be stored in the temporary column of “sales_total”. Remember, “sales_total” is not a permanent part of your table but is rather a temporary column that you can refer to to get the count number from. You can use whatever name you like as your temporary column name. This may not make much sense with this example but it will be more clear when we do the grouping in a minute.
Now, let’s do a little grouping:
SELECT company_name,
COUNT(*) AS sales_by_company_total
FROM sales
GROUP BY company_name;
What this will do is search the “sales” table and count the total number of rows grouped by the company name which is what we want. The GROUP BY tells the DBMS to group everything by company name. The COUNT then keeps a running total in the temporary column “sales_by_company_total” in accordance with the GROUP BY clause. Notice I changed “sales_total” to “sales_by_company_total” so that it would be a bit more descriptive. I also added the “company_name” column in the table to the SELECT statement so that our output will have both the company’s name and total sales. The output would look something like this:
company_name sales_by_company_total
-----------------------------------------
Amazon.com 478
Sears 222
CompUSA 512
Dollar General Stores 6
So, what if I wanted to sort the grouped output?
Not a problem, just add your ORDER BY clause at the end like this:
SELECT company_name,
COUNT(*) AS sales_by_company_total
FROM sales
GROUP BY company_name
ORDER BY company_name ASC;
Grouping data can be almost as handy as sorting but you probably won’t find as many occasions to use it. Knowing how to use the built in functions of SQL can save you a great deal of time, effort and frustration. So, before you try writing any custom scripting be sure to thoroughly check out all of the functions available with your DBMS. You will probably be surprised how much has already been done for you.