MySQL SELECT Statement 101
August 27, 2010
The SELECT statement is without question the most complex amongst MySQL's data manipulation statements. This article covers the syntax of the SELECT statement, exploring its many optional clauses.
The SELECT statement is without question the most complex amongst MySQLs data manipulation statements. Not surprising when you consider that the foremost purpose of structured query language (SQL) is to retrieve information from a relational database which adheres to a given criteria. Entire books have been written about how to construct a query to fetch the data that youre after. Thats not the purpose of this article. Our goal will be to cover the syntax of the SELECT statement and gain an understanding of its many optional clauses. Well start at the beginning and work our way through them. When were done, youll understand which clauses to use for different purposes. Well leave the twenty table joins for another day.
Heres some code to create and populate a couple of tables. If youve been following my data manipulation statements series thus far, youll recognize them as the same ones that weve used previously.
# # Table structure for table 'employees' # CREATE TABLE /*!32312 IF NOT EXISTS*/ "employees" ( "id" int(11) NOT NULL AUTO_INCREMENT, "shop_id" int(11) NOT NULL DEFAULT '0', "gender" enum('m','f') NOT NULL, "name" varchar(32) NOT NULL, "salary" int(11) DEFAULT NULL, PRIMARY KEY ("id","shop_id"), KEY "Foreign Key" ("shop_id") ) AUTO_INCREMENT=15; # # Data for table 'employees' # LOCK TABLES "employees" WRITE; /*!40000 ALTER TABLE "employees" DISABLE KEYS;*/ INSERT INTO "employees" ("id", "shop_id", "gender", "name", "salary") VALUES (1,1,'m','Jon Simpson',4500); INSERT INTO "employees" ("id", "shop_id", "gender", "name", "salary") VALUES (2,1,'f','Barbara Breitenmoser',NULL); INSERT INTO "employees" ("id", "shop_id", "gender", "name", "salary") VALUES (3,2,'f','Kirsten Ruegg',5600); INSERT INTO "employees" ("id", "shop_id", "gender", "name", "salary") VALUES (4,3,'m','Ralph Teller',5100); INSERT INTO "employees" ("id", "shop_id", "gender", "name", "salary") VALUES (5,3,'m','Peter Jonson',5200); /*!40000 ALTER TABLE "employees" ENABLE KEYS;*/ UNLOCK TABLES; # # Table structure for table 'shops' # CREATE TABLE /*!32312 IF NOT EXISTS*/ "shops" ( "shop_id" int(11) NOT NULL AUTO_INCREMENT, "shop" varchar(32) DEFAULT NULL, PRIMARY KEY ("shop_id") ) AUTO_INCREMENT=4; # # Data for table 'shops' # LOCK TABLES "shops" WRITE; /*!40000 ALTER TABLE "shops" DISABLE KEYS;*/ INSERT INTO "shops" ("shop_id", "shop") VALUES (1,'Zurich'); INSERT INTO "shops" ("shop_id", "shop") VALUES (2,'New York'); INSERT INTO "shops" ("shop_id", "shop") VALUES (3,'London'); /*!40000 ALTER TABLE "shops" ENABLE KEYS;*/ UNLOCK TABLES;
The Easiest Query You Can Write
SELECT 'Easy!'; Outputs Easy!
Were you expecting more? At its core, the SELECT outputs whatever comes after it, much like the old C/C++ printf function. Although its usually used to extract data from one or more tables, it also accepts expressions. It just so happens that literal values are a type of expression. Similarly, numeric values and null are equally valid.
SELECT 4; Outputs 4
SELECT null; Outputs (null)
Another type of expression is the output of one or more functions. By applying both built-in and custom stored functions to a value, you can manipulate the data to suit your desired specifications. The following combines the CURDATE() and DATE_FORMAT() functions to output the weekday, month name, and year of the current date:
SELECT null; Outputs (null)
Outputs something like:
Even the MySQL documentation defines the expression (select_expr) as indicating a column that you want to retrieve. As well see later on, there are times that you want to specify a value that is not in a table column.
As our first example illustrates, expressions are returned with the expression code in the column header. This helps to identify what produced the returned values. Sometimes, you may want something shorter instead. You can rename any column to a value of your choosing using column aliasing. Just append the AS 'column name' after the expression:
SELECT expression AS 'column name';
Note that the quotes around the column alias are not required for single words, but it is a good practice to always use them.
Here is the same query as above with an alias:
SELECT DATE_FORMAT(CURDATE(), '%W %M %Y') AS 'Today\'s Date';
The presence of the apostrophe in the name complicates matters ever-so-slightly because it uses the same ASCII value as the single quote. There are two ways to handle this. One is to escape it by including a backslash (\) before it, as I did here. The other is to use double quotes around the alias name instead of single ones; both single and double quotes are valid string delimiters.
The Standard Query
Our second form of the SELECT statement is probably more familiar to you. It retrieves data from one or more table columns. The optional WHERE clause can be added to filter the results based on a given criteria set.
SELECT expression [, select_expr ...] FROM table_reference [, table_reference ...] [WHERE where_condition];
Without the WHERE condition, the query will return every row in the table. The usefulness of such a query is to reorder table columns and/or remove some non-pertinent ones from the resultset. Heres a query that omits the ID fields as well as reordering the name and gender columns. There are also a couple of function calls to format the salary as currency.
SELECT name, gender, CONCAT('$', FORMAT(salary, 2)) AS 'Salary' FROM employees;
Here is the resulting record set:
Querying from Multiple Tables
The real power of the SELECT statement becomes apparent when you combine tables in a meaningful way. When data is fully normalized, it can be difficult to see how tables relate to each other. Thats where table joins come in. By linking tables by related columns, we can gain a clearer picture of their relationship. Say we wanted to know which employees worked in the Zurich shop. The way to link the tables would be via the common shop_id field. Additional filtering criteria come after the AND. The reason that I say additional is because joining tables in this way only returns rows where both tables contain the same values. If either table contains a shop_id that does not link to the other table, these orphaned rows do not appear in the result set. There are other types of joins that can be used to get around this. There is an excellent article on join types on the developer.com website. The MySQL Docs also contain useful information about table joins.
Here is the code to list employees who work in the Zurich shop:
SELECT name, gender, CONCAT('$', FORMAT(salary, 2)) AS 'Salary' FROM employees, shops WHERE employees.shop_id = shops.shop_id AND shops.shop = 'Zurich';
The above query produces the following two records:
In the above query, we matched the entire shop value against a single, specific string. You can cast a wider net by combining comparison operators and/or functions. Here are some of the more common operators. For a full list, see the MySQL documentation:
SELECT name, gender, CONCAT('$', FORMAT(salary, 2)) AS 'Salary' FROM employees, shops WHERE employees.shop_id = shops.shop_id AND shops.shop Like 'Z*' AND shops.shop Not Like 'Za*' AND shops.shop Not Like 'Ze*' AND shops.shop IS NOT NULL;
That concludes our look at the basic syntax of the SELECT statement. Next time, Ill be sharing words of warning regarding the dangers of AND/OR combining, as well as the concept called operator precedence.