dcsimg

How Operator Precedence Affects MySQL SELECT Queries

September 22, 2010

When joining tables on common fields, if you're not diligent in the use of AND-OR combinations in the criteria clause, be prepared for unexpected results! Learn how to break filtering criteria into individual steps and how to group AND-OR statements to most effectively isolate the data that you wish to retrieve.

The following code can be used to create and populate the employees and shops tables. If you’ve downloaded the table data from my recent articles, you don’t need these as they are the same:

#
# Table structure for table 'employees'
#
CREATE TABLE /*!32312 IF NOT EXISTS*/ "employees" (
  "id" int(11) NOT NULL AUTO_INCREMENT,
  "shop_id" int(11) NOT NULL DEFAULT '0',
  "gender" enum('m','f') NOT NULL,
  "name" varchar(32) NOT NULL,
  "salary" int(11) DEFAULT NULL,
  PRIMARY KEY ("id","shop_id"),
  KEY "Foreign Key" ("shop_id")
) AUTO_INCREMENT=15;
#
# Data for table 'employees'
#
LOCK TABLES "employees" WRITE;
/*!40000 ALTER TABLE "employees" DISABLE KEYS;*/
INSERT INTO "employees" ("id", "shop_id", "gender", "name", "salary") 
    VALUES (1,1,'m','Jon Simpson',4500);
INSERT INTO "employees" ("id", "shop_id", "gender", "name", "salary") 
    VALUES (2,1,'f','Barbara Breitenmoser',NULL);
INSERT INTO "employees" ("id", "shop_id", "gender", "name", "salary") 
    VALUES (3,2,'f','Kirsten Ruegg',5600);
INSERT INTO "employees" ("id", "shop_id", "gender", "name", "salary") 
    VALUES (4,3,'m','Ralph Teller',5100);
INSERT INTO "employees" ("id", "shop_id", "gender", "name", "salary") 
    VALUES (5,3,'m','Peter Jonson',5200);
/*!40000 ALTER TABLE "employees" ENABLE KEYS;*/
UNLOCK TABLES;
#
# Table structure for table 'shops'
#
CREATE TABLE /*!32312 IF NOT EXISTS*/ "shops" (
  "shop_id" int(11) NOT NULL AUTO_INCREMENT,
  "shop" varchar(32) DEFAULT NULL,
  PRIMARY KEY ("shop_id")
) AUTO_INCREMENT=4;

#
# Data for table 'shops'
#
LOCK TABLES "shops" WRITE;
/*!40000 ALTER TABLE "shops" DISABLE KEYS;*/
INSERT INTO "shops" ("shop_id", "shop") 
    VALUES (1,'Zurich');
INSERT INTO "shops" ("shop_id", "shop") 
    VALUES (2,'New York');
INSERT INTO "shops" ("shop_id", "shop") 
    VALUES (3,'London');
/*!40000 ALTER TABLE "shops" ENABLE KEYS;*/
UNLOCK TABLES;

Here’s a simple enough query that attempts to seek employees who work in Zurich or have a monthly salary in excess of $5,500:

SELECT name, 
       gender, 
       CONCAT('$', FORMAT(salary, 2)) AS 'Monthly Salary' 
FROM   employees,
       shops
WHERE  employees.shop_id = shops.shop_id
AND    shops.shop = 'Zurich' 
OR     employees.salary > 5500; 

This result set is probably not what the query writer had in mind (notice the three rows of ‘Kristen Ruegg’):

name

gender

Monthly Salary

Jon Simpson

m

$4,500.00

Barbara Breitenmoser

m

(NULL)

Kirsten Ruegg

f

$5,600.00

Kirsten Ruegg

f

$5,600.00

Kirsten Ruegg

f

$5,600.00

So, what went wrong?

It all comes down to something called Operator Precedence, which is a concept that applies to any discipline that utilizes operators; the two main ones being mathematics and computer languages. Precedence rules, also known as the order of operations, are used to clarify which procedures should be performed first. In math for instance, multiplication and division take precedence over addition and subtraction. Hence, the expression 10 + 100 / 5 equals 30 and not 22. The same idea applies to SQL, where precedence rules govern which operator will be applied in a given expression before the others. Here is the operator list for MySQL, from first to last:

INTERVAL
BINARY, COLLATE
!
- (unary minus), ~ (unary bit inversion)
^
*, /, DIV, %, MOD
-, +
<<, >>
&
|
=, <=>, >=, >, <=, <, <>, !=, IS, LIKE, REGEXP, IN
BETWEEN, CASE, WHEN, THEN, ELSE
NOT
&&, AND
XOR
||, OR
:=

Please before Thank You, AND Before OR

While both are quite low on the precedence scale, the AND does come just before the Exclusive OR (XOR) and OR operators. So what does that mean for our query above? To answer that question, let’s take a look at the statement again, but with the implied parentheses:

SELECT name, 
       gender, 
       CONCAT('$', FORMAT(salary, 2)) AS 'Monthly Salary' 
FROM   employees,
       shops
WHERE (employees.shop_id = shops.shop_id
AND    shops.shop = 'Zurich') 
OR     employees.salary > 5500; 

In last week’s Identifying and Eliminating the Dreaded Cartesian Product article, we saw how improper or missing table joins can wreak devastating effects on our results. And that’s exactly what is happening here. The AND clause is sticking to the line above it, causing the join to be limited to it, while leaving the OR part of the statement left standing on its own. After including the implied parentheses around the first two criteria statements, we can see that the query is fetching employees that work in the Zurich office, but it is also retrieving an employee for every row in the shops table whose salary is over $5500. Careful analysis of the data confirms that “Kirsten Ruegg” is in fact the only employee who makes that amount. The fact that she works in New York is irrelevant because the OR statement is not linked to the shops table. That, my friends, is a Cartesian Product.

Linking the OR statement to the shops table removes the Cartesian Product and returns the correct results:

SELECT name, 
       gender, 
       CONCAT('$', FORMAT(salary, 2)) AS 'Monthly Salary' 
FROM   employees,
       shops
WHERE (employees.shop_id = shops.shop_id
AND    shops.shop = 'Zurich')
OR    (employees.shop_id = shops.shop_id 
AND    employees.salary > 5500; 

name

gender

Monthly Salary

Jon Simpson

m

$4,500.00

Barbara Breitenmoser

m

(NULL)

Kirsten Ruegg

f

$5,600.00

An easier way to express the same criteria statement is to enclose the filtering criteria in parentheses so that it is separate from the table join:

SELECT name, 
       gender, 
       CONCAT('$', FORMAT(salary, 2)) AS 'Monthly Salary' 
FROM   employees,
       shops
WHERE  employees.shop_id = shops.shop_id
AND   (shops.shop = 'Zurich'
OR     employees.salary > 5500); 

Here again are the correct results:

name

gender

Monthly Salary

Jon Simpson

m

$4,500.00

Barbara Breitenmoser

m

(NULL)

Kirsten Ruegg

f

$5,600.00

The Proper Use of Parentheses

The moral of this story is clear: Always Enclose Criteria Selections in Parentheses. In fact, the more filtering criteria that you have, the more important that it becomes to use parentheses to isolate specific criteria, especially when mixing operators of differing precedence. The following SELECT query also includes the IS and NOT operators.

SELECT name, 
       gender, 
       CONCAT('$', FORMAT(salary, 2)) AS 'Salary' 
FROM   employees,
       shops
WHERE  employees.shop_id = shops.shop_id
AND    shops.shop = 'Zurich'
AND    employees.salary IS NOT NULL
OR     employees.salary > 5500; 

name

gender

Monthly Salary

Jon Simpson

m

$4,500.00

Kirsten Ruegg

f

$5,600.00

Kirsten Ruegg

f

$5,600.00

Kirsten Ruegg

f

$5,600.00

While the above statement does succeed in eliminating the row that contained the NULL salary, it still produces the same Cartesian Product as the original query:

Unfortunately, knowing that we need parentheses is only half the battle. The other half is to know where to put them! Guess wrong, and you’ll retrieve the wrong data. This is where it helps to be really clear on what exactly you want to see. Watch what happens when we include different lines within the parentheses. First, the same AND-OR combination as the previous example:

SELECT name, 
       gender, 
       CONCAT('$', FORMAT(salary, 2)) AS 'Salary' 
FROM   employees,
       shops
WHERE  employees.shop_id = shops.shop_id
AND    shops.shop = 'Zurich'
AND   (employees.salary IS NOT NULL
OR     employees.gender = 'f');

The above statement says:

“Show me the employee info for those who work in the Zurich shop.

Of those, only include employees whose salary is above $5500

OR whose gender is female.”

We can follow the order of operations to see what should come back. To begin, we’ll execute the first statement above to see the employees who work in the Zurich shop. The easiest way to accomplish this is to comment out the lines that we don’t want. There are two kinds of comments to choose from:

A double dash comments out one line: --This is a comment.

/* and */ enclose one or more lines of comments. Here’s the same query as above with the last two lines commented out:

SELECT name, 
       gender, 
       CONCAT('$', FORMAT(salary, 2)) AS 'Salary' 
FROM   employees,
       shops
WHERE  employees.shop_id = shops.shop_id
AND    shops.shop = 'Zurich'
/* AND   (employees.salary IS NOT NULL
OR     employees.gender = 'f'); */

The query now returns only employees who work in the Zurich office:

name

gender

Salary

Jon Simpson

m

$4,500.00

Barbara Breitenmoser

f

(NULL)

Now we’ll run the same query with the shop filter commented out instead:

SELECT name, 
       gender, 
       CONCAT('$', FORMAT(salary, 2)) AS 'Salary' 
FROM   employees,
       shops
WHERE  employees.shop_id = shops.shop_id
-- AND    shops.shop = 'Zurich'
AND   (employees.salary IS NOT NULL
OR     employees.gender = 'f');

That produces a list of employees who either have any non-null salary value OR are female:

name

gender

Salary

Jon Simpson

m

$4,500.00

Barbara Breitenmoser

f

(NULL)

Kirsten Ruegg

f

$5,600.00

Ralph Teller

m

$5,100.00

Peter Jonson

m

$5,200.00

Since the two filtering criteria are joined by an AND, only records that are found in both record sets show up in the final results:

name

gender

Salary

Jon Simpson

m

$4,500.00

Barbara Breitenmoser

f

(NULL)

The alternative is to include the three filtering criteria within parentheses:

SELECT name, 
       gender, 
       CONCAT('$', FORMAT(salary, 2)) AS 'Salary' 
FROM   employees,
       shops
WHERE  employees.shop_id = shops.shop_id
AND   (shops.shop = 'Zurich'
AND    employees.salary IS NOT NULL
OR     employees.gender = 'f');

Due to the higher operator precedent of the AND keyword, this SELECT statement says:

“Show me the employee info for those who work in the Zurich shop

AND whose salary is any non-null value.

In addition to those, show me employees whose gender is female.”

This causes Kirsten Ruegg to be picked up, as she satisfies the female gender criteria, even though she does not work in the Zurich shop:

name

gender

Salary

Jon Simpson

m

$4,500.00

Barbara Breitenmoser

f

(NULL)

Kirsten Ruegg

f

$5,600.00

As you can see, operator precedence can have subtle but important effects on your SELECT query result sets. It pays to go through the trouble of breaking complex criteria into individual steps and to include parentheses wherever you want to group multiple criteria together.

» See All Articles by Columnist Rob Gravelle








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers