Analysis
Hopefully you spent a few minutes trying to figure this one out. It definitely took me some time over
a couple days with a good nights sleep in between.
There is nothing wrong with the two statements syntactically. I verified this by breaking down the
queries and executing parts of them at a time. First I ran the following:
select sum(oi.price)
from OrdLineDtl od, OrderLine ol
where od.TypID = 53
and od.AttrVal = 100
and od.OrdLineID = ol.OrdLineID
This worked fine. I then verified that the Atttr table has an entry with 53 and it contains
a value of 'Item'. All good so far.
At this point, I was definitely a little stumped. Why would one query work fine and
the other one fail. They both are looking to achieve the same goal. It was then that
I decided to check the query plans. I have been a writing a book for the SQL Server 2000
certification exam and one of the chapters I had just finished was on performance. So I
examined the execution plans for both queries. The query with the error (the first one)
did not produce an execution plan, so I settled for getting the plan for the second query.
Here it is:
In this query, the first thing that happens is the OrdLineDtl table is scanned for all rows with
a value of 53. This result is then filtered based on the AttrVal = 100. Next the join
occurs and the aggregate is computed, etc. So I started to think about why the first
query would not follow the same plan.
After a cup of coffee, a few walks around the office, and a lot of staring at the screen,
it finally hit me. I changed the query slightly to test my hypothesis to the following:
declare @this_id int
set @this_id = 100
select sum(ol.price)
from OrdLineDtl od, OrdLine ol
where od.TypID = ( select TypID
from Attr
where AttrName = 'Item'
)
and od.AttrVal = cast( @this_id as varchar( 10))
and od.OrdLineID = ol.OrdLineID
Shazam!!!! It worked just fine.
The problem that I was encountering was that the AttrVal field. I had initially
assumed (incorrectly) that the difference in the two queries (the subquery) was the problem.
In fact, the issue occurred because of the query plan that SQL Server was choosing. For the
query above, SQL Server will first evaluate the subquery and execute an ASSERT in the
query plan which results in 53. At this point in the execution plan, the OrdLineDtl
table is scanned for rows which match the @this_id value. A number of rows match
this qualification, though not all of them are numerical values. As this matching occurs, an
implicit conversion from the AttrVal varchar values to numeric values occurs. For the
rows that cannot be converted, an error occurs and the query ends.
We fixed this query by using a stored procedure to calculate the 53 value and then include that in
the query as a variable. This way, we still have a dynamic system that can handle
the Attr table changing ids (such as a delete of the value and then an insert).
Conclusions
I have never encountered this situation before and it took me more time than I would have
thought to find the problem. It was interesting to me because the problem was not the
difference between the two queries (the obvious choice) and instead was because of the
query plans chosen by SQL Server.
As always, I welcome feedback and hope that this may help someone in the future
when they have a problem with a subquery.
Steve Jones
March 2001