Have you ever wanted to return the last 100 rows of a given set of records? What about the second set of 100 records? Or possibly, you have wanted to return the first so many billing records for each billing month. This article will discuss how to use the TOP clause to help solve those requests where you want to restrict the number of records returned based on a record count. But first I will discuss the new features incorporated into the TOP clause with SQL Server 2005.
New Features in TOP Clause Implemented with SQL Server 2005
SQL Server 2005 brought a couple of new features to the TOP clause. The syntax of the TOP clause in SQL Server 2005 now looks like this:
TOP (expression) [PERCENT] [ WITH TIES ]
In SQL Server 2005 you are allowed to specify the number of rows you want to return with an expression, where as in prior versions you where only allowed to hard code a number. Being able to specify a variable allows you to more easily control the number of rows returned programmatically.
The second new feature is the “WITH TIES” options. This option instructs SQL Server to bring back additional rows should the last row have the same order by value as the next row. This option is only valid in a SELECT statement that contains an ORDER BY clause.
To demonstrate how to use these new TOP clause features let me go through a couple of examples.
Sample Data for Examples
Prior to going through my examples, I will need a table with some test data. Below is my sample data and a script to populate my sample TopDemo table:
PaymentID |
Amount |
PayDate |
1 |
12.23 |
1/2/2006 |
2 |
73.18 |
1/10/2006 |
3 |
92.20 |
1/21/2006 |
4 |
62.20 |
1/21/2006 |
5 |
12.45 |
1/23/2006 |
6 |
10.99 |
2/4/2006 |
7 |
34.23 |
2/7/2006 |
8 |
34.87 |
2/12/2006 |
9 |
58.32 |
2/15/2006 |
10 |
34.23 |
2/23/2006 |
11 |
98.26 |
3/1/2006 |
12 |
45.32 |
3/15/2006 |
13 |
47.30 |
3/16/2006 |
14 |
35.21 |
3/24/2006 |
15 |
57.23 |
3/26/2006 |
16 |
24.56 |
3/30/2006 |
17 |
34.23 |
4/4/2006 |
18 |
23.99 |
4/6/2006 |
19 |
56.32 |
4/8/2006 |
20 |
47.53 |
6/25/2006 |
21 |
30.56 |
7/7/2006 |
22 |
65.34 |
7/8/2006 |
23 |
44.55 |
8/4/2006 |
24 |
84.23 |
8/8/2006 |
25 |
23.56 |
8/22/2006 |
26 |
45.77 |
9/12/2006 |
27 |
65.90 |
9/25/2006 |
set nocount on create table TopDemo ( PaymentID int identity, Amount money, PayDate DateTime ) insert into TopDemo(Amount, PayDate) values (12.23,'1/2/2006') insert into TopDemo(Amount, PayDate) values (73.18,'1/10/2006') insert into TopDemo(Amount, PayDate) values (92.20,'1/21/2006') insert into TopDemo(Amount, PayDate) values (62.20,'1/21/2006') insert into TopDemo(Amount, PayDate) values (12.45,'1/23/2006') insert into TopDemo(Amount, PayDate) values (10.99,'2/4/2006') insert into TopDemo(Amount, PayDate) values (34.23,'2/7/2006') insert into TopDemo(Amount, PayDate) values (34.87,'2/12/2006') insert into TopDemo(Amount, PayDate) values (58.32,'2/15/2006') insert into TopDemo(Amount, PayDate) values (34.23,'2/23/2006') insert into TopDemo(Amount, PayDate) values (98.26,'3/1/2006') insert into TopDemo(Amount, PayDate) values (45.32,'3/15/2006') insert into TopDemo(Amount, PayDate) values (47.30,'3/16/2006') insert into TopDemo(Amount, PayDate) values (35.21,'3/24/2006') insert into TopDemo(Amount, PayDate) values (57.23,'3/26/2006') insert into TopDemo(Amount, PayDate) values (24.56,'3/30/2006') insert into TopDemo(Amount, PayDate) values (34.23,'4/4/2006') insert into TopDemo(Amount, PayDate) values (23.99,'4/6/2006') insert into TopDemo(Amount, PayDate) values (56.32,'4/8/2006') insert into TopDemo(Amount, PayDate) values (47.53,'6/25/2006') insert into TopDemo(Amount, PayDate) values (30.56,'7/7/2006') insert into TopDemo(Amount, PayDate) values (65.34,'7/8/2006') insert into TopDemo(Amount, PayDate) values (44.55,'8/4/2006') insert into TopDemo(Amount, PayDate) values (84.23,'8/8/2006') insert into TopDemo(Amount, PayDate) values (23.56,'8/22/2006') insert into TopDemo(Amount, PayDate) values (45.77,'9/12/2006') insert into TopDemo(Amount, PayDate) values (65.90,'9/25/2006')
Using an EXPRESSION in the TOP Clause
SQL Server 2005 makes it easier to programmatically control the number of records you want to return when you use the TOP clause. With these changes to the TOP clause, you can now use a variable to identify the number of rows to return. Below is an example that returns the top 3 records from my TopDemo table:
declare @top int set @top = 3 select top (@top) * from TopDemo
Here I defined an integer variable @top and set its value to 3. I then used that variable in the SELECT statement immediately following the TOP clause and enclosed it in parentheses. When the SQL Server 2005 engine executes this command, it evaluates the value of the variable @top to determine the number of rows to return. In my example, I return the following 3 records:
PaymentID Amount PayDate ----------- --------------------- ----------------------- 1 12.23 2006-01-02 00:00:00.000 2 73.18 2006-01-10 00:00:00.000 3 92.20 2006-01-21 00:00:00.000
Now if you have some code that determines the number of records you want to return, it is a simple matter of just getting that record number into a variable, then placing the variable into a TOP clause to constrain your returned record set to a specific number of records. Being able to dynamically control the number of rows returned in SQL Server 2005 means you no longer have to using dynamic SQL to accomplish this task.
Using the WITH TIES option
SQL Server 2005 also offers a WITH TIES option. This option allows you to specify that you want to return all the records that have the same value as last records returned by the TOP clause, based on the ORDER BY clause. Here is an example that demonstrates the WITH TIES option:
select top 3 with ties * from TopDemo order by PayDate
Here I am specifying that I want to return the top 3 records and any additional records that have the same PayDate as the third record. When I run this query this is my result set:
PaymentID Amount PayDate ----------- --------------------- ----------------------- 1 12.23 2006-01-02 00:00:00.000 2 73.18 2006-01-10 00:00:00.000 3 92.20 2006-01-21 00:00:00.000 4 62.20 2006-01-21 00:00:00.000
As you can see, 4 records were returned. This is because the third and forth record have the same PayDate. So now, if you want to return all the records that have the same ORDER BY value, but still what to loosely constrain your returned set to a number of records, then the WITH TIES option should help.
Returning the Second Set of 4 Records
Some times, you don’t want to return the top set of so many records, but you want to return the next set of so many records. Here is an example that returns the second set of 4 records:
select top 4 * from TopDemo a where PaymentID in (select top 8 PaymentID from TopDemo) order by PaymentID desc
In order to return the second set of 4 records I used a correlated subquery (for more information on correlated subqueries click here). The correlated subquery returns “PaymentID” column values for the top 8 records. I then return only the first 4 matching TopDemo records that have one of the 8 PaymentID’s returned from the subquery. Since I requested my final set to be to be sorted in descending order by the PaymentID I will get the second set of 4 records. Here is the result set from the above query:
PaymentID Amount PayDate ----------- --------------------- ----------------------- 8 34.87 2006-02-12 00:00:00.000 7 34.23 2006-02-07 00:00:00.000 6 10.99 2006-02-04 00:00:00.000 5 12.45 2006-01-23 00:00:00.000
As you can see, this set is ordered in descending sequence by PaymentID. If you really wanted the payments to be sorted sequentially by PaymentID, starting with the lowest PaymentID first, then you would have to run the following T-SQL statement, which sorts the final results set in ascending order by PaymentID:
select * from (select top 4 * from TopDemo a where PaymentID in (select top 8 PaymentID from TopDemo) order by PaymentID desc) a order by PaymentID
Here I just placed an alias named “a” around my original query’s result set, so I could re-sort my four records into PaymentID order.
How to get the BOTTOM 5 records
SQL Server does not supply a BOTTOM clause, but that doesn’t mean you can’t easily get the BOTTOM so many records. It is relatively simple to use the TOP clause to retrieve the bottom “x” number of records. Here is an example where I retrieved the bottom 5 records based on PayDate from my TopDemo table.
select top 5 * from TopDemo order by PayDate desc
In order to return the bottom five records all I had to do was specify the descending clause on my order by statement. This returned the bottom 5 records based on PayDate, although the final record set returned is ordered in descending order by PayDate. Below is the result set returned when I run the above query:
PaymentID Amount PayDate ----------- --------------------- ----------------------- 27 65.90 2006-09-25 00:00:00.000 26 45.77 2006-09-12 00:00:00.000 25 23.56 2006-08-22 00:00:00.000 24 84.23 2006-08-08 00:00:00.000 23 44.55 2006-08-04 00:00:00.000
Now normally you probably would want to return the bottom so many records and have those records be ordered based on a key in ascending order, instead of having them sort like my above results set where the PayDate is in descending order. This can be solved by wrapping another query around the above query, like so:
select * from (select top 5 * from TopDemo order by PayDate desc) a order by PayDate
The above query will order my final result set in ascending PayDate order. Below is the result set of the above query:
PaymentID Amount PayDate ----------- --------------------- ----------------------- 23 44.55 2006-08-04 00:00:00.000 24 84.23 2006-08-08 00:00:00.000 25 23.56 2006-08-22 00:00:00.000 26 45.77 2006-09-12 00:00:00.000 27 65.90 2006-09-25 00:00:00.000
How to get the First Record of Each Month
For my last example, I will show you how you can use the TOP clause to return the first TopDemo payment record for each PayDate month. To accomplish this I will once again use a correlated subquery to identify the top payment record for each month, as the code below demonstrates:
select * from TopDemo a where PayDate = (select top 1 PayDate from TopDemo where month(PayDate) = Month(a.PayDate) order by PayDate) order by PayDate
Here the inner query (“select top 1…..”) a correlated subquery takes the month identification from the outer query to select the first PayDate for the that specific month. The first PayDate of the month is then used in the WHERE clause to return the first payment for each month. Below is the result set from the above correlated subquery:
PaymentID Amount PayDate ----------- --------------------- ----------------------- 1 12.23 2006-01-02 00:00:00.000 6 10.99 2006-02-04 00:00:00.000 11 98.26 2006-03-01 00:00:00.000 17 34.23 2006-04-04 00:00:00.000 20 47.53 2006-06-25 00:00:00.000 21 30.56 2006-07-07 00:00:00.000 23 44.55 2006-08-04 00:00:00.000 26 45.77 2006-09-12 00:00:00.000
In my T-SQL code above, the inner query returns 8 different PayDate’s, one for the first payment date of each month that existed in my TopDemo table. These unique payment dates were then used in the outer query to return any TopDemo records that had one of those 8 payment dates.
Conclusion
The TOP clause is useful in helping you constrain your set to specific number of records. The new SQL Server 2005 features of the TOP clause now provides you with some alternatives for how to write your code for some specific data retrieve criteria. Being able to dynamically control the number of records returned is a big help when you don’t know exactly how many records you want to return. Now you can use the WITH TIES option as an easy method to return those records that have the same value as the last record in your TOP criteria.