Have you ever wanted to
return the last 100 rows of a given set of records? What about the second set
of 100 records? Or possibly, you have wanted to return the first so many
billing records for each billing month. This article will discuss how to use
the TOP clause to help solve those requests where you want to restrict the
number of records returned based on a record count. But first I will discuss
the new features incorporated into the TOP clause with SQL Server 2005.
New Features in TOP
Clause Implemented with SQL Server 2005
SQL Server 2005 brought a
couple of new features to the TOP clause. The syntax of the TOP clause in SQL
Server 2005 now looks like this:
TOP (expression) [PERCENT]
[ WITH TIES ]
In SQL Server 2005 you are
allowed to specify the number of rows you want to return with an expression,
where as in prior versions you where only allowed to hard code a number. Being
able to specify a variable allows you to more easily control the number of rows
returned programmatically.
The second new feature is
the WITH TIES options. This option instructs SQL Server to bring back
additional rows should the last row have the same order by value as the next
row. This option is only valid in a SELECT statement that contains an ORDER BY
clause.
To demonstrate how to use
these new TOP clause features let me go through a couple of examples.
Sample Data for Examples
Prior to going through my examples,
I will need a table with some test data. Below is my sample data and a script
to populate my sample TopDemo table:
|
PaymentID
|
Amount
|
PayDate
|
|
1
|
12.23
|
1/2/2006
|
|
2
|
73.18
|
1/10/2006
|
|
3
|
92.20
|
1/21/2006
|
|
4
|
62.20
|
1/21/2006
|
|
5
|
12.45
|
1/23/2006
|
|
6
|
10.99
|
2/4/2006
|
|
7
|
34.23
|
2/7/2006
|
|
8
|
34.87
|
2/12/2006
|
|
9
|
58.32
|
2/15/2006
|
|
10
|
34.23
|
2/23/2006
|
|
11
|
98.26
|
3/1/2006
|
|
12
|
45.32
|
3/15/2006
|
|
13
|
47.30
|
3/16/2006
|
|
14
|
35.21
|
3/24/2006
|
|
15
|
57.23
|
3/26/2006
|
|
16
|
24.56
|
3/30/2006
|
|
17
|
34.23
|
4/4/2006
|
|
18
|
23.99
|
4/6/2006
|
|
19
|
56.32
|
4/8/2006
|
|
20
|
47.53
|
6/25/2006
|
|
21
|
30.56
|
7/7/2006
|
|
22
|
65.34
|
7/8/2006
|
|
23
|
44.55
|
8/4/2006
|
|
24
|
84.23
|
8/8/2006
|
|
25
|
23.56
|
8/22/2006
|
|
26
|
45.77
|
9/12/2006
|
|
27
|
65.90
|
9/25/2006
|
set nocount on
create table TopDemo (
PaymentID int identity,
Amount money,
PayDate DateTime
)
insert into TopDemo(Amount, PayDate) values (12.23,'1/2/2006')
insert into TopDemo(Amount, PayDate) values (73.18,'1/10/2006')
insert into TopDemo(Amount, PayDate) values (92.20,'1/21/2006')
insert into TopDemo(Amount, PayDate) values (62.20,'1/21/2006')
insert into TopDemo(Amount, PayDate) values (12.45,'1/23/2006')
insert into TopDemo(Amount, PayDate) values (10.99,'2/4/2006')
insert into TopDemo(Amount, PayDate) values (34.23,'2/7/2006')
insert into TopDemo(Amount, PayDate) values (34.87,'2/12/2006')
insert into TopDemo(Amount, PayDate) values (58.32,'2/15/2006')
insert into TopDemo(Amount, PayDate) values (34.23,'2/23/2006')
insert into TopDemo(Amount, PayDate) values (98.26,'3/1/2006')
insert into TopDemo(Amount, PayDate) values (45.32,'3/15/2006')
insert into TopDemo(Amount, PayDate) values (47.30,'3/16/2006')
insert into TopDemo(Amount, PayDate) values (35.21,'3/24/2006')
insert into TopDemo(Amount, PayDate) values (57.23,'3/26/2006')
insert into TopDemo(Amount, PayDate) values (24.56,'3/30/2006')
insert into TopDemo(Amount, PayDate) values (34.23,'4/4/2006')
insert into TopDemo(Amount, PayDate) values (23.99,'4/6/2006')
insert into TopDemo(Amount, PayDate) values (56.32,'4/8/2006')
insert into TopDemo(Amount, PayDate) values (47.53,'6/25/2006')
insert into TopDemo(Amount, PayDate) values (30.56,'7/7/2006')
insert into TopDemo(Amount, PayDate) values (65.34,'7/8/2006')
insert into TopDemo(Amount, PayDate) values (44.55,'8/4/2006')
insert into TopDemo(Amount, PayDate) values (84.23,'8/8/2006')
insert into TopDemo(Amount, PayDate) values (23.56,'8/22/2006')
insert into TopDemo(Amount, PayDate) values (45.77,'9/12/2006')
insert into TopDemo(Amount, PayDate) values (65.90,'9/25/2006')
Using an EXPRESSION in the TOP Clause
SQL Server 2005 makes it
easier to programmatically control the number of records you want to return
when you use the TOP clause. With these changes to the TOP clause, you can
now use a variable to identify the number of rows to return. Below is an
example that returns the top 3 records from my TopDemo table:
declare @top int
set @top = 3
select top (@top) * from TopDemo
Here I defined an integer
variable @top and set its value to 3. I then used that variable in the SELECT
statement immediately following the TOP clause and enclosed it in parentheses.
When the SQL Server 2005 engine executes this command, it evaluates the value
of the variable @top to determine the number of rows to return. In my example,
I return the following 3 records:
PaymentID Amount PayDate
----------- --------------------- -----------------------
1 12.23 2006-01-02 00:00:00.000
2 73.18 2006-01-10 00:00:00.000
3 92.20 2006-01-21 00:00:00.000
Now if you have some code that
determines the number of records you want to return, it is a simple matter of
just getting that record number into a variable, then placing the variable into
a TOP clause to constrain your returned record set to a specific number of
records. Being able to dynamically control the number of rows returned in SQL
Server 2005 means you no longer have to using dynamic SQL to accomplish this
task.
Using the WITH TIES option
SQL Server 2005 also offers
a WITH TIES option. This option allows you to specify that you want to return
all the records that have the same value as last records returned by the TOP
clause, based on the ORDER BY clause. Here is an example that demonstrates the
WITH TIES option:
select top 3 with ties * from TopDemo
order by PayDate
Here I am specifying that I
want to return the top 3 records and any additional records that have the same
PayDate as the third record. When I run this query this is my result set:
PaymentID Amount PayDate
----------- --------------------- -----------------------
1 12.23 2006-01-02 00:00:00.000
2 73.18 2006-01-10 00:00:00.000
3 92.20 2006-01-21 00:00:00.000
4 62.20 2006-01-21 00:00:00.000
As you can see, 4 records were
returned. This is because the third and forth record have the same PayDate. So
now, if you want to return all the records that have the same ORDER BY value,
but still what to loosely constrain your returned set to a number of records,
then the WITH TIES option should help.
Returning the Second Set of 4 Records
Some times, you dont want
to return the top set of so many records, but you want to return the next set
of so many records. Here is an example that returns the second set of 4
records:
select top 4 * from TopDemo a
where PaymentID in (select top 8 PaymentID from TopDemo)
order by PaymentID desc
In order to return the second
set of 4 records I used a correlated subquery (for more information on
correlated subqueries click here).
The correlated subquery returns PaymentID column values for the top 8
records. I then return only the first 4 matching TopDemo records that have one
of the 8 PaymentIDs returned from the subquery. Since I requested my final
set to be to be sorted in descending order by the PaymentID I will get the
second set of 4 records. Here is the result set from the above query:
PaymentID Amount PayDate
----------- --------------------- -----------------------
8 34.87 2006-02-12 00:00:00.000
7 34.23 2006-02-07 00:00:00.000
6 10.99 2006-02-04 00:00:00.000
5 12.45 2006-01-23 00:00:00.000
As you can see, this set is
ordered in descending sequence by PaymentID. If you really wanted the payments
to be sorted sequentially by PaymentID, starting with the lowest PaymentID
first, then you would have to run the following T-SQL statement, which sorts
the final results set in ascending order by PaymentID:
select * from
(select top 4 * from TopDemo a
where PaymentID in (select top 8 PaymentID from TopDemo)
order by PaymentID desc) a
order by PaymentID
Here I just placed an alias
named a around my original querys result set, so I could re-sort my four
records into PaymentID order.
How to get the BOTTOM 5 records
SQL Server does not supply a
BOTTOM clause, but that doesnt mean you cant easily get the BOTTOM so many records.
It is relatively simple to use the TOP clause to retrieve the bottom x number
of records. Here is an example where I retrieved the bottom 5 records based on
PayDate from my TopDemo table.
select top 5 * from TopDemo
order by PayDate desc
In order to return the
bottom five records all I had to do was specify the descending clause on my
order by statement. This returned the bottom 5 records based on PayDate, although
the final record set returned is ordered in descending order by PayDate. Below
is the result set returned when I run the above query:
PaymentID Amount PayDate
----------- --------------------- -----------------------
27 65.90 2006-09-25 00:00:00.000
26 45.77 2006-09-12 00:00:00.000
25 23.56 2006-08-22 00:00:00.000
24 84.23 2006-08-08 00:00:00.000
23 44.55 2006-08-04 00:00:00.000
Now normally you probably
would want to return the bottom so many records and have those records be
ordered based on a key in ascending order, instead of having them sort like my
above results set where the PayDate is in descending order. This can be solved
by wrapping another query around the above query, like so:
select * from (select top 5 * from TopDemo
order by PayDate desc) a
order by PayDate
The above query will order
my final result set in ascending PayDate order. Below is the result set of the
above query:
PaymentID Amount PayDate
----------- --------------------- -----------------------
23 44.55 2006-08-04 00:00:00.000
24 84.23 2006-08-08 00:00:00.000
25 23.56 2006-08-22 00:00:00.000
26 45.77 2006-09-12 00:00:00.000
27 65.90 2006-09-25 00:00:00.000
How to get the First Record of Each Month
For my last example, I will
show you how you can use the TOP clause to return the first TopDemo payment
record for each PayDate month. To accomplish this I will once again use a
correlated subquery to identify the top payment record for each month, as the
code below demonstrates:
select * from TopDemo a
where PayDate = (select top 1 PayDate from TopDemo
where month(PayDate) = Month(a.PayDate)
order by PayDate)
order by PayDate
Here the inner query
(select top 1.....) a correlated subquery takes the month identification from
the outer query to select the first PayDate for the that specific month. The
first PayDate of the month is then used in the WHERE clause to return the first
payment for each month. Below is the result set from the above correlated
subquery:
PaymentID Amount PayDate
----------- --------------------- -----------------------
1 12.23 2006-01-02 00:00:00.000
6 10.99 2006-02-04 00:00:00.000
11 98.26 2006-03-01 00:00:00.000
17 34.23 2006-04-04 00:00:00.000
20 47.53 2006-06-25 00:00:00.000
21 30.56 2006-07-07 00:00:00.000
23 44.55 2006-08-04 00:00:00.000
26 45.77 2006-09-12 00:00:00.000
In my T-SQL code above, the
inner query returns 8 different PayDates, one for the first payment date of each
month that existed in my TopDemo table. These unique payment dates were then
used in the outer query to return any TopDemo records that had one of those 8
payment dates.
Conclusion
The TOP clause is useful in
helping you constrain your set to specific number of records. The new SQL
Server 2005 features of the TOP clause now provides you with some alternatives
for how to write your code for some specific data retrieve criteria. Being
able to dynamically control the number of records returned is a big help when
you dont know exactly how many records you want to return. Now you can use
the WITH TIES option as an easy method to return those records that have the
same value as the last record in your TOP criteria.
»
See All Articles by Columnist Gregory A. Larsen