Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MS SQL

Posted Dec 19, 2006

Creative Ways to Use the TOP Clause

By Gregory A. Larsen

Have you ever wanted to return the last 100 rows of a given set of records? What about the second set of 100 records? Or possibly, you have wanted to return the first so many billing records for each billing month. This article will discuss how to use the TOP clause to help solve those requests where you want to restrict the number of records returned based on a record count. But first I will discuss the new features incorporated into the TOP clause with SQL Server 2005.

New Features in TOP Clause Implemented with SQL Server 2005

SQL Server 2005 brought a couple of new features to the TOP clause. The syntax of the TOP clause in SQL Server 2005 now looks like this:

TOP (expression) [PERCENT]
     [ WITH TIES ]

In SQL Server 2005 you are allowed to specify the number of rows you want to return with an expression, where as in prior versions you where only allowed to hard code a number. Being able to specify a variable allows you to more easily control the number of rows returned programmatically.

The second new feature is the “WITH TIES” options. This option instructs SQL Server to bring back additional rows should the last row have the same order by value as the next row. This option is only valid in a SELECT statement that contains an ORDER BY clause.

To demonstrate how to use these new TOP clause features let me go through a couple of examples.

Sample Data for Examples

Prior to going through my examples, I will need a table with some test data. Below is my sample data and a script to populate my sample TopDemo table:

PaymentID

Amount

PayDate

1

12.23

1/2/2006

2

73.18

1/10/2006

3

92.20

1/21/2006

4

62.20

1/21/2006

5

12.45

1/23/2006

6

10.99

2/4/2006

7

34.23

2/7/2006

8

34.87

2/12/2006

9

58.32

2/15/2006

10

34.23

2/23/2006

11

98.26

3/1/2006

12

45.32

3/15/2006

13

47.30

3/16/2006

14

35.21

3/24/2006

15

57.23

3/26/2006

16

24.56

3/30/2006

17

34.23

4/4/2006

18

23.99

4/6/2006

19

56.32

4/8/2006

20

47.53

6/25/2006

21

30.56

7/7/2006

22

65.34

7/8/2006

23

44.55

8/4/2006

24

84.23

8/8/2006

25

23.56

8/22/2006

26

45.77

9/12/2006

27

65.90

9/25/2006

set nocount on
create table TopDemo (
PaymentID int identity, 
Amount money, 
PayDate DateTime
)
insert into TopDemo(Amount, PayDate) values (12.23,'1/2/2006')
insert into TopDemo(Amount, PayDate) values (73.18,'1/10/2006')
insert into TopDemo(Amount, PayDate) values (92.20,'1/21/2006')
insert into TopDemo(Amount, PayDate) values (62.20,'1/21/2006')
insert into TopDemo(Amount, PayDate) values (12.45,'1/23/2006')
insert into TopDemo(Amount, PayDate) values (10.99,'2/4/2006')
insert into TopDemo(Amount, PayDate) values (34.23,'2/7/2006')
insert into TopDemo(Amount, PayDate) values (34.87,'2/12/2006')
insert into TopDemo(Amount, PayDate) values (58.32,'2/15/2006')
insert into TopDemo(Amount, PayDate) values (34.23,'2/23/2006')
insert into TopDemo(Amount, PayDate) values (98.26,'3/1/2006')
insert into TopDemo(Amount, PayDate) values (45.32,'3/15/2006')
insert into TopDemo(Amount, PayDate) values (47.30,'3/16/2006')
insert into TopDemo(Amount, PayDate) values (35.21,'3/24/2006')
insert into TopDemo(Amount, PayDate) values (57.23,'3/26/2006')
insert into TopDemo(Amount, PayDate) values (24.56,'3/30/2006')
insert into TopDemo(Amount, PayDate) values (34.23,'4/4/2006')
insert into TopDemo(Amount, PayDate) values (23.99,'4/6/2006')
insert into TopDemo(Amount, PayDate) values (56.32,'4/8/2006')
insert into TopDemo(Amount, PayDate) values (47.53,'6/25/2006')
insert into TopDemo(Amount, PayDate) values (30.56,'7/7/2006')
insert into TopDemo(Amount, PayDate) values (65.34,'7/8/2006')
insert into TopDemo(Amount, PayDate) values (44.55,'8/4/2006')
insert into TopDemo(Amount, PayDate) values (84.23,'8/8/2006')
insert into TopDemo(Amount, PayDate) values (23.56,'8/22/2006')
insert into TopDemo(Amount, PayDate) values (45.77,'9/12/2006')
insert into TopDemo(Amount, PayDate) values (65.90,'9/25/2006')

Using an EXPRESSION in the TOP Clause

SQL Server 2005 makes it easier to programmatically control the number of records you want to return when you use the TOP clause. With these changes to the TOP clause, you can now use a variable to identify the number of rows to return. Below is an example that returns the top 3 records from my TopDemo table:

declare @top int
set @top = 3
select top (@top) * from TopDemo

Here I defined an integer variable @top and set its value to 3. I then used that variable in the SELECT statement immediately following the TOP clause and enclosed it in parentheses. When the SQL Server 2005 engine executes this command, it evaluates the value of the variable @top to determine the number of rows to return. In my example, I return the following 3 records:

PaymentID   Amount                PayDate
----------- --------------------- -----------------------
1           12.23                 2006-01-02 00:00:00.000
2           73.18                 2006-01-10 00:00:00.000
3           92.20                 2006-01-21 00:00:00.000

Now if you have some code that determines the number of records you want to return, it is a simple matter of just getting that record number into a variable, then placing the variable into a TOP clause to constrain your returned record set to a specific number of records. Being able to dynamically control the number of rows returned in SQL Server 2005 means you no longer have to using dynamic SQL to accomplish this task.

Using the WITH TIES option

SQL Server 2005 also offers a WITH TIES option. This option allows you to specify that you want to return all the records that have the same value as last records returned by the TOP clause, based on the ORDER BY clause. Here is an example that demonstrates the WITH TIES option:

select top 3 with ties * from TopDemo 
order by PayDate

Here I am specifying that I want to return the top 3 records and any additional records that have the same PayDate as the third record. When I run this query this is my result set:

PaymentID   Amount                PayDate
----------- --------------------- -----------------------
1           12.23                 2006-01-02 00:00:00.000
2           73.18                 2006-01-10 00:00:00.000
3           92.20                 2006-01-21 00:00:00.000
4           62.20                 2006-01-21 00:00:00.000

As you can see, 4 records were returned. This is because the third and forth record have the same PayDate. So now, if you want to return all the records that have the same ORDER BY value, but still what to loosely constrain your returned set to a number of records, then the WITH TIES option should help.

Returning the Second Set of 4 Records

Some times, you don’t want to return the top set of so many records, but you want to return the next set of so many records. Here is an example that returns the second set of 4 records:

select top 4 * from TopDemo a 
 where PaymentID in (select top 8 PaymentID from TopDemo) 
    order by PaymentID desc

In order to return the second set of 4 records I used a correlated subquery (for more information on correlated subqueries click here). The correlated subquery returns “PaymentID” column values for the top 8 records. I then return only the first 4 matching TopDemo records that have one of the 8 PaymentID’s returned from the subquery. Since I requested my final set to be to be sorted in descending order by the PaymentID I will get the second set of 4 records. Here is the result set from the above query:

PaymentID   Amount                PayDate
----------- --------------------- -----------------------
8           34.87                 2006-02-12 00:00:00.000
7           34.23                 2006-02-07 00:00:00.000
6           10.99                 2006-02-04 00:00:00.000
5           12.45                 2006-01-23 00:00:00.000

As you can see, this set is ordered in descending sequence by PaymentID. If you really wanted the payments to be sorted sequentially by PaymentID, starting with the lowest PaymentID first, then you would have to run the following T-SQL statement, which sorts the final results set in ascending order by PaymentID:

select * from 
  (select top 4 * from TopDemo a 
   where PaymentID in (select top 8 PaymentID from TopDemo) 
   order by PaymentID desc) a
order by PaymentID

Here I just placed an alias named “a” around my original query’s result set, so I could re-sort my four records into PaymentID order.

How to get the BOTTOM 5 records

SQL Server does not supply a BOTTOM clause, but that doesn’t mean you can’t easily get the BOTTOM so many records. It is relatively simple to use the TOP clause to retrieve the bottom “x” number of records. Here is an example where I retrieved the bottom 5 records based on PayDate from my TopDemo table.

select top 5 * from TopDemo 
order by PayDate desc

In order to return the bottom five records all I had to do was specify the descending clause on my order by statement. This returned the bottom 5 records based on PayDate, although the final record set returned is ordered in descending order by PayDate. Below is the result set returned when I run the above query:

PaymentID   Amount                PayDate
----------- --------------------- -----------------------
27          65.90                 2006-09-25 00:00:00.000
26          45.77                 2006-09-12 00:00:00.000
25          23.56                 2006-08-22 00:00:00.000
24          84.23                 2006-08-08 00:00:00.000
23          44.55                 2006-08-04 00:00:00.000

Now normally you probably would want to return the bottom so many records and have those records be ordered based on a key in ascending order, instead of having them sort like my above results set where the PayDate is in descending order. This can be solved by wrapping another query around the above query, like so:

select * from (select top 5 * from TopDemo 
order by PayDate desc) a
order by PayDate

The above query will order my final result set in ascending PayDate order. Below is the result set of the above query:

PaymentID   Amount                PayDate
----------- --------------------- -----------------------
23          44.55                 2006-08-04 00:00:00.000
24          84.23                 2006-08-08 00:00:00.000
25          23.56                 2006-08-22 00:00:00.000
26          45.77                 2006-09-12 00:00:00.000
27          65.90                 2006-09-25 00:00:00.000

How to get the First Record of Each Month

For my last example, I will show you how you can use the TOP clause to return the first TopDemo payment record for each PayDate month. To accomplish this I will once again use a correlated subquery to identify the top payment record for each month, as the code below demonstrates:

select * from TopDemo a 
 where PayDate = (select top 1 PayDate from TopDemo 
    where month(PayDate) = Month(a.PayDate)
    order by PayDate) 
order by PayDate

Here the inner query (“select top 1.....”) a correlated subquery takes the month identification from the outer query to select the first PayDate for the that specific month. The first PayDate of the month is then used in the WHERE clause to return the first payment for each month. Below is the result set from the above correlated subquery:

PaymentID   Amount                PayDate
----------- --------------------- -----------------------
1           12.23                 2006-01-02 00:00:00.000
6           10.99                 2006-02-04 00:00:00.000
11          98.26                 2006-03-01 00:00:00.000
17          34.23                 2006-04-04 00:00:00.000
20          47.53                 2006-06-25 00:00:00.000
21          30.56                 2006-07-07 00:00:00.000
23          44.55                 2006-08-04 00:00:00.000
26          45.77                 2006-09-12 00:00:00.000

In my T-SQL code above, the inner query returns 8 different PayDate’s, one for the first payment date of each month that existed in my TopDemo table. These unique payment dates were then used in the outer query to return any TopDemo records that had one of those 8 payment dates.

Conclusion

The TOP clause is useful in helping you constrain your set to specific number of records. The new SQL Server 2005 features of the TOP clause now provides you with some alternatives for how to write your code for some specific data retrieve criteria. Being able to dynamically control the number of records returned is a big help when you don’t know exactly how many records you want to return. Now you can use the WITH TIES option as an easy method to return those records that have the same value as the last record in your TOP criteria.

» See All Articles by Columnist Gregory A. Larsen



MS SQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MS SQL Forum
Topic By Replies Updated
SQL 2005: SSIS: Error using SQL Server credentials poverty 3 August 17th, 07:43 AM
Need help changing table contents nkawtg 1 August 17th, 03:02 AM
SQL Server Memory confifuration bhosalenarayan 2 August 14th, 05:33 AM
SQL Server Primary Key and a Unique Key katty.jonh 2 July 25th, 10:36 AM