There are times when you don’t what to return a complete set of records. An example where you might need this requirement would be in an inventory system when you want to produce a pick list where the stock you want to ship has the oldest manufacture dates. When you have this kind of requirement to only select the TOP X number of items Transact SQL (TSQL) has the TOP clause to meet your needs. In this article I will be exploring how to select a subset of data from a record set.
Syntax of the TOP Clause
Here is the syntax for the TOP clause:
TOP (expression) [PERCENT] [WITH TIES]
Where:
(expression) – The expression parameter is a required field. With this parameter you provide the number of records you want returned. This number can be an integer or a float data type depending on whether the PERCENT parameter is included. This parameter is a BIGINT if the PERCENT parameter is not present or FLOAT if the PERCENT parameter is included.
[PERCENT] – The PERCENT option is optional. When PERCENT is specified this means the (expression) parameter identifies the percentage of rows to return, otherwise the (expression) parameter identifies the number of rows returned.
[WITH TIES] – The WITH TIES options is used to identify how SQL Server handles returning rows that have the same column values as the last row. This options is only valid when the SELECT command contains an ORDER BY clause. When the WITH TIES option is specified any rows that have the same column values as the last rows will also be returned. Therefore when you specify the WITH TIES option you might get back more rows than the (expression) value.
The TOP clause supports being used with SELECT, INSERT, UPDATE and DELETE statements. In this article I will be covering using the TOP clause with a SELECT statement. In a follow up article I will explore how to use the TOP clause with the INSERT, UPDATE, and DELETE clause.
Sample Data
Before I can show you any examples I must first create some sample data on which to run the TOP clause against. The code below will create my sample data.
USE tempdb; GO CREATE TABLE dbo.Sales ( SalesID int, SalesDate date, Region char(10), SalesAmount money); INSERT INTO dbo.Sales VALUES (1,'12-01-2014','NorthWest',102.99), (2,'12-01-2014','SouthWest',41.95), (3,'12-01-2014','NorthEast',59.55), (4,'12-01-2014','NorthEast',20.99), (5,'12-01-2014','NorthWest',400.78), (6,'12-01-2014','SouthWest',1012.31), (7,'12-01-2014','SouthWest',10.99), (8,'12-01-2014','NorthEast',125.47), (9,'12-01-2014','SouthWest',129.56), (10,'12-01-2014','SouthEast',32.43), (11,'12-01-2014','SouthEast',82.10), (12,'12-01-2014','NorthWest',111.92);
If you want to run the examples below then you can use the script above to create the same sample data I will be using for my examples.
Simple TOP Statement
For the first example let me select the top 2 rows from my sample data. The code below will return the TOP 2 rows of data from my sample data:
SELECT TOP(2) * FROM dbo.Sales;
When I run this code it returns the following results:
SalesID SalesDate Region SalesAmount ----------- ---------- ---------- --------------------- 1 2014-12-01 NorthWest 102.99 2 2014-12-01 SouthWest 41.95
As you can see it selected the first two Sales table records. Because my command above doesn’t have an ORDER BY clause SQL Server doesn’t guarantee the two rows returned are from a sorted set. If you really want to bring back two rows from a sorted set you need to include the ORDER BY clause as I have done with the SELECT query below:
SELECT TOP(2) * FROM dbo.Sales ORDER BY SalesAmount DESC;
When I run this code the following results are returned:
SalesID SalesDate Region SalesAmount ----------- ---------- ---------- --------------------- 6 2014-12-01 SouthWest 1012.31 5 2014-12-01 NorthWest 400.78
Here you can see that I brought back two rows from the Sales table. But this time because of the ORDER BY clause they are the top two rows based on the value in the SalesAmount column.
Using a Variable in the TOP Statement
There are times when you want to have the number of rows returned to be dynamic. When you have this requirement the TOP clause has you covered. To dynamically return the top number of rows you just need to use an expression in the TOP clause. The expression just needs to return the number of rows that you dynamically want to return. The example below shows you one way to dynamically return a specific number of rows using the top clause:
DECLARE @Top int; SET @Top = 2; SELECT TOP(@Top) * FROM dbo.Sales ORDER BY SalesAmount DESC;
In the code above I first declared an integer variable named @Top. I then set the value of @Top to a value of 2. Then I ran the SELECT statement that uses the @Top value to identify the number of rows to return in the TOP clause. When I run this code it produces the same set of records as the prior example.
You can also use an expression in the top clause as I have done in the following chunk of code:
SELECT TOP(MONTH(getdate())) * FROM dbo.Sales ORDER BY SalesAmount DESC;
This code uses the expression “MONTH(getdate())” to determine dynamically the value for the TOP clause. This code will return the number of rows based on what month it is. So in January this code will only return one row, in February it will return two rows, in March it will return three rows, etc. Therefore this code could return anywhere from 1 to 12 rows depending on which month the SELECT statement is run.
Using the Percentage value with the TOP Clause
Are there times when you want to return some percentage of rows within a table? When you have this requirement you can use the PERCENT option. To demonstrate this review the code below:
SELECT TOP(13) PERCENT * FROM dbo.Sales ORDER BY SalesAmount DESC;
When I run this code it returns the following rows:
SalesID SalesDate Region SalesAmount ----------- ---------- ---------- --------------------- 6 2014-12-01 SouthWest 1012.31 5 2014-12-01 NorthWest 400.78
In this example, I specified that I wanted to return 13 percent of the row. Because I added the PERCENT clause to my SELECT TOP query this tells SQL Server the value provide is a percentage number instead of an integer number. In the example above 13 percent of 12 rows (total number of rows in my table) is 1.56. SQL Server returns two rows from this query because it took the calculated percentage value, 1.56 in this example, and rounded it up to 2. If I had decided I only wanted 12 percent of the rows, which is 1.44 rows, SQL Server would have rounded down and returned only one row.
Understanding the TIES Option
What does SQL Server do when you want the TOP three rows from an ordered set, but the third and fourth row of that set have the same value for the columns in the ORDER BY clause? If you use the TOP clause with the no additional options it will only return three rows. But what if you want return that fourth row because it has the same value as the third row for the ordered set? If you want to return those rows that have the same value as the last row then you can use the TIES option as I have done below:
SELECT TOP (3) WITH TIES * FROM dbo.Sales ORDER BY Region DESC;
In this query I used the WITH TIES option to bring back any rows that have the same Region value as the third row. When I run this query the following rows are returned:
SalesID SalesDate Region SalesAmount ----------- ---------- ---------- --------------------- 2 2014-12-01 SouthWest 41.95 9 2014-12-01 SouthWest 129.56 6 2014-12-01 SouthWest 1012.31 7 2014-12-01 SouthWest 10.99
By using the WITH TIES option you can see that I returned four rows instead of just three. In fact I could have said TOP (1) WITH TIES and my query would have returned the same results. This is because the first row has the same value for Region as the rest of the row. One thing to note is when you use the WITH TIES option the query will require an ORDER BY clause. If you forget the ORDER BY clause when you use the PERCENT option the query will get an error.
Getting a Subset of Records with the TOP Clause
There are times when your application requires you to return a specific number of rows. Using the TOP clause provides a simple way to return a subset of records based on the order in which the rows are returned. Next time you have a need to only return the top so many rows from a set consider if the TOP clause will be a simple way to accomplish this.