Friday, November 02, 2007

Getting the index or rank of rows in SQL Server 2005

A while ago I wrote a little post about changing a paging GridView in order to show certain number of pages and page index, but extract from the database only the data you needed. I was looking there for something in SQL Server like the MySql LIMIT, the T-Sql equivalent of TOP, but which accepts two parameters instead of only one.

I just found out that there is a way of doing this in Sql Server 2005 by employing functions that return the position of a row, depending on a certain ordering and partition, called Ranking Functions. From the little googling I did, it seems that Microsoft borrowed this technique from Oracle, but one never knows, maybe it was the other way around.

I will write a short example on ranking functions, then link to the appropiate articles. Given this SQL query:
select *,
Row_Number() OVER (ORDER BY name) as [Index],
Rank() OVER (ORDER BY name) as Rank,
Dense_Rank() OVER (ORDER BY name) as DenseRank,
NTile(3) OVER (ORDER BY name) as NTile,
Row_Number() OVER (PARTITION BY nr ORDER BY name) as IndexNr,
Rank() OVER (PARTITION BY nr ORDER BY name) as RankNr,
Dense_Rank() OVER (PARTITION BY nr ORDER BY name) as DenseRankNr,
NTile(3) OVER (PARTITION BY nr ORDER BY name) as NTileNr
from test
ORDER BY ID

you get the following result:
ID Name Nr Nr2 Index Rank DRank Ntile IndexNr RankNr DRankNr NtileNr
1 Mark 1 7 8 7 4 2 2 2 2 2
2 Mike 1 4 11 11 6 3 3 3 3 3
3 John 2 8 5 5 3 2 1 1 1 1
4 Charles 3 2 1 1 1 1 1 1 1 1
5 Ellen 3 6 4 3 2 1 2 2 2 2
6 Mary 4 1 9 9 5 3 2 2 2 2
7 Mark 4 17 7 7 4 2 1 1 1 1
8 Mike 2 41 12 11 6 3 2 2 2 2
9 John 6 83 6 5 3 2 1 1 1 1
10 Charles 1 72 2 1 1 1 1 1 1 1
11 Ellen 0 68 3 3 2 1 1 1 1 1
12 Mary 3 21 10 9 5 3 3 3 3 3


As you can see, Row_Number returns the row index, Rank returns the rank, Dense_Rank returns consecutive rank (no gaps between rank numbers) while NTile puts each row in a category using a given number of total categories. Partition by makes the operations work for each distinct value of a certain column, in this case nr. If the partition would have been on nr2, all the ranking values would have equaled 1, since there are only distinct values on the nr2 column. The Over clause can be used on more than just ranking functions; it also works on Aggregate functions. Yummy!

Links:
Ranking Functions (Transact-SQL)
OVER Clause (Transact-SQL)
Aggregate Functions (Transact-SQL)
This article also shows a similar method in Sql Server 2000 of which I knew nothing until today: Row_Number() function in SQL Server 2005
Returning Ranked Results with Microsoft SQL Server 2005

0 comments: