So I'm working on improving the performance of the next version of Channel 9. For those of you not familiar, our team created a new platform for our department to build community sites (blogs, forums, videos, tagging, etc). You can see an example of the platform running on 10. Currently, there's not a whole lot of data in the 10 database so we've been able to get away with murder from a performance standpoint. Now that I've imported all the data from Community Server (Channel 9) to 10 (EvNet Platform), we've been finding that 250,000 rows doesn't perform very well with our current code.
The architecture of our platform is pretty simple. There's an Entry table that houses Entries, Threads & Comments and the relationships between them. When you're viewing a blog, you're looking at this table. When you're looking at a forum, you're looking at this table. We differentiate everything in our system by tags. There's one for each forum, each blog, and content tags to describe what's in each entry.
Currently, to do paging we've been using the ROW_NUMBER OVER function. It's a new feature of SQL Server 2005 and is very simple and easy to use. Unfortunately, it doesn't work very well with a lot of rows (250,000 for example). I did some searching and came across this gem of an article. It uses an interesting trick to use SET ROWCOUNT to get the first record to start with in a paged result set, then you just run the query again and set the row count again to the number you want where the values are greater than the first row from the starting point of the paged result set and man is it snappy. Do check it out if you're having troubles with performance of ROW_NUMBER() OVER.
Oh yah, I forgot to mention how much of an improvement this change made. Before the change queries were taking about 8 seconds on average. After the change, the queries now take less than 1 second. Depending on if SQL has decided to cache the results or not, it's pretty much instantaneous.