MS SQL Server slows down over time?

Have any of you experienced the following, and have you found a solution:

A large part of our website's back-end is MS SQL Server 2005. Every week or two weeks the site begins running slower - and I see queries taking longer and longer to complete in SQL. I have a query that I like to use:

USE master
select text,wait_time,blocking_session_id AS "Block",
percent_complete, * from sys.dm_exec_requests 
CROSS APPLY sys.dm_exec_sql_text(sql_handle)  AS s2 order by start_time asc

Which is fairly useful... it gives a snapshot of everything that's running right at that moment against your SQL server. What's nice is that even if your CPU is pegged at 100% for some reason and Activity Monitor is refusing to load (I'm sure some of you have been there) this query still returns and you can see what query is killing your DB.

When I run this, or Activity Monitor during the times that SQL has begun to slow down I don't see any specific queries causing the issue - they are ALL running slower across the board. If I restart the MS SQL Service then everything is fine, it speeds right up - for a week or two until it happens again.

Nothing that I can think of has changed, but this just started a few months ago... Ideas?

--Added

Please note that when this database slowdown happens it doesn't matter if we are getting 100K page views an hour (busier time of day) or 10K page views an hour (slow time) the queries all take a longer time to complete than normal. The server isn't really under stress - the CPU isn't high, the disk usage doesn't seem to be out of control... it feels like index fragmentation or something of the sort but that doesn't seem to be the case.

As far as pasting results of the query I pasted above I really can't do that. The Query above lists the login of the user performing the task, the entire query, etc etc.. and I'd really not like to hand out the names of my databases, tables, columns and the logins online :)... I can tell you that the queries running at that time are normal, standard queries for our site that run all the time, nothing out of the norm.

--March 24th

It's been about two weeks since the last reboot. I made several changes: I found a few queries where we were making heavy use of temp tables that were totally unnecessary and had our developers change how they were doing it. I adjusted the size of some of the constantly (slowly but surely) growing databases to an intelligent size for their growth. I adjusted the autogrowth settings for everything as well to be more intelligent (they were ALL set to 1MB growth). Lastly I cleaned up MSDB a bit. We do log shipping and really didn't need to keep years and years worth of backup points, I've written some scripts that keep this to only a few months. I'll keep updating this thread, as it's too early to tell if the problem is solved yet.


We found it. It turned out that it was actually a web server that had a problem with one of it's app pools. It would get stuck running the same set of queries over and over (which happened to deal in temp tables). It would just loop and loop and eventually cause the SQL server to be sad. Once this offending machine / app pool was found and 'put down' everything was resolved.


You have to ask yourself, what happens at a SQL service restart? Lots of stuff, but two relevant points come to mind:

1) SQL memory is freed.

Its possible (not sure how likely), that if your MaxMemory setting is set too high, that the SQL service grows to use all available memory, and Windows starts to swap important stuff out to the swap file. Check to make sure that MaxMemory is set to a reasonable value, leaving enough additional memory for whatever else needs to run on that box (is it a dedicated SQL server? Or is it also the app server?)

2) TempDB is rebuilt from the default sizes.

Check your default tempdb file sizes, especially the default size and growth interval of the TempDB Log file. If the growth interval is set too LOW, then the log can build up some incredible internal fragmentation, which can dramatically slow down normal usage. See these two excellent blog articles by Kimberly Tripp.


Do you make heavy use of temporary tables or cursors? Check any cursors are being closed and deallocated correctly. Also watch out for linked servers - we've got to use a buggy driver for an old linked Informix server and it periodically means we have to reboot the server.