My templog.ldf is huge (45gb), What if anything should I do?

Check your reporting queries. Do you have any that have DISTINCT in them? Do any of them have a cartesian joins?

Do any of the reporting queries access linked servers as members of a join? If so this can cause tempdb log and database to grow.

When the reports are running in the morning do any of them crash?


We had a similar issue, after having raised PSS call with Microsoft and in-depth investigation of the issue we zoned into the following possible cause and resolution.

Cause:

The probable cause for the symptoms are due to disks/lun's on which user databases are placed having severe I/O response issues; this causes the automatic checkpoint on user databases to take very long to finish.

Now, checkpoint on tempdb occurs only when the tempdb log becomes 70% full and also it has a lower priority than user database checkpoints. So, effectively when automatic checkpoint on user database/s is issued and is trying to complete, due to heavy tempdb usage causes the tempdb log file to fill up quickly; at 70% log usage the tempdb checkpoint occurs but is queued behind the user database checkpoint.

In the time it takes for the user database checkpoint to finish the tempdb log file keeps getting filled up and if autogrow is set the log file grows when it requires more space. This is the reason the log file keeps growing.

In summary, the most possible root cause for the symptoms you describe are due to poor I/O response from the disks/lun's for your user and/or tempdb database/log files.

Solution:

We worked around the issue while we sorted out the I/O subsystem by setting up an alert which fired when the tempdb log file became 75% full and in response executed a job which forced a manual "CHECKPOINT"(which takes precedence over automatic system checkpoints), clearing out the tempdb log preventing it from auto-growing indefinitely. It is still a good idea to leave the log file on auto grow for any other eventuality. Also, I strongly recommend you to consider reducing the tempdb log file size to something meaningful as per your environment after you put the fix in.

Hope this helps.


What is the Recovery Model set to on the temp db? If it's not set to Simple, then set it to Simple. This should keep it from growing. If it's already set to Simple then I'd say there's an underlying problem that needs to be addressed and any attempt to shrink the file is merely treating the symptoms of the problem and not the root cause.


I've spent the last few hours reading and making notes on this

http://technet.microsoft.com/en-gb/library/cc966545.aspx

There's a lot of detail in there and suggestions for trouble shooting issues. It seems that unless your tempdb is expanding and never stops growing it's probably just taking up the amount of space it needs and should have been configured to be that size initially. There is a section on estimating the space required for your tempdb as well as tracking down what might be taking up space in tempdb. As a result of this the first thing I'm going to do is move tempdb to a larger drive and see what happens from there.

There is a section titled 'Space required for tempdb logging' which indicates which features use the log, there is another earlier section which details the superset of features which use tempdb.

The section titled 'Monitoring I/O' has a few ideas on performance counters to watch, a quick look at my server put these in you've-probably-got-an-io-bottleneck territory. I'll monitor these for a while and see how things pan out. The tempdb log file was also actually at less than 50% utilisation which fits with the idea it expanded under load this morning and has retained that space since.

I'm going ahead on the basis that the size it's grown to is the size it needs to be, monitor this size in future and make sure there's room for growth on whatever drive it's on. As suggested by some here I'll look into what is executing as the temp log expands and see if anything can be tweaked in there. I'll also keep an eye on those io performance counters to see if something needs dealing with.

There was one more additional interesting section titled 'Upgrading to SQL Server 2005' which indicates that tempdb is used for more things in 2005 than 2000 (both new features, and existing features which previously didn't use tempdb). I've only recently upgraded to 2005 so this could be part of the reason this has suddenly become an issue. I don't remember seeing this anywhere else with reference to upgrading to 2005 though, which is a bit of a pain.