Diagnosing Deadlocks in SQL Server 2005

We're seeing some pernicious, but rare, deadlock conditions in the Stack Overflow SQL Server 2005 database.

I attached the profiler, set up a trace profile using this excellent article on troubleshooting deadlocks, and captured a bunch of examples. The weird thing is that the deadlocking write is always the same:

UPDATE [dbo].[Posts]
SET [AnswerCount] = @p1, [LastActivityDate] = @p2, [LastActivityUserId] = @p3
WHERE [Id] = @p0

The other deadlocking statement varies, but it's usually some kind of trivial, simple read of the Posts table. This one always gets killed in the deadlock. Here's an example:

SELECT
[t0].[Id], [t0].[PostTypeId], [t0].[Score], [t0].[Views], [t0].[AnswerCount], 
[t0].[AcceptedAnswerId], [t0].[IsLocked], [t0].[IsLockedEdit], [t0].[ParentId], 
[t0].[CurrentRevisionId], [t0].[FirstRevisionId], [t0].[LockedReason],
[t0].[LastActivityDate], [t0].[LastActivityUserId]
FROM [dbo].[Posts] AS [t0]
WHERE [t0].[ParentId] = @p0

To be perfectly clear, we are not seeing write / write deadlocks, but read / write.

We have a mixture of LINQ and parameterized SQL queries at the moment. We have added with (nolock) to all the SQL queries. This may have helped some. We also had a single (very) poorly-written badge query that I fixed yesterday, which was taking upwards of 20 seconds to run every time, and was running every minute on top of that. I was hoping this was the source of some of the locking problems!

Unfortunately, I got another deadlock error about 2 hours ago. Same exact symptoms, same exact culprit write.

The truly strange thing is that the locking write SQL statement you see above is part of a very specific code path. It's only executed when a new answer is added to a question -- it updates the parent question with the new answer count and last date/user. This is, obviously, not that common relative to the massive number of reads we are doing! As far as I can tell, we're not doing huge numbers of writes anywhere in the app.

I realize that NOLOCK is sort of a giant hammer, but most of the queries we run here don't need to be that accurate. Will you care if your user profile is a few seconds out of date?

Using NOLOCK with LINQ is a bit more difficult, as Scott Hanselman discusses here.

We are flirting with the idea of using

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

on the base database context so that all our LINQ queries have this set. Without that, we'd have to wrap every LINQ call we make (well, the simple reading ones, which is the vast majority of them) in a 3-4 line transaction code block, which is ugly.
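
For reference, the per-query wrapper looks something like this -- a sketch using System.Transactions, where the query shape and names like questionId are just illustrative:

// Sketch: force READ UNCOMMITTED for a single LINQ query by wrapping
// it in a System.Transactions scope (requires a reference to
// System.Transactions). The connection auto-enlists in the scope.
var options = new TransactionOptions
{
    IsolationLevel = System.Transactions.IsolationLevel.ReadUncommitted
};
using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
{
    var answers = DB.Posts
        .Where(p => p.ParentId == questionId)
        .ToList();
    scope.Complete(); // read-only, so completing the scope is a formality
}

Multiply that across every read in the app, and you can see why we'd rather set the isolation level once on the context.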

I guess I'm a little frustrated that trivial reads in SQL 2005 can deadlock on writes. I could see write/write deadlocks being a huge issue, but reads? We're not running a banking site here, we don't need perfect accuracy every time.

Ideas? Thoughts?


Are you instantiating a new LINQ to SQL DataContext object for every operation or are you perhaps sharing the same static context for all your calls?

Jeremy, we are sharing one static DataContext in the base Controller for the most part:

private DBContext _db;
/// <summary>
/// Gets the DataContext to be used by a Request's controllers.
/// </summary>
public DBContext DB
{
    get
    {
        if (_db == null)
        {
            _db = new DBContext() { SessionName = GetType().Name };
            //_db.ExecuteCommand("SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED");
        }
        return _db;
    }
}

Do you recommend we create a new context for every Controller, or per Page, or ... more often?


According to MSDN:

http://msdn.microsoft.com/en-us/library/ms191242.aspx

When either the READ COMMITTED SNAPSHOT or ALLOW SNAPSHOT ISOLATION database options are ON, logical copies (versions) are maintained for all data modifications performed in the database. Every time a row is modified by a specific transaction, the instance of the Database Engine stores a version of the previously committed image of the row in tempdb. Each version is marked with the transaction sequence number of the transaction that made the change. The versions of modified rows are chained using a linked list. The newest row value is always stored in the current database and chained to the versioned rows stored in tempdb.

For short-running transactions, a version of a modified row may get cached in the buffer pool without getting written into the disk files of the tempdb database. If the need for the versioned row is short-lived, it will simply get dropped from the buffer pool and may not necessarily incur I/O overhead.

There appears to be a slight performance penalty from the extra versioning overhead, but it may be negligible. We should test to make sure.

Try setting this option and REMOVE all NOLOCKs from code queries unless truly necessary. NOLOCK hints, or global methods in the database context handler to override transaction isolation levels, are Band-Aids for the problem. NOLOCK will mask fundamental issues with our data layer and possibly lead to selecting unreliable data; automatic select/update row versioning appears to be the real solution.

ALTER DATABASE [StackOverflow.Beta] SET READ_COMMITTED_SNAPSHOT ON

NOLOCK and READ UNCOMMITTED are a slippery slope. You should never use them unless you understand why the deadlock is happening first. It would worry me that you say, "We have added with (nolock) to all the SQL queries". Needing to add WITH NOLOCK everywhere is a sure sign that you have problems in your data layer.

The update statement itself looks a bit problematic. Do you determine the count earlier in the transaction, or just pull it from an object? AnswerCount = AnswerCount + 1 when an answer is added is probably a better way to handle this. Then you don't need a transaction to get the correct count, and you don't have to worry about the concurrency issue you are potentially exposing yourself to.
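
For example, something along these lines through LINQ to SQL's ExecuteCommand -- a sketch, where db, userId, and questionId are illustrative names:

// Sketch: let the database do the increment atomically instead of
// writing back a count computed earlier in application code.
// ExecuteCommand turns the {n} placeholders into SQL parameters.
db.ExecuteCommand(
    @"UPDATE dbo.Posts
      SET AnswerCount = AnswerCount + 1,
          LastActivityDate = {0},
          LastActivityUserId = {1}
      WHERE Id = {2}",
    DateTime.UtcNow, userId, questionId);

Because the increment happens inside the single UPDATE statement, there is no read-then-write window for another transaction to squeeze into.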

One easy way to get around this type of deadlock issue without a lot of work and without enabling dirty reads is to use "Snapshot Isolation Mode" (new in SQL 2005), which always gives you a clean read of the last committed version of the data. You can also catch and retry deadlocked statements fairly easily if you want to handle them gracefully.
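
The retry part can be as small as this sketch; RetryOnDeadlock is a made-up helper name, and 1205 is SQL Server's error number for a transaction chosen as a deadlock victim:

// Sketch of a deadlock retry wrapper. Uses System,
// System.Data.SqlClient, and System.Threading.
static T RetryOnDeadlock<T>(Func<T> action)
{
    const int maxAttempts = 3;
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            return action();
        }
        catch (SqlException ex)
        {
            // Rethrow anything that isn't a deadlock victim (1205),
            // or give up once we're out of attempts.
            if (ex.Number != 1205 || attempt == maxAttempts)
                throw;
            Thread.Sleep(100 * attempt); // brief backoff before retrying
        }
    }
}

You'd then wrap the vulnerable reads, e.g. RetryOnDeadlock(() => db.Posts.Where(p => p.ParentId == id).ToList()).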


The OP asked why this problem occurs. This post hopes to answer that, while leaving possible solutions to be worked out by others.

This is probably an index-related issue. For example, let's say the table Posts has a non-clustered index X which contains ParentId and one (or more) of the fields being updated (AnswerCount, LastActivityDate, LastActivityUserId).

A deadlock would occur if the SELECT command takes a shared read lock on index X to search by ParentId, and then needs a shared read lock on the clustered index to fetch the remaining columns, while the UPDATE command takes an exclusive write lock on the clustered index and then needs an exclusive write lock on index X to update it.

You now have a situation where transaction A has locked X and is trying to get Y (the clustered index), whereas transaction B has locked Y and is trying to get X.

Of course, we'll need the OP to update his posting with more information about which indexes are in play to confirm whether this is actually the cause.


I'm pretty uncomfortable about this question and the attendant answers. There's a lot of "try this magic dust! No, that magic dust!"

I can't see anywhere that you've analyzed the locks that are taken and determined exactly which types of locks are deadlocking.

All you've indicated is that some locks occur -- not what is deadlocking.

In SQL 2005 you can get more info about what locks are being taken out by using:

DBCC TRACEON (1222, -1)

so that when the deadlock occurs, the full deadlock chain is written to the SQL Server error log and you'll have much better diagnostics.


To follow up on my earlier question about sharing one static DataContext versus creating a new one per operation: I originally tried the shared approach, and from what I remember, it caused unwanted locking in the DB. I now create a new context for every atomic operation.
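
Concretely, roughly this pattern -- a sketch against the DBContext type from the question, with an illustrative query:

// Sketch: one short-lived context per atomic operation, disposed as
// soon as the unit of work completes.
using (var db = new DBContext())
{
    var answers = db.Posts
        .Where(p => p.ParentId == questionId)
        .ToList();
}

DataContexts are designed to be cheap to construct, so there's little penalty for creating one per operation, and disposing them promptly keeps their connections and tracked objects from hanging around.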