Violation of UNIQUE KEY constraint on INSERT WHERE COUNT(*) = 0 on SQL Server 2005
I'm inserting into a SQL database from multiple processes. It's likely that the processes will sometimes try to insert duplicate data into the table. I've tried to write the query in a way that will handle the duplicates but I still get:
System.Data.SqlClient.SqlException: Violation of UNIQUE KEY constraint 'UK1_MyTable'. Cannot insert duplicate key in object 'dbo.MyTable'.
The statement has been terminated.
My query looks something like:
INSERT INTO MyTable (FieldA, FieldB, FieldC)
SELECT FieldA='AValue', FieldB='BValue', FieldC='CValue'
WHERE (SELECT COUNT(*) FROM MyTable WHERE FieldA='AValue' AND FieldB='BValue' AND FieldC='CValue' ) = 0
The constraint 'UK1_MyConstraint' says that in MyTable, the combination of the 3 fields should be unique.
My questions:
- Why doesn't this work?
- What modification do I need to make so there is no chance of an exception due to the constraint violation?
Note that I'm aware that there are other approaches to solving the original problem of "INSERT if not exists" such as (in summary):
- Using TRY CATCH
- IF NOT EXIST INSERT (inside a transaction with serializable isolation)
Should I be using one of the approaches?
Edit 1 SQL for Creating Table:
CREATE TABLE [dbo].[MyTable](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[FieldA] [bigint] NOT NULL,
[FieldB] [int] NOT NULL,
[FieldC] [char](3) NULL,
[FieldD] [float] NULL,
CONSTRAINT [PK_MyTable] PRIMARY KEY NONCLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON),
CONSTRAINT [UK1_MyTable] UNIQUE NONCLUSTERED
(
[FieldA] ASC,
[FieldB] ASC,
[FieldC] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
Edit 2 Decision:
Just to update this - I've decided to use the "JFDI" implementation suggested in the linked question (link). Although I'm still curious as to why the original implementation doesn't work.
Solution 1:
Why doesn't this work?
I believe the default behaviour of SQL Server is to release shared locks as soon as they are no longer needed. Your sub-query will result in a short-lived shared (S) lock on the table, which will be released as soon as the sub-query completes.
At this point there is nothing to prevent a concurrent transaction from inserting the very row you just verified was not present.
What modification do I need to make so there is no chance of an exception due to the constraint violation?
Adding the HOLDLOCK
hint to your sub-query will instruct SQL Server to hold on to the lock until the transaction is completed. (In your case, this is an implicit transaction.) The HOLDLOCK
hint is equivalent to the SERIALIZABLE
hint, which itself is equivalent to the serializable transaction isolation level which you refer in your list of "other approaches".
The HOLDLOCK
hint alone would be sufficient to retain the S lock and prevent a concurrent transaction from inserting the row you are guarding against. However, you will likely find your unique key violation error replaced by deadlocks, occurring at the same frequency.
If you're retaining only an S lock on the table, consider a race between two concurrent attempts to insert the same row, proceeding in lockstep -- both succeed in acquiring an S lock on the table, but neither can succeed in acquiring the Exclusive (X) lock required to execute the insert.
Luckily there is another lock type for this exact scenario, called the Update (U) lock. The U lock is identical to an S lock with the following difference: whilst multiple S locks can be held simultaneously on the same resource, only one U lock may be held at a time. (Said another way, whilst S locks are compatible with each other (i.e. can coexist without conflict), U locks are not compatible with each other, but can coexist alongside S locks; and further along the spectrum, Exclusive (X) locks are not compatible with either S or U locks)
You can upgrade the implicit S lock on your sub-query to a U lock using the UPDLOCK
hint.
Two concurrent attempts to insert the same row in the table will now be serialized at the initial select statement, since this acquires (and holds) a U lock, which is not compatible with another U lock from the concurrent insertion attempt.
NULL values
A separate problem may arise from the fact that FieldC allows NULL values.
If ANSI_NULLS
is on (default) then the equality check FieldC=NULL
would return false, even in the case where FieldC is NULL (you must use the IS NULL
operator to check for null when ANSI_NULLS
is on). Since FieldC is nullable, your duplicate check will not work when inserting a NULL value.
To correctly deal with nulls you will need to modify your EXISTS sub-query to use the IS NULL
operator rather than =
when a value of NULL is being inserted. (Or you can change the table to disallow NULLs in all the concerned columns.)
SQL Server Books Online References
- Locking Hints
- Lock Compatibility Matrix
- ANSI_NULLS
Solution 2:
RE: "I'm still curious as to why the original implementation doesn't work."
Why would it work?
What is there to prevent two concurrent transactions being interleaved as follows?
Tran A Tran B
---------------------------------------------
SELECT COUNT(*)...
SELECT COUNT(*)...
INSERT ....
INSERT... (duplicate key violation).
The only time conflicting locks will be taken is at the Insert
stage.
To see this in SQL Profiler
Create Table Script
create table MyTable
(
FieldA int NOT NULL,
FieldB int NOT NULL,
FieldC int NOT NULL
)
create unique nonclustered index ix on MyTable(FieldA, FieldB, FieldC)
Then paste the below into two different SSMS windows. Take a note of the spids of the connections (x and y) and set up a SQL Profiler Trace capturing locking events and user error messages. Apply filters of spid=x or y and severity = 0 and then execute both scripts.
Insert Script
DECLARE @FieldA INT, @FieldB INT, @FieldC INT
SET NOCOUNT ON
SET CONTEXT_INFO 0x696E736572742074657374
BEGIN TRY
WHILE 1=1
BEGIN
SET @FieldA=( (CAST(GETDATE() AS FLOAT) - FLOOR(CAST(GETDATE() AS FLOAT))) * 24 * 60 * 60 * 300)
SET @FieldB = @FieldA
SET @FieldC = @FieldA
RAISERROR('beginning insert',0,1) WITH NOWAIT
INSERT INTO MyTable (FieldA, FieldB, FieldC)
SELECT FieldA=@FieldA, FieldB=@FieldB, FieldC=@FieldC
WHERE (SELECT COUNT(*) FROM MyTable WHERE FieldA=@FieldA AND FieldB=@FieldB AND FieldC=@FieldC ) = 0
END
END TRY
BEGIN CATCH
DECLARE @message VARCHAR(500)
SELECT @message = 'in catch block ' + ERROR_MESSAGE()
RAISERROR(@message,0,1) WITH NOWAIT
DECLARE @killspid VARCHAR(10)
SELECT @killspid = 'kill ' +CAST(SPID AS VARCHAR(4)) FROM sys.sysprocesses WHERE SPID!=@@SPID AND CONTEXT_INFO = (SELECT CONTEXT_INFO FROM sys.sysprocesses WHERE SPID=@@SPID)
EXEC ( @killspid )
END CATCH
Solution 3:
Off the top of my head, I have a feeling one or more of those columns accepts nulls. I would like to see the create statement for the table including the constraint.