How do I lock on an InnoDB row that doesn't exist yet?

How can I guarantee that I can check whether a username exists in my database and then insert it as a new row, without another session intervening between the SELECT and INSERT statements?

It is almost as if I need to lock a row that doesn't exist. I want to lock the nonexistent row with the username "Foo", so that I can check whether it exists in the database AND insert it if it doesn't, without any interruption.

I know that LOCK IN SHARE MODE and FOR UPDATE exist, but as far as I know they only work on rows that already exist. I am not sure what to do in this situation.


Solution 1:

While the answer above is true in that a SELECT ... FOR UPDATE will prevent concurrent sessions / transactions from inserting the same record, that is not the full truth. I am currently fighting with the same problem and have come to the conclusion that the SELECT ... FOR UPDATE is nearly useless in that situation for the following reason:

A concurrent transaction / session can also do a SELECT ... FOR UPDATE on the very same record / index value, and MySQL will happily accept that immediately (non-blocking) and without throwing errors. Of course, as soon as the other session has done that, your session can no longer insert the record either. Neither your session nor the other one gets any information about the situation, so both think they can safely insert the record until they actually try to do so. Trying to insert then leads either to a deadlock or to a duplicate key error, depending on circumstances.
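The interleaving described above can be sketched with two sessions. This is only an illustration under assumptions: the table name user_table and the username column are borrowed from the other answer, the schema in the comment is hypothetical, and the default REPEATABLE READ isolation level is assumed.

```sql
-- Hypothetical schema, for illustration only:
-- CREATE TABLE user_table (
--   id       INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
--   username VARCHAR(64)  NOT NULL,
--   UNIQUE KEY (username)
-- ) ENGINE=InnoDB;

-- Session A
START TRANSACTION;
SELECT * FROM user_table WHERE username = 'Foo' FOR UPDATE;  -- empty set, returns immediately

-- Session B
START TRANSACTION;
SELECT * FROM user_table WHERE username = 'Foo' FOR UPDATE;  -- also empty, also not blocked

-- Session A
INSERT INTO user_table (username) VALUES ('Foo');  -- waits: B holds a lock on the same gap

-- Session B
INSERT INTO user_table (username) VALUES ('Foo');  -- A and B now wait on each other;
                                                   -- InnoDB rolls one of them back (error 1213, deadlock)
```

Both SELECT statements succeed because the locks they take on the gap do not conflict with each other; they only conflict with inserts into that gap.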

In other words, SELECT ... FOR UPDATE prevents other sessions from inserting the respective record(s), BUT even if you do a SELECT ... FOR UPDATE and the respective record is not found, chances are that you can't actually insert that record. IMHO, that renders the "first query, then insert" method useless.

The cause of the problem is that MySQL does not offer any method to really lock non-existent records. Two concurrent sessions / transactions can lock non-existent records "FOR UPDATE" at the same time, a thing which really should not be possible and which makes development significantly more difficult.

The only way to work around this seems to be using semaphore tables or locking the whole table when inserting. Please refer to the MySQL documentation for further reference on locking whole tables or using semaphore tables.
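A minimal sketch of the table-locking variant, assuming a table user_table with a username column (both names are illustrative). It follows the pattern the MySQL manual describes for combining LOCK TABLES with InnoDB: disable autocommit instead of issuing START TRANSACTION, and commit before unlocking.

```sql
SET autocommit = 0;              -- START TRANSACTION would release the table lock, so it is not used here
LOCK TABLES user_table WRITE;    -- other sessions block on this table until UNLOCK TABLES

SELECT COUNT(*) FROM user_table WHERE username = 'Foo';

-- Issued by the application only if the count above was 0:
INSERT INTO user_table (username) VALUES ('Foo');

COMMIT;                          -- commit first ...
UNLOCK TABLES;                   -- ... then release the table lock
```

The price is that every writer to the table is serialized, not just writers of the same username.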

Just my 2 cents ...

Solution 2:

If there is an index on username (which should be the case; if not, add one, preferably a UNIQUE one), then issuing a SELECT * FROM user_table WHERE username = 'foo' FOR UPDATE; will prevent any concurrent transaction from creating this user (as well as the "previous" and the "next" possible value in case of a non-unique index).

If no suitable index is found (to meet the WHERE condition), then efficient record locking is impossible and the whole table becomes locked*.

This lock will be held until the end of the transaction that issued the SELECT ... FOR UPDATE.
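Put together, the approach might look like the sketch below. The single username column in the INSERT is an assumption about the table layout. As Solution 1 points out, a second session can run the same SELECT concurrently, so the application should still be prepared for the INSERT to fail.

```sql
START TRANSACTION;

-- Locks the index record if it exists, otherwise the gap where 'foo' would go
SELECT * FROM user_table WHERE username = 'foo' FOR UPDATE;

-- Issued by the application only if the SELECT returned no row:
INSERT INTO user_table (username) VALUES ('foo');

COMMIT;  -- the lock taken by SELECT ... FOR UPDATE is released here
```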

Some very interesting information on this topic can be found in these manual pages.

* I say efficient, because in fact a record lock is actually a lock on index records. When no suitable index is found, only the default clustered index can be used, and it will be locked in full.

Solution 3:

Locking a nonexistent record does not work in MySQL. There are several bug reports about it:

  • SELECT ... FOR UPDATE doesn't do exclusive lock when table is empty
  • Add predicate locking to avoid deadlocks due to locking non-existent rows

One workaround is to use a mutex table, where an existing record is locked before the new record is inserted. For example, there are two tables: sellers and products. A seller has many products, but should not have any duplicate products. In this case, the sellers table can be used as the mutex table. Before a new product is inserted, a lock is taken on the seller’s record. With this additional query, it is guaranteed that only one thread can perform the action for a given seller at any given time. No duplicate. No deadlock.
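A rough sketch of the sellers/products workaround follows. The column names (id, seller_id, name) and the literal values are hypothetical; only the two table names come from the description above.

```sql
START TRANSACTION;

-- Lock the seller row, which already exists. A second session running the
-- same statement for seller 42 blocks here until this transaction ends.
SELECT id FROM sellers WHERE id = 42 FOR UPDATE;

-- Safe to check now: nobody else can be between check and insert for this seller
SELECT COUNT(*) FROM products WHERE seller_id = 42 AND name = 'Widget';

-- Issued by the application only if the count above was 0:
INSERT INTO products (seller_id, name) VALUES (42, 'Widget');

COMMIT;  -- releases the lock on the seller row
```

This only works if every code path that inserts products takes the seller lock first.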

Solution 4:

You are "normalizing"? That is, the table is a list of pairs of ids and names? And you are inserting a new "name" (and presumably want the id for use in other tables)?

Then have UNIQUE(name) and do

INSERT IGNORE INTO tbl (name) VALUES ($name);

That does not explain how to get the id just created, but you did not ask about that.

Be aware that the "new" id is allocated before discovering whether it is needed. So this could lead to rapidly increasing AUTO_INCREMENT values.
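As a minimal sketch, assuming a table shaped like the one below (the column types are guesses, not something stated in the Question), the INSERT IGNORE approach plus a follow-up lookup of the id could look like this:

```sql
-- Hypothetical table definition
CREATE TABLE tbl (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(64)  NOT NULL,
  UNIQUE KEY (name)
) ENGINE=InnoDB;

INSERT IGNORE INTO tbl (name) VALUES ('Foo');  -- name is new: the row is inserted
INSERT IGNORE INTO tbl (name) VALUES ('Foo');  -- name exists: no row inserted, a warning instead of an error

-- Works whether or not this session was the one that inserted the row
SELECT id FROM tbl WHERE name = 'Foo';
```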

See also

 INSERT ... ON DUPLICATE KEY UPDATE ...

and tricks to use with VALUES() and LAST_INSERT_ID(id). But, again, you have not stated the real purpose in the Question, so I don't want to unnecessarily branch into further details.
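One form of the LAST_INSERT_ID(id) trick, sketched under the assumption that tbl has an AUTO_INCREMENT primary key id and UNIQUE(name):

```sql
-- Hypothetical table definition
CREATE TABLE tbl (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(64)  NOT NULL,
  UNIQUE KEY (name)
) ENGINE=InnoDB;

-- If 'Foo' is new, the row is inserted and LAST_INSERT_ID() is its new id.
-- If 'Foo' already exists, the UPDATE branch runs and LAST_INSERT_ID(id)
-- makes the existing row's id the value that LAST_INSERT_ID() returns.
INSERT INTO tbl (name) VALUES ('Foo')
ON DUPLICATE KEY UPDATE id = LAST_INSERT_ID(id);

SELECT LAST_INSERT_ID();  -- id of the 'Foo' row in either case
```

The assignment id = LAST_INSERT_ID(id) leaves the stored id unchanged; its only purpose is the side effect on the session's LAST_INSERT_ID() value.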

Note: The above statements work regardless of the value of autocommit and of whether the statement is inside an explicit transaction.

For normalizing a batch of 'names' all at once, the two SQL statements given at the link below are quite efficient. The technique also avoids 'burning' ids and avoids any runtime errors. See http://mysql.rjweb.org/doc.php/staging_table#normalization