Inserting and selecting UUIDs as binary(16)

I don't understand why

SELECT UUID();

Returns something like:

3f06af63-a93c-11e4-9797-00505690773f

But if I insert it into a binary(16) field (the UUID() function) with for instance a BEFORE INSERT trigger and run a select, it returns something like:

0782ef48-a439-11

Note that these two UUIDs are not the same data.

I realize binary and an UUID string doesn't look identical, but shouldn't the selected data at least be just as long? Otherwise how can it possibly be equally likely to be unique?

Is it better to store it as char(36)? I just need it to be unique to prevent duplicate inserts. It is never selected or used for joins.

EDIT:

before trigger would be like:

BEGIN

if NEW.UUID IS NULL THEN

NEW.UUID = UUID();

END IF

END

Solution 1:

So, as a response to comments. The correct way to store a 36-char UUID as binary(16) is to perform the insert in a manner like:

INSERT INTO sometable (UUID) VALUES
       (UNHEX(REPLACE("3f06af63-a93c-11e4-9797-00505690773f", "-","")))

UNHEX because an UUID is already a hexed value. We trim (REPLACE) the dashes in the statement to bring the length down to 32 characters (our 16 bytes represented as HEX). You can do this at any point before storing it, obviously, so it doesn't have to be handled by the database.

You may retrieve the UUID like this:

SELECT HEX(UUID) FROM sometable;

Just in case someone comes across this thread and is unsure how this works.

And remember: If you're selecting a row using the UUID, use UNHEX() on the condition:

SELECT * FROM sometable WHERE UUID = UNHEX('3f06af63a93c11e4979700505690773f');

or literal notation (as mentioned by Alexis Wilke):

SELECT * FROM sometable WHERE UUID = 0x3f06af63a93c11e4979700505690773f;

And NOT HEX()on the column:

SELECT * FROM sometable WHERE HEX(UUID) = '3f06af63a93c11e4979700505690773f';

The last solution, while it works, requires that MySQL HEXes all UUIDs before it can determine which rows match. It's very inefficient.

Edit: If you're using MySQL 8 you should have a look at the UUID functions as mentioned in SlyDave's answer. This answer is still correct, but it doesn't optimise the UUID indexes which can be done natively using those functions. If you're on < MySQL 8 you can implement Devon's polyfill, which provides identical functionality on previous versions of MySQL.

Solution 2:

As of MySQL 8 you can use two new UUID functions:

  • BIN_TO_UUID

    SELECT BIN_TO_UUID(uuid, true) AS uuid FROM foo;
    -- 3f06af63-a93c-11e4-9797-00505690773f
    
  • UUID_TO_BIN

    INSERT INTO foo (uuid) VALUES (UUID_TO_BIN('3f06af63-a93c-11e4-9797-00505690773f', true));
    

This method also supports rearranging the time component of the uuid to enhance indexing performance (by ordering it chronologically), simply set the second argument to true - this only works for UUID1.

If you are using the true on UUID_TO_BIN flag for indexing performance (recommended), you must also set it on BIN_TO_UUID otherwise it won't convert back properly.

See the documentation for further details.

  • https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid-to-bin
  • http://mysqlserverteam.com/mysql-8-0-uuid-support/

Solution 3:

Polyfill for BIN_TO_UUID and UUID_TO_BIN for MySQL 5 with the swap_flag parameter.

DELIMITER $$

CREATE FUNCTION BIN_TO_UUID(b BINARY(16), f BOOLEAN)
RETURNS CHAR(36)
DETERMINISTIC
BEGIN
   DECLARE hexStr CHAR(32);
   SET hexStr = HEX(b);
   RETURN LOWER(CONCAT(
        IF(f,SUBSTR(hexStr, 9, 8),SUBSTR(hexStr, 1, 8)), '-',
        IF(f,SUBSTR(hexStr, 5, 4),SUBSTR(hexStr, 9, 4)), '-',
        IF(f,SUBSTR(hexStr, 1, 4),SUBSTR(hexStr, 13, 4)), '-',
        SUBSTR(hexStr, 17, 4), '-',
        SUBSTR(hexStr, 21)
    ));
END$$


CREATE FUNCTION UUID_TO_BIN(uuid CHAR(36), f BOOLEAN)
RETURNS BINARY(16)
DETERMINISTIC
BEGIN
  RETURN UNHEX(CONCAT(
  IF(f,SUBSTRING(uuid, 15, 4),SUBSTRING(uuid, 1, 8)),
  SUBSTRING(uuid, 10, 4),
  IF(f,SUBSTRING(uuid, 1, 8),SUBSTRING(uuid, 15, 4)),
  SUBSTRING(uuid, 20, 4),
  SUBSTRING(uuid, 25))
  );
END$$

DELIMITER ;

--
-- Tests to demonstrate that it works correctly. These are the values taken from
-- https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid-to-bin
--
-- If you run these SELECTs using the above functions, the 
-- output of the two columns should be exactly identical in all four cases.
SET @uuid = '6ccd780c-baba-1026-9564-5b8c656024db';
SELECT HEX(UUID_TO_BIN(@uuid, 0)), '6CCD780CBABA102695645B8C656024DB';
SELECT HEX(UUID_TO_BIN(@uuid, 1)), '1026BABA6CCD780C95645B8C656024DB';
SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid,0),0), '6ccd780c-baba-1026-9564-5b8c656024db';
SELECT BIN_TO_UUID(UUID_TO_BIN(@uuid,1),1), '6ccd780c-baba-1026-9564-5b8c656024db';

Included are the SELECT samples from https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_uuid-to-bin that demonstrate that the above code returns the exact same results as the 8.0 function. These functions are considered DETERMINISTIC as they always produce the same output for a given input. See https://dev.mysql.com/doc/refman/8.0/en/create-procedure.html