MySQL ENUM type vs join tables
My requirement
A table needs to maintain a status column.
This column represents one of 5 states.
Initial design
I figured I could just make it an integer column and represent the states using a numeric value.
- 0 = start
- 1 = running
- 2 = crashed
- 3 = paused
- 4 = stopped
Since I don't want my app to maintain the mapping from the integers to their string description, I plan to place those in a separate state description table (relying on a FK relation).
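For reference, the lookup-table design described above might look roughly like this. It is only a sketch: the table and column names (`job_state`, `job`, `state_id`) are made up for illustration and do not come from a real schema.

```sql
-- Hypothetical names throughout; run in an empty scratch schema.
-- Lookup table holding the integer-to-description mapping.
CREATE TABLE job_state (
  id          TINYINT UNSIGNED NOT NULL PRIMARY KEY,
  description VARCHAR(20)      NOT NULL UNIQUE
) ENGINE=InnoDB;

INSERT INTO job_state (id, description) VALUES
  (0, 'start'),
  (1, 'running'),
  (2, 'crashed'),
  (3, 'paused'),
  (4, 'stopped');

-- Main table: the status is stored as a small integer with a foreign key.
CREATE TABLE job (
  id       INT UNSIGNED     NOT NULL AUTO_INCREMENT PRIMARY KEY,
  state_id TINYINT UNSIGNED NOT NULL DEFAULT 0,
  CONSTRAINT fk_job_state FOREIGN KEY (state_id) REFERENCES job_state (id)
) ENGINE=InnoDB;

-- Reading the human-readable status requires a join.
SELECT j.id, s.description AS status
FROM job AS j
JOIN job_state AS s ON s.id = j.state_id;
```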
Then I discovered that MySQL has an ENUM type which matches my requirement exactly. Other than a direct dependency on MySQL, are there any pitfalls with using the ENUM type?
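The ENUM alternative would look something like the following sketch. The table name `job_enum` is hypothetical; the five labels are the ones listed above.

```sql
-- Hypothetical table name; run in an empty scratch schema.
CREATE TABLE job_enum (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  status ENUM('start', 'running', 'crashed', 'paused', 'stopped')
         NOT NULL DEFAULT 'start'
);

INSERT INTO job_enum (status) VALUES ('running');

-- The column reads back as a string; adding 0 exposes the internal
-- index, which starts at 1 for the first listed member.
SELECT id, status, status + 0 AS status_index
FROM job_enum;
```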
Changing the set of values in an ENUM requires an ALTER TABLE, which might cause a table restructure, an incredibly expensive operation. (The restructure doesn't happen if you simply add one new value to the end of the ENUM definition, but if you delete one or change the order, it does.) Changing the set of values in a lookup table, by contrast, is as simple as an INSERT or DELETE.
There's no way to associate other attributes with the values in an ENUM, like which ones are retired and which ones are eligible to be put in a drop-down list in your user interface. A lookup table, however, can include additional columns for such attributes.
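To make the contrast concrete, here is a rough sketch of adding a sixth state under each design. All names (`job_enum`, `job_state`, `is_retired`, `show_in_dropdown`, and the `'archived'` state) are invented for the example.

```sql
-- Hypothetical names throughout; run in an empty scratch schema.
CREATE TABLE job_enum (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  status ENUM('start', 'running', 'crashed', 'paused', 'stopped')
         NOT NULL DEFAULT 'start'
);

CREATE TABLE job_state (
  id          TINYINT UNSIGNED NOT NULL PRIMARY KEY,
  description VARCHAR(20)      NOT NULL UNIQUE
);

INSERT INTO job_state (id, description) VALUES
  (0, 'start'), (1, 'running'), (2, 'crashed'), (3, 'paused'), (4, 'stopped');

-- ENUM design: the whole member list has to be restated in an ALTER TABLE.
-- Appending at the end is the cheap case; removing or reordering members
-- is where a rebuild of the table becomes likely.
ALTER TABLE job_enum
  MODIFY COLUMN status
  ENUM('start', 'running', 'crashed', 'paused', 'stopped', 'archived')
  NOT NULL DEFAULT 'start';

-- Lookup-table design: a new state is just a new row.
INSERT INTO job_state (id, description) VALUES (5, 'archived');

-- Extra attributes per state are ordinary columns on the lookup table.
ALTER TABLE job_state
  ADD COLUMN is_retired       TINYINT(1) NOT NULL DEFAULT 0,
  ADD COLUMN show_in_dropdown TINYINT(1) NOT NULL DEFAULT 1;

UPDATE job_state
SET is_retired = 1, show_in_dropdown = 0
WHERE description = 'crashed';

-- Values a UI drop-down should offer.
SELECT id, description
FROM job_state
WHERE show_in_dropdown = 1
ORDER BY description;
```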
It's very difficult to query an ENUM to get a list of distinct values, basically requiring you to query the data type definition from INFORMATION_SCHEMA and parse the list out of the BLOB returned. You could try SELECT DISTINCT status from your table, but that only gets status values currently in use, which might not be all values in the ENUM. However, if you keep values in a lookup table, it's easy to query, sort, etc.
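The three queries in question look roughly like this, assuming a hypothetical table `job_enum` with an ENUM column `status` and a hypothetical lookup table `job_state(id, description)`:

```sql
-- Assumes hypothetical tables job_enum(status ENUM(...)) and
-- job_state(id, description) already exist in the current schema.

-- 1. Full ENUM definition: comes back as one string, e.g.
--    enum('start','running','crashed','paused','stopped'),
--    which the application then has to parse itself.
SELECT COLUMN_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME   = 'job_enum'
  AND COLUMN_NAME  = 'status';

-- 2. Only the values that happen to be in use right now.
SELECT DISTINCT status
FROM job_enum;

-- 3. Lookup table: every defined value, as ordinary rows.
SELECT id, description
FROM job_state
ORDER BY id;
```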
I'm not a big fan of ENUM, as you can tell. :-)
The same applies to CHECK constraints that simply compare a column to a fixed set of values. Though MySQL doesn't support CHECK constraints anyway.
Update: MySQL 8.0.16 now implements CHECK constraints.
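A CHECK-based version of the same restriction might look like this on MySQL 8.0.16 or later. It is a sketch only; `job_checked`, `chk_job_status`, and the `'archived'` state are hypothetical names.

```sql
-- Hypothetical names; requires MySQL 8.0.16+ for the CHECK to be enforced.
CREATE TABLE job_checked (
  id     INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  status VARCHAR(10)  NOT NULL DEFAULT 'start',
  CONSTRAINT chk_job_status
    CHECK (status IN ('start', 'running', 'crashed', 'paused', 'stopped'))
);

-- Accepted.
INSERT INTO job_checked (status) VALUES ('running');

-- Rejected: the value violates chk_job_status.
INSERT INTO job_checked (status) VALUES ('exploded');

-- As with ENUM, changing the allowed set still means altering the table.
ALTER TABLE job_checked DROP CHECK chk_job_status;
ALTER TABLE job_checked
  ADD CONSTRAINT chk_job_status
  CHECK (status IN ('start', 'running', 'crashed', 'paused', 'stopped', 'archived'));
```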
Here is an article about the speed comparison of ENUM. Maybe it gives some hints. IMHO, ENUM should be limited to fixed lists of strings ("Yes/No", "Child/Adult") that, with 99% probability, won't change in the future.
Enums in MySQL are bad for the already explained reasons.
I can add the following fact: unless strict SQL mode is enabled, ENUM does not ensure any kind of validation on the server side. If you insert a row with a value which does not exist in the ENUM definition, you only get a warning and a nice empty string (the ENUM's special error value) stored in the DB. With strict SQL mode, which is the default since MySQL 5.7, the insert is rejected with an error instead.
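What actually happens with an invalid value depends on the session's SQL mode. The sketch below uses a hypothetical table `enum_demo` to show both cases.

```sql
-- Hypothetical table; run in an empty scratch schema.
CREATE TABLE enum_demo (
  status ENUM('start', 'running') NOT NULL
);

-- Non-strict mode: the invalid value is accepted with a warning and
-- stored as the empty-string error value, whose index is 0.
SET SESSION sql_mode = '';
INSERT INTO enum_demo (status) VALUES ('bogus');
SHOW WARNINGS;
SELECT status, status + 0 AS status_index FROM enum_demo;

-- Strict mode: the same insert fails with a data-truncation error
-- and no row is written.
SET SESSION sql_mode = 'STRICT_TRANS_TABLES';
INSERT INTO enum_demo (status) VALUES ('bogus');
```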
My point about tinyints:
- enums are limited to 65535 values
- if you don't need more than 256 values, a tinyint takes only 1 byte per row (the same as an ENUM with up to 255 members), and its behavior is much more "predictable".
If you have lots of data in your DB (more data than you have RAM) and your ENUM values are NEVER going to change, I would go with ENUM rather than the join. It should be faster.
Think about it: in the join case, you need an index on your foreign key and an index on the primary key in the other table. As Riho said, see the benchmarks.