Relational database design question - Surrogate-key or Natural-key?

Solution 1:

I believe that in practice, using a natural key is rarely the best option. I would probably go for the surrogate key approach as in your first example.

The following are the main disadvantages of the natural key approach:

  • You might have an incorrect type name, or you may simply want to rename the type. To edit it, you would have to update all the tables that would be using it as a foreign key.

  • An index on an int field will be much more compact than one on a varchar field.

  • In some cases, it might be difficult to have a unique natural key, and this is necessary since it will be used as a primary key. This might not apply in your case.

Solution 2:

The first one is more future proof, because it allows you to change the string representing the type without updating the whole user table. In other words you use a surrogate key, an additional immutable identifier introduced for the sake of flexibility.

Solution 3:

A good reason to use a surrogate key (instead of a natural key like name) is when the natural key isn't really a good choice in terms of uniqueness. In my lifetime i've known no fewer than 4 "Chris Smith"s. Person names are not unique.

Solution 4:

I prefer to use the surrogate key. It is often people will identity and use the natural key which will be fine for a while, until they decide they want to change the value. Then problems start.

Solution 5:

You should probably always use an ID number (that way if you change the type name, you don't need to update the user table) it also allows you to keep your datasize down, as a table full of INTs is much smaller than one full of 45 character varchars.