I have some questions about database design.

  1. Is there a name for this?
  2. Is it good practice?
  3. Any performance considerations?

I have a generic table structure used to store relationships.

Recently I refactored some things to use this generic structure instead of direct FK columns, but now I'm not sure that was really the best idea.

Original Schema:


 +------------------+       +---------------------+    +----------------------+
 | Book             |       | Note                |    | MetaParent           |
 |------------------|       |---------------------|    |----------------------|
 | Id               |       | Id                  |    | Id                   |
 | NoteId           |       | MetaParentId:(Null) |    | MetaTableId          |
 |                  +-------+                     +----+ KeyValue             |
 |                  |       |                     |    |                      |
 |                  |       |                     |    |                      |
 |                  |       |                     |    |                      |
 |                  |       |                     |    |                      |
 |                  |       |                     |    |                      |
 +------------------+       +---------------------+    +----------------------+

New Schema:


 +------------------+       +---------------------+    +----------------------+
 | Book             |       | Note                |    | MetaParent           |
 |------------------|       |---------------------|    |----------------------|
 | Id               |       | Id                  |    | Id                   |
 |                  |       | MetaParentId:(Null) |    | MetaTableId          |
 |                  +       +                     +----+ KeyValue             |
 |                  |       |                     |    |                      |
 |                  |       |                     |    |                      |
 |                  |       |                     |    |                      |
 |                  |       |                     |    |                      |
 |                  |       |                     |    |                      |
 +------------------+       +---------------------+    +----------------------+

So basically, instead of having a direct FK relationship between Book and Note, we have an indirect relationship via the MetaParent table, using the MetaTableId/KeyValue columns.
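To make the indirect lookup concrete, here is a minimal sketch using Python's built-in sqlite3 module. It assumes that MetaTableId identifies the referenced table, that KeyValue holds the Id of the row in that table, and that Note.MetaParentId points at MetaParent.Id. The column types, the BOOK_TABLE_ID value and the notes_for_book helper are made up for illustration.

```python
import sqlite3

# Hypothetical MetaTableId value that stands for the Book table.
BOOK_TABLE_ID = 1

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Book (Id INTEGER PRIMARY KEY);
    CREATE TABLE MetaParent (
        Id          INTEGER PRIMARY KEY,
        MetaTableId INTEGER NOT NULL,  -- assumed: which table KeyValue points into
        KeyValue    INTEGER NOT NULL   -- assumed: Id of the row in that table
    );
    CREATE TABLE Note (
        Id           INTEGER PRIMARY KEY,
        MetaParentId INTEGER NULL REFERENCES MetaParent(Id)
    );
    INSERT INTO Book (Id) VALUES (10);
    INSERT INTO MetaParent (Id, MetaTableId, KeyValue) VALUES (100, 1, 10);
    INSERT INTO Note (Id, MetaParentId) VALUES (1000, 100);
""")


def notes_for_book(book_id):
    """Return the Note ids reachable from a Book through MetaParent."""
    rows = conn.execute(
        """
        SELECT n.Id
        FROM Book b
        JOIN MetaParent mp ON mp.MetaTableId = ? AND mp.KeyValue = b.Id
        JOIN Note n ON n.MetaParentId = mp.Id
        WHERE b.Id = ?
        ORDER BY n.Id
        """,
        (BOOK_TABLE_ID, book_id),
    ).fetchall()
    return [row[0] for row in rows]


print(notes_for_book(10))  # [1000]
```

Nothing in the schema itself says that MetaParent rows with MetaTableId = 1 refer to Book; that knowledge lives only in the query.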

Currently the MetaParent table has about 500k records and things are running acceptably, although we do rebuild the indexes on it every night.

My first concern is that the relationship between Book and Note is no longer obvious. You have to know that one exists and that you need to go through the MetaParent table to find it.

My second concern is performance. I'm not sure at what point joins against MetaTableId/KeyValue would become too slow. It seems that the more rows you add to this table, the slower queries would get.


Solution 1:

You should always enforce referential integrity by using "normal" FOREIGN KEYs.

In a nutshell, FOREIGN KEYs have the following advantages:

  1. They are already implemented within the DBMS.
  2. They are declarative, self-documenting and "obvious".
  3. They cannot be bypassed (unless explicitly disabled or dropped).
  4. They are correct.
  5. They are fast.
  6. They support cascading referential actions (such as ON DELETE CASCADE).
  7. The DBMS knows the data is related, allowing it to find a better query plan in some cases.
  8. If you are using an ORM tool, it can automatically generate references between objects.
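As a rough illustration of points 2, 3 and 6, the sketch below declares the original Book.NoteId column as a real FOREIGN KEY, using Python's built-in sqlite3 module. The column types, the sample rows and the choice of ON DELETE SET NULL as the referential action are assumptions made for the example. SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on; other DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off unless enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Note (Id INTEGER PRIMARY KEY);
    CREATE TABLE Book (
        Id     INTEGER PRIMARY KEY,
        -- The relationship is declared once, in the schema itself.
        NoteId INTEGER NULL REFERENCES Note(Id) ON DELETE SET NULL
    );
    INSERT INTO Note (Id) VALUES (10);
    INSERT INTO Book (Id, NoteId) VALUES (1, 10);
""")


def try_insert_dangling_book():
    """Try to insert a Book that points at a Note that does not exist."""
    try:
        conn.execute("INSERT INTO Book (Id, NoteId) VALUES (2, 999)")
    except sqlite3.IntegrityError:
        return "rejected"
    return "accepted"


dangling_result = try_insert_dangling_book()
print(dangling_result)  # rejected

# Deleting the Note triggers the declared referential action on Book.
conn.execute("DELETE FROM Note WHERE Id = 10")
note_id_after_delete = conn.execute(
    "SELECT NoteId FROM Book WHERE Id = 1"
).fetchone()[0]
print(note_id_after_delete)  # None
```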

And here are the corresponding disadvantages of enforcing referential integrity in the application code:

  1. You are duplicating the work that has already been done.
  2. It's imperative, probably "buried" deep in your application source code, and harder to maintain.
  3. A single client application that has a bug can break the referential integrity (and corrupt the data).
  4. You are likely to implement them incorrectly in your application code. It looks simple from the outset, but in a concurrent environment, it is easy to introduce race conditions.
  5. Even if you have implemented them correctly, you probably used some form of locking to avoid race conditions, which is likely to be slower / less scalable than specially optimized FKs built into the DBMS.
  6. You have to implement cascading yourself.
  7. The DBMS doesn't know the data is related, which may produce a sub-optimal query plan.
  8. You may need to do more manual work in your ORM tool of choice.
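For contrast, here is a small sketch of what the DBMS does (or rather doesn't do) with the generic structure, again using Python's built-in sqlite3 module. It assumes that KeyValue holds a Book.Id whenever MetaTableId has the made-up value 1; the column types and the orphaned_meta_parents helper are hypothetical.

```python
import sqlite3

# Hypothetical MetaTableId value that stands for the Book table.
BOOK_TABLE_ID = 1

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # on, but there is no FK to Book to enforce
conn.executescript("""
    CREATE TABLE Book (Id INTEGER PRIMARY KEY);
    CREATE TABLE MetaParent (
        Id          INTEGER PRIMARY KEY,
        MetaTableId INTEGER NOT NULL,
        KeyValue    INTEGER NOT NULL  -- no constraint ties this to Book.Id
    );
    INSERT INTO Book (Id) VALUES (10);
    INSERT INTO MetaParent (Id, MetaTableId, KeyValue) VALUES (100, 1, 10);
""")

# The DBMS accepts this delete even though a MetaParent row still points at the Book.
conn.execute("DELETE FROM Book WHERE Id = 10")


def orphaned_meta_parents():
    """Find MetaParent rows whose Book no longer exists (an application-side check)."""
    rows = conn.execute(
        """
        SELECT mp.Id
        FROM MetaParent mp
        LEFT JOIN Book b ON b.Id = mp.KeyValue
        WHERE mp.MetaTableId = ? AND b.Id IS NULL
        """,
        (BOOK_TABLE_ID,),
    ).fetchall()
    return [row[0] for row in rows]


print(orphaned_meta_parents())  # [100]
```

Every client that writes to these tables would have to run checks like this one, or do the equivalent cleanup, on its own.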

Is there a name for this?

Not that I know of. I have heard the term "generic FKs" used, but it's probably not universal.

Is it good practice?

No (see above).

Any performance considerations?

Yes (see above).