Difference between two table structure
I am very confuse about the two structure. What are the advantage and disadvantage of this two table? Which one is better and why?
TABLE1
id, name, age, birthdate, address
somedata1 somedata1 somedata1 somedata1 somedata1
somedata2 somedata2 somedata2 somedata2 somedata2
somedata3 somedata3 somedata3 somedata3 somedata3
TABLE2
id, col_name, col_value
somedata name somedata
somedata age somedata
somedata birthdate somedata
somedata address somedata
somedata2 name somedata2
somedata2 age somedata2
somedata2 birthdate somedata2
somedata2 address somedata2
somedata3 name somedata3
somedata3 age somedata3
somedata3 birthdate somedata3
somedata3 address somedata3
Solution 1:
Antipattern?
In common case, second table is anti-pattern in context of database design. And, even more, it has specific name: Entity-Attribute-Value (EAV). There are some cases, when using this design is justified, but that are rare cases - and even there it can be avoided.
Why EAV is bad
Data integrity support
Despite the fact, that such structure seems to be more "flexible" or "advanced", this design has weakness.
- Impossible to make mandatory attributes. You can not make some attribute mandatory, since attribute is now stored as a row - and the only sign that attribute is not set - is that the corresponding row absent in table. SQL will not allow you to build such constraint natively - thus, you'll have to check that in application - and, yes, query your table each time
- Mixing of data types. You will not be able to use SQL standard data types. Because your value column must be a "super-type" for all stored values in it. That means - you'll have in general to store all data as raw strings. Then you'll see how painful is to work with dates as with strings, casting data types each time, checking data integrity, e t.c.
- Impossible to enforce referential intregrity. In normal situation, you can use foreign key to restrict your values by those, which are defined in parent table. But not in this case - that's because referential integrity is applied to each row in table, but not for row values. So - you'll loose this advantage - and it's one of fundamental in relation DB
-
Impossible to set attributes names. That means - you can't restrict attribute name on DB level properly. For example, you'll write
"customer_name"
as attribute name in first case - and another developer will forget that and use"name_of_customer"
. And.. it's ok, DB will pass that and you'll end with hours spent on debugging this case.
Row reconstruction
In addition, row reconstruction will be awful in common case. If you have, for example, 5 attributes - that will be 5 self-table JOIN
-s. Too bad for such simple - at first glance - case. So I don't want even imagine how you'll maintain 20 attributes.
Can it be justified?
My point is - no. In RDBMS there will always be a way to avoid this. It's horrible. And if EAV is intended to be used, then best choice may be non-relational databases.
Solution 2:
in second case (table2) this is complex and take much time to find data when we make query for it. this case is used when you don't know about number of columns or they are varies, if you have fixed length of columns then used first case(table1) because in this case data find fast way.
Solution 3:
The table with columns id
, name
, age
, birthdate
, address
is what you use when you know before deployment, what information to store about an entity.
The table with columns id
, col_name
, col_value
can be used if you only know after deployment, what information to store about an entity (for example if non-technical people should be able to define fields that they whish to capture). It is less efficient, but lets you define new fields without changing the database schema.