Entity Attribute Value Database vs. strict Relational Model Ecommerce
It is safe to say that the EAV/CR database model is bad. That said,
Question: What database model, technique, or pattern should be used to deal with "classes" of attributes describing e-commerce products which can be changed at run time?
In a good E-commerce database, you will store classes of options (like TV resolution then have a resolution for each TV, but the next product may not be a TV and not have "TV resolution"). How do you store them, search efficiently, and allow your users to setup product types with variable fields describing their products? If the search engine finds that customers typically search for TVs based on console depth, you could add console depth to your fields, then add a single depth for each tv product type at run time.
There is a nice common feature among good e-commerce apps where they show a set of products, then have "drill down" side menus where you can see "TV Resolution" as a header, and the top five most common TV Resolutions for the found set. You click one and it only shows TVs of that resolution, allowing you to further drill down by selecting other categories on the side menu. These options would be the dynamic product attributes added at run time.
Further discussion:
So long story short, are there any links out on the Internet or model descriptions that could "academically" fix the following setup? I thank Noel Kennedy for suggesting a category table, but the need may be greater than that. I describe it a different way below, trying to highlight the significance. I may need a viewpoint correction to solve the problem, or I may need to go deeper in to the EAV/CR.
Love the positive response to the EAV/CR model. My fellow developers all say what Jeffrey Kemp touched on below: "new entities must be modeled and designed by a professional" (taken out of context, read his response below). The problem is:
- entities add and remove attributes weekly
(search keywords dictate future attributes) - new entities arrive weekly
(products are assembled from parts) - old entities go away weekly
(archived, less popular, seasonal)
The customer wants to add attributes to the products for two reasons:
- department / keyword search / comparison chart between like products
- consumer product configuration before checkout
The attributes must have significance, not just a keyword search. If they want to compare all cakes that have a "whipped cream frosting", they can click cakes, click birthday theme, click whipped cream frosting, then check all cakes that are interesting knowing they all have whipped cream frosting. This is not specific to cakes, just an example.
There's a few general pros and cons I can think of, there are situations where one is better than the other:
Option 1, EAV Model:
- Pro: less time to design and develop a simple application
- Pro: new entities easy to add (might even be added by users?)
- Pro: "generic" interface components
- Con: complex code required to validate simple data types
- Con: much more complex SQL for simple reports
- Con: complex reports can become almost impossible
- Con: poor performance for large data sets
Option 2, Modelling each entity separately:
- Con: more time required to gather requirements and design
- Con: new entities must be modelled and designed by a professional
- Con: custom interface components for each entity
- Pro: data type constraints and validation simple to implement
- Pro: SQL is easy to write, easy to understand and debug
- Pro: even the most complex reports are relatively simple
- Pro: best performance for large data sets
Option 3, Combination (model entities "properly", but add "extensions" for custom attributes for some/all entities)
- Pro/Con: more time required to gather requirements and design than option 1 but perhaps not as much as option 2 *
- Con: new entities must be modelled and designed by a professional
- Pro: new attributes might be easily added later on
- Con: complex code required to validate simple data types (for the custom attributes)
- Con: custom interface components still required, but generic interface components may be possible for the custom attributes
- Con: SQL becomes complex as soon as any custom attribute is included in a report
- Con: good performance generally, unless you start need to search by or report by the custom attributes
* I'm not sure if Option 3 would necessarily save any time in the design phase.
Personally I would lean toward option 2, and avoid EAV wherever possible. However, for some scenarios the users need the flexibility that comes with EAV; but this comes with a great cost.
It is safe to say that the EAV/CR database model is bad.
No, it's not. It's just that they're an inefficient usage of relational databases. A purely key/value store works great with this model.
Now, to your real question: How to store various attributes and keep them searchable?
Just use EAV. In your case it would be a single extra table. index it on both attribute name and value, most RDBMs would use prefix-compression to on the attribute name repetitions, making it really fast and compact.
EAV/CR gets ugly when you use it to replace 'real' fields. As with every tool, overusing it is 'bad', and gives it a bad image.
// At this point, I'd like to take a moment to speak to you about the Magento/Adobe PSD format. // Magento/PSDis not a good ecommerce platform/format. Magento/PSDis not even a bad ecommerce platform/format. Calling it such would be an // insult to other bad ecommerce platform/formats, such as Zencart or OsCommerce. No, Magento/PSDis an abysmal ecommerce platform/format. Having // worked on this code for several weeks now, my hate for Magento/PSDhas grown to a raging fire // that burns with the fierce passion of a million suns.
http://code.google.com/p/xee/source/browse/trunk/XeePhotoshopLoader.m?spec=svn28&r=11#107
The internal models are wacky at best, like someone put the schema into a boggle game, sealed that and put it in a paint shacker...
Real world: I'm working on a midware fulfilment app and here are one the queries to get address information.
CREATE OR REPLACE VIEW sales_flat_addresses AS
SELECT sales_order_entity.parent_id AS order_id,
sales_order_entity.entity_id,
CONCAT(CONCAT(UCASE(MID(sales_order_entity_varchar.value,1,1)),MID(sales_order_entity_varchar.value,2)), "Address") as type,
GROUP_CONCAT(
CONCAT( eav_attribute.attribute_code," ::::: ", sales_order_entity_varchar.value )
ORDER BY sales_order_entity_varchar.value DESC
SEPARATOR '!!!!!'
) as data
FROM sales_order_entity
INNER JOIN sales_order_entity_varchar ON sales_order_entity_varchar.entity_id = sales_order_entity.entity_id
INNER JOIN eav_attribute ON eav_attribute.attribute_id = sales_order_entity_varchar.attribute_id
AND sales_order_entity.entity_type_id =12
GROUP BY sales_order_entity.entity_id
ORDER BY eav_attribute.attribute_code = 'address_type'
Exacts address information for an order, lazily
--
Summary: Only use Magento if:
- You are being given large sacks of money
- You must
- Enjoy pain