I've been trying to model product variants and thought that I might need to use EAV. I might have been able to do it without EAV, but I'm concerned that I might have missed something. Here's my design:

enter image description here

Here's what I am trying to represent:

  1. A product can have 0 or more product variants (e.g. a t-shirt product may have size and color variants).
  2. A product variant can have 1 or more product variant options (e.g. the size variant can be small, medium, large).
  3. An SKU is comprised of 1 or more product variant options (the product_variant_option_combination table would contain all the possible combinations of `product_variant_options. So, if there were 3 sizes and 3 colors, there would be 3 * 3 = 9 combinations -- and each combination would be given its own SKU and price).
  4. A product can have 1 or more SKUs.

If the product doesn't have any variants, then just ignore product_variants, product_variant_options, and product_variant_option_combinations.

Is this design sound? Will I end up having problems querying this? Will it scale? Is it normalized?

UPDATE 1

@Edper:

If a product can have 0 or many (optional mode) product variants (e.g. size, color etc). Does it follow that a product variants can also have 0 or many products having that variant?

I don't think so. It is possible that a product like a "t-shirt" may have a "size" variant and another product like "pants" may also have a "size" variant, but I think that's just only happenstance. There is no need to make "size" only appear as one record because "size" might have different context.

The products I'm dealing with vary greatly and they are bound to have similarly named variants.

UPDATE 2:

Here's an example of how I see my data:

enter image description here

I've boxed the variant Size and its associated values. I want to make it clear that these are not considered to be duplicate data. The Size variant for the 3 products is just happenstance. There is no need to normalize this, I think. Each product can have 0 or more variants -- and they are unknown to me. I expect "duplicates" (though they aren't really duplicates as they always are in the context of a particular product -- so, Widget 1's "Size" variant is not the same as Widget 2's "Size" variant).

UPDATE 3:

I see now that, in my design, it is possible for a product to have multiple identical product_variants. I think that can be resolved by making product_variants.product_id and product_variants.name a composite key. This means that Widget 1 can only have "Size" variant once.

product_variant_options.product_variant_id product_variant_options.name would also need to be a composite key.

UPDATE 4:

By updating my product_variant_option_combinations to include product_variant_id (FK to product_variants.id) and enforcing a UNIQUE constraint with product_variant_option_combinations.sku_id and product_variant_option_combinations.product_variant_id, I think I was able to prevent the problem of having an SKU that is both "Small" and "Large". Is this right?

enter image description here

-- phpMyAdmin SQL Dump
-- version 4.1.14
-- http://www.phpmyadmin.net
--
-- Host: 127.0.0.1
-- Generation Time: Jul 30, 2014 at 03:35 AM
-- Server version: 5.6.17
-- PHP Version: 5.5.12

SET SQL_MODE = "NO_AUTO_VALUE_ON_ZERO";
SET time_zone = "+00:00";


/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;

--
-- Database: `mydb`
--

-- --------------------------------------------------------

--
-- Table structure for table `products`
--

CREATE TABLE IF NOT EXISTS `products` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(45) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=2 ;

--
-- Dumping data for table `products`
--

INSERT INTO `products` (`id`, `name`) VALUES
(1, 'Widget 1');

-- --------------------------------------------------------

--
-- Table structure for table `product_variants`
--

CREATE TABLE IF NOT EXISTS `product_variants` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `product_id` int(11) NOT NULL,
  `name` varchar(45) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `UNIQUE_product_id_name` (`product_id`,`name`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=3 ;

--
-- Dumping data for table `product_variants`
--

INSERT INTO `product_variants` (`id`, `product_id`, `name`) VALUES
(2, 1, 'Color'),
(1, 1, 'Size');

-- --------------------------------------------------------

--
-- Table structure for table `product_variant_options`
--

CREATE TABLE IF NOT EXISTS `product_variant_options` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `product_variant_id` int(11) NOT NULL,
  `name` varchar(45) NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `UNIQUE_product_variant_id_name` (`product_variant_id`,`name`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;

--
-- Dumping data for table `product_variant_options`
--

INSERT INTO `product_variant_options` (`id`, `product_variant_id`, `name`) VALUES
(2, 1, 'Large'),
(1, 1, 'Small'),
(4, 2, 'Black'),
(3, 2, 'White');

-- --------------------------------------------------------

--
-- Table structure for table `skus`
--

CREATE TABLE IF NOT EXISTS `skus` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `product_id` int(11) NOT NULL,
  `sku` varchar(45) NOT NULL,
  `price` decimal(10,2) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `skus_product_id_products_id_idx` (`product_id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;

--
-- Dumping data for table `skus`
--

INSERT INTO `skus` (`id`, `product_id`, `sku`, `price`) VALUES
(1, 1, 'W1SSCW', '10.00'),
(2, 1, 'W1SSCB', '10.00'),
(3, 1, 'W1SLCW', '12.00'),
(4, 1, 'W1SLCB', '15.00');

-- --------------------------------------------------------

--
-- Table structure for table `skus_product_variant_options`
--

CREATE TABLE IF NOT EXISTS `skus_product_variant_options` (
  `sku_id` int(11) NOT NULL,
  `product_variant_id` int(11) NOT NULL,
  `product_variant_options_id` int(11) NOT NULL,
  PRIMARY KEY (`sku_id`,`product_variant_options_id`,`product_variant_id`),
  UNIQUE KEY `UNIQUE_sku_id_product_variant_id` (`sku_id`,`product_variant_id`),
  KEY `spvo_product_variant_options_id_pro_idx` (`product_variant_options_id`),
  KEY `spvo_product_variant_id_product_var_idx` (`product_variant_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

--
-- Dumping data for table `skus_product_variant_options`
--

INSERT INTO `skus_product_variant_options` (`sku_id`, `product_variant_id`, `product_variant_options_id`) VALUES
(1, 1, 1),
(2, 1, 1),
(3, 1, 2),
(4, 1, 2),
(1, 2, 3),
(3, 2, 3),
(2, 2, 4),
(4, 2, 4);

--
-- Constraints for dumped tables
--

--
-- Constraints for table `product_variants`
--
ALTER TABLE `product_variants`
  ADD CONSTRAINT `product_variants_product_id_products_id` FOREIGN KEY (`product_id`) REFERENCES `products` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION;

--
-- Constraints for table `product_variant_options`
--
ALTER TABLE `product_variant_options`
  ADD CONSTRAINT `product_variant_options_product_variant_id_product_variants_id` FOREIGN KEY (`product_variant_id`) REFERENCES `product_variants` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION;

--
-- Constraints for table `skus`
--
ALTER TABLE `skus`
  ADD CONSTRAINT `skus_product_id_products_id` FOREIGN KEY (`product_id`) REFERENCES `products` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION;

--
-- Constraints for table `skus_product_variant_options`
--
ALTER TABLE `skus_product_variant_options`
  ADD CONSTRAINT `skus_product_variant_options_sku_id_skus_id` FOREIGN KEY (`sku_id`) REFERENCES `skus` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
  ADD CONSTRAINT `spvo_product_variant_options_id_product_variant_options_id` FOREIGN KEY (`product_variant_options_id`) REFERENCES `product_variant_options` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
  ADD CONSTRAINT `spvo_product_variant_id_product_variants_id` FOREIGN KEY (`product_variant_id`) REFERENCES `product_variants` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION;

/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;

Solution 1:

You could have a design like:

 +---------------+     +-------------------+
 | PRODUCTS      |-----< PRODUCT_VARIANTS  |
 +---------------+     +-------------------+
 | #product_id   |     | #product_id       |
 |  product_name |     | #variant_id       |
 +---------------+     |  sku_id           |
         |             +-------------------+
         |                       |
+--------^--------+     +--------^--------+
| PRODUCT_OPTIONS |-----< VARIANT_VALUES  |
+-----------------+     +-----------------+
| #product_id     |     | #product_id     |
| #option_id      |     | #variant_id     |
+--------v--------+     | #option_id      |
         |              |  value_id       |
+-----------------+     +--------v--------+
| OPTIONS         |              |
+-----------------+              |
| #option_id      |              |
|  option_name    |              |
+-----------------+              |
         |                       |
 +-------^-------+               |
 | OPTION_VALUES |---------------+
 +---------------+
 | #option_id    |
 | #value_id     |
 |  value_name   |
 +---------------+

With the Primary, Unique and Foreign Keys:

  • PRODUCTS
    • PK: product_id
    • UK: product_name
  • OPTIONS
    • PK: option_id
    • UK: option_name
  • OPTION_VALUES
    • PK: option_id, value_id
    • UK: option_id, value_name
    • FK: option_id REFERENCES OPTIONS (option_id)
  • PRODUCT_OPTIONS
    • PK: product_id, option_id
    • FK: product_id REFERENCES PRODUCTS (product_id)
    • FK: option_id REFERENCES OPTIONS (option_id)
  • PRODUCT_VARIANTS
    • PK: product_id, variant_id
    • UK: sku_id
    • FK: product_id REFERENCES PRODUCTS (product_id)
  • VARIANT_VALUES
    • PK: product_id, variant_id, option_id
    • FK: product_id, variant_id REFERENCES PRODUCT_VARIANTS (product_id, variant_id)
    • FK: product_id, option_id REFERENCES PRODUCT_OPTIONS (product_id, option_id)
    • FK: option_id, value_id REFERENCES OPTION_VALUES (option_id, value_Id)

You have:

  • PRODUCTS e.g. Shirt, Jumper, Trousers
  • OPTIONS e.g. Size, Colour, Length
  • OPTION_VALUES e.g Size - Small, Medium, Large; Colour - Red, White, Blue
  • PRODUCT_OPTIONS e.g. Shirt - Size, Colour; Trousers - Length, Colour

You then need to create a n-dimensional array, with the number of dimensions equalling the number of options for the product. Each element in the array corresponds to a product variant. There will always be at least one product variant for each product; as there is always the pseudo option of the product "as-is"

  • PRODUCT_VARIANTS e.g Shirt 1, Shirt 2
  • VARIANT_VALUES e.g Shirt 1: Small Red; Shirt 2: Small White

You may wish to have validation to ensure a SKU is not assigned unless values have been specified for all options associated with a product.

Based on the spreadsheet of how you see your data you could enter data in your tables as follows:

PRODUCTS
========
id  name
--- --------
1   Widget 1
2   Widget 2
3   Widget 3
 
PRODUCT_VARIANTS
================
id  product_id name
--- ---------- ------
1   1          Size   (Widget 1)
2   1          Color  (Widget 1)
3   2          Size   (Widget 2)
4   3          Class  (Widget 3)
5   3          Size   (Widget 3)
 
PRODUCT_VARIANT_OPTIONS
=======================
id  product_variant_id name
--- ------------------ -------------
1   1                  Small         (Widget 1; Size)
2   1                  Large         (Widget 1; Size)
3   2                  White         (Widget 1; Color)
4   2                  Black         (Widget 1; Color)
5   3                  Small         (Widget 2; Size)
6   3                  Medium        (Widget 2; Size)
7   4                  Amateur       (Widget 3; Class)
8   4                  Professional  (Widget 3; Class)
9   5                  Medium        (Widget 3; Size)
10  5                  Large         (Widget 3; Size)
 
SKUS
====
id  product_id sku    price
--- ---------- ------ -----
1   1          W1SSCW    10 (Widget 1)
2   1          W1SSCB    10 (Widget 1)
3   1          W1SLCW    12 (Widget 1)
4   1          W1SLCB    15 (Widget 1)
5   2          W2SS     100 (Widget 2)
6   2          W2SM     100 (Widget 2)
7   3          W3CASM    50 (Widget 3)
8   3          W3CASL    50 (Widget 3)
9   3          W3CPSM   150 (Widget 3)
10  3          W3CPSL   160 (Widget 3)
 
PRODUCT_VARIANT_OPTION_COMBINATIONS
===================================
product_variant_option_id sku_id
------------------------- ------
1                         1      (W1SSCW; Size; Small)
3                         1      (W1SSCW; Color; White)
1                         2      (W1SSCB; Size; Small)
4                         2      (W1SSCB; Color; Black)
2                         3      (W1SLCW; Size; Large)
3                         3      (W1SLCW;  Color; White)
2                         4      (W1SLCB; Size; Large)
4                         4      (W1SLCB; Color; Black)
5                         5      (W2SS; Size; Small)
6                         6      (W2SM; Size; Medium)
7                         7      (W3CASM; Class; Amateur)
9                         7      (W3CASM; Size; Medium)
7                         8      (W3CASL; Class; Amateur)
10                        8      (W3CASL; Size; Large)
8                         9      (W3CPSM; Class; Professional)
9                         9      (W3CPSM; Size; Medium)
8                         10     (W3CPSL; Class; Professional)
10                        10     (W3CPSL; Size; Large)

There seems to be nothing in your design from stopping the addition of the entry of the record (product_variant_option_id: 2; sku_id 1) so that SKU W1SSCW has now both the options of Small and Large. There is nothing to stop the entry of the record (product_variant_option_id: 7; sku_id: 1) so that SKU W1SSCW also has the option Amateur.

Based on the spreadsheet of how you see your data you could enter data in my tables as follows:

PRODUCTS
========
product_id product_name
---------- ------------
1          Widget 1
2          Widget 2
3          Widget 3
 
OPTIONS
=======
option_id option_name
--------- -----------
1         Size SL
2         Color
3         Size SM
4         Class
5         Size ML
 
OPTION_VALUES
=============
option_id value_id value_name
--------- -------- ------------
1         1        Small        (Size SL)
1         2        Large        (Size SL)
2         1        White        (Color)
2         2        Black        (Color)
3         1        Small        (Size SM)
3         2        Medium       (Size SM)
4         1        Amateur      (Class)
4         2        Professional (Class)
5         1        Medium       (Size ML)
5         2        Large        (Size ML)
 
PRODUCT_OPTIONS
===============
product_id option_id
---------- ---------
1          1         (Widget 1; Size SL)
1          2         (Widget 1; Color)
2          3         (Widget 2; Size SM)
3          4         (Widget 3; Class)
3          5         (Widget 4; Size ML)
 
PRODUCT_VARIANTS
================
product_id variant_id sku_id
---------- ---------- ------
1          1          W1SSCW (Widget 1)
1          2          W1SSCB (Widget 1)
1          3          W1SLCW (Widget 1)
1          4          W1SLCB (Widget 1)
2          1          W2SS   (Widget 2)
2          2          W2SM   (Widget 2)
3          1          W3CASM (Widget 3)
3          2          W3CASL (Widget 3)
3          3          W3CPSM (Widget 3)
3          4          W3CPSL (Widget 3)
 
VARIANT_VALUES
==============
product_id variant_id option_id value_id
---------- ---------- --------- --------
1          1          1         1        (W1SSCW; Size SL; Small)
1          1          2         1        (W1SSCW; Color; White)
1          2          1         1        (W1SSCB; Size SL; Small)
1          2          2         2        (W1SSCB; Color; Black)
1          3          1         2        (W1SLCW; Size SL; Large)
1          3          2         1        (W1SLCW; Color; White)
1          4          1         2        (W1SLCB; Size SL; Large)
1          4          2         2        (W1SLCB; Color; Black)
2          1          3         1        (W2SS; Size SM; Small)
2          2          3         2        (W2SM; Size SM; Medium)
3          1          4         1        (W3CASM; Class; Amateur)
3          1          5         1        (W3CASM; Size ML; Medium)
3          2          4         1        (W3CASL; Class; Amateur)
3          2          5         2        (W3CASL; Size ML; Large)
3          3          4         2        (W3CPSM; Class; Professional)
3          3          5         1        (W3CPSM; Size ML; Medium)
3          4          4         2        (W3CPSL; Class; Professional)
3          4          5         2        (W3CPSL; Size ML; Large)

In my design you could not enter the additional VARIANT_VALUES record (product_id: 1; variant_id: 1; option_id: 1; value_id: 2) - so that SKU W1SSCW has now both the options of Small and Large - due to the primary key on VARIANT_VALUES and the existing VARIANT_VALUES record (product_id: 1; variant_id: 1; option_id: 1; value_id: 1). In my design you could not enter the VARIANT_VALUES record (product_id: 1; variant_id: 1; option_id: 4; value_id: 1) - so that SKU W1SSCW also has the option Amateur - due to the foreign key referencing PRODUCT_OPTIONS and the lack of a record in this table of (product_id: 1; option_id: 4) indicating that Class is a valid option for product Widget 1.

EDIT: Design with no PRODUCT_OPTIONS table

You could have a design like:

+---------------+     +---------------+
| PRODUCTS      |-----< PRODUCT_SKUS  |
+---------------+     +---------------+
| #product_id   |     | #product_id   |
|  product_name |     | #sku_id       |
+---------------+     |  sku          |
        |             |  price        |
        |             +---------------+
        |                     |
+-------^-------+      +------^------+
| OPTIONS       |------< SKU_VALUES  |
+---------------+      +-------------+
| #product_id   |      | #product_id |
| #option_id    |      | #sku_id     |
|  option_name  |      | #option_id  |
+---------------+      |  value_id   |
        |              +------v------+
+-------^-------+             |
| OPTION_VALUES |-------------+
+---------------+
| #product_id   |
| #option_id    |
| #value_id     |
|  value_name   |
+---------------+

With the Primary, Unique and Foreign Keys:

  • PRODUCTS
    • PK: product_id
    • UK: product_name
  • OPTIONS
    • PK: product_id, option_id
    • UK: product_id, option_name
  • OPTION_VALUES
    • PK: product_id, option_id, value_id
    • UK: product_id, option_id, value_name
    • FK: product-id, option_id REFERENCES OPTIONS (product_id, option_id)
  • PRODUCT_SKUS
    • PK: product_id, sku_id
    • UK: sku_id
    • FK: product_id REFERENCES PRODUCTS (product_id)
  • SKU_VALUES
    • PK: product_id, sku_id, option_id
    • FK: product_id, sku_id REFERENCES PRODUCT_SKUS (product_id, sku_id)
    • FK: product_id, option_id REFERENCES OPTIONS (product_id, option_id)
    • FK: product_id, option_id, value_id REFERENCES OPTION_VALUES (product_id, option_id, value_id)

Based on the spreadsheet of how you see your data you could enter data in these tables as follows:

PRODUCTS
========
product_id product_name
---------- ------------
1          Widget 1
2          Widget 2
3          Widget 3
 
OPTIONS
=======
product_id option_id option_name
---------- --------- -----------
1          1         Size        (Widget 1)
1          2         Color       (Widget 1)
2          1         Size        (Widget 2)
3          1         Class       (Widget 3)
3          2         Size        (Widget 3)
 
OPTION_VALUES
=============
product_id option_id value_id value_name
---------- --------- -------- ------------
1          1         1        Small        (Widget1; Size)
1          1         2        Large        (Widget1; Size)
1          2         1        White        (Widget1; Color)
1          2         2        Black        (Widget1; Color)
2          1         1        Small        (Widget2; Size)
2          1         2        Medium       (Widget2; Size)
3          1         1        Amateur      (Widget3; Class)
3          1         2        Professional (Widget3; Class)
3          2         1        Medium       (Widget3; Size)
3          2         2        Large        (Widget3; Size)
 
PRODUCT_SKUS
============
product_id sku_id sku
---------- ------ ------
1          1      W1SSCW (Widget 1)
1          2      W1SSCB (Widget 1)
1          3      W1SLCW (Widget 1)
1          4      W1SLCB (Widget 1)
2          1      W2SS   (Widget 2)
2          2      W2SM   (Widget 2)
3          1      W3CASM (Widget 3)
3          2      W3CASL (Widget 3)
3          3      W3CPSM (Widget 3)
3          4      W3CPSL (Widget 3)
 
SKU_VALUES
==========
product_id sku_id option_id value_id
---------- ------ --------- --------
1          1      1         1        (W1SSCW; Size; Small)
1          1      2         1        (W1SSCW; Color; White)
1          2      1         1        (W1SSCB; Size; Small)
1          2      2         2        (W1SSCB; Color; Black)
1          3      1         2        (W1SLCW; Size; Large)
1          3      2         1        (W1SLCW; Color; White)
1          4      1         2        (W1SLCB; Size; Large)
1          4      2         2        (W1SLCB; Color; Black)
2          1      1         1        (W2SS; Size; Small)
2          2      1         2        (W2SM; Size; Medium)
3          1      1         1        (W3CASM; Class; Amateur)
3          1      2         1        (W3CASM; Size; Medium)
3          2      1         1        (W3CASL; Class; Amateur)
3          2      2         2        (W3CASL; Size; Large)
3          3      1         2        (W3CPSM; Class; Professional)
3          3      2         1        (W3CPSM; Size; Medium)
3          4      1         2        (W3CPSL; Class; Professional)
3          4      2         2        (W3CPSL; Size; Large)