Horizontally expanded table created dynamically instead of many joins for performance gain
Consider a products table, where some info like TITLE and BRAND are kept in a huge "translations" table in multiple languages.
tblTranslations
ROW_ID
COL_NAME
LANG
VALUE
Since which languages will be used is dynamic, keeping titles for desired languages in the main table like TITLE_EN, TITLE_FR is not an option. So we have complicated queries like below:
SELECT
P.ID,
(...)
T1.VALUE AS TITLE,
T2.VALUE AS BRAND
FROM
tblProducts P
-- first join for TITLEs in the selected language
LEFT JOIN
tblTranslations T1 ON T1.ROW_ID = P.ID
AND T1.COL_NAME = 'TITLE'
AND T1.LANG = '{$selectedLang}'
-- second join for BRANDs
LEFT JOIN
tblTranslations T2 ON T2.ROW_ID = P.ID
AND T2.COL_NAME = 'BRAND'
AND T2.LANG = '{$selectedLang}'
WHERE (...)
This was an overly simplified example, our real life queries have many other tables joined for dynamic attributes etc., and this makes our websites start crawling on the ground.
My question: is it a better approach to dynamically create a table and use it for SELECTs only? This table will be updated when the main data is updated, and would be re-created when a new language is added.
tblProductsDynamic
ID
TITLE_EN
TITLE_FR
BRAND_EN
BRAND_FR
(...)
SELECT
ID,
TITLE_{$selectedLang} AS TITLE,
BRAND_{$selectedLang} AS BRAND
FROM
tblProductsDynamic
Will this horizontally expanded table give more performance since it lacks all the tiresome joins?
Obviously, pre-computing values is going to be a performance win when you query the data. You have to balance that against the cost:
- Maintaining the unpivoted table is going to be somewhat expensive.
- It is going to be especially expensive when a new language is added.
- You may have data integrity problems, caused by lags in the construction of the summary table.
- Maintaining the triggers, stored procedures or whatever for the summary table complicates the code base.
That said, your left join
s should not be particularly expensive with the right indexes. I would first want to investigate a solution using the base tables you have described. Only then would I think about options for summarizing the data.