Sum results of a few queries and then find top 5 in SQL
I have 3 queries:
table: pageview
SELECT event_id, count(*) AS pageviews
FROM pageview
GROUP BY event_id
ORDER BY pageviews DESC, rand()
LIMIT 1000
table: upvote
SELECT event_id, count(*) AS upvotes
FROM upvote
GROUP BY event_id
ORDER BY upvotes DESC, rand()
LIMIT 1000
table: attending
SELECT event_id, count(*) AS attendants
FROM attending
GROUP BY event_id
ORDER BY attendants DESC, rand()
LIMIT 1000
I'd like to combine the event_ids of all 3 queries, ordered by amount, and then choose the top 5. How do I do that?
EDIT: HERE IS WHAT I DID TO MAKE IT HAPPEN:
SELECT event_id, sum(amount) AS total
FROM (
(SELECT event_id, count(*) AS amount
FROM pageview
GROUP BY event_id
ORDER BY amount DESC, rand()
LIMIT 1000)
UNION ALL
(SELECT event_id, count(*) as amount
FROM upvote
GROUP BY event_id
ORDER BY amount DESC, rand()
LIMIT 1000)
UNION ALL
(SELECT event_id, count(*) as amount
FROM attending
GROUP BY event_id
ORDER BY amount DESC, rand()
LIMIT 1000)
) x
GROUP BY 1
ORDER BY sum(amount) DESC
LIMIT 5;
Solution 1:
To UNION the resulting rows of all three queries and then pick the 5 rows with the highest amount:
-- Each branch: top 1000 event_ids per table, ties broken randomly
(SELECT event_id, count(*) AS amount
FROM pageview
GROUP BY event_id
ORDER BY amount DESC, rand()
LIMIT 1000)
UNION ALL
(SELECT event_id, count(*) AS amount
FROM upvote
GROUP BY event_id
ORDER BY amount DESC, rand()
LIMIT 1000)
UNION ALL
(SELECT event_id, count(*) AS amount
FROM attending
GROUP BY event_id
ORDER BY amount DESC, rand()
LIMIT 1000)
-- Applies to the combined result: sort by the 2nd column (amount)
ORDER BY 2 DESC
LIMIT 5;
The manual:
To apply ORDER BY or LIMIT to an individual SELECT, place the clause inside the parentheses that enclose the SELECT.
Use UNION ALL to keep duplicates.
To add the counts for every event_id:
SELECT event_id, sum(amount) AS total
FROM (
-- Each branch: top 1000 event_ids per table, ties broken randomly
(SELECT event_id, count(*) AS amount
FROM pageview
GROUP BY event_id
ORDER BY amount DESC, rand()
LIMIT 1000)
UNION ALL
(SELECT event_id, count(*) AS amount
FROM upvote
GROUP BY event_id
ORDER BY amount DESC, rand()
LIMIT 1000)
UNION ALL
(SELECT event_id, count(*) AS amount
FROM attending
GROUP BY event_id
ORDER BY amount DESC, rand()
LIMIT 1000)
) x
-- Add up the counts per event_id across all three sources
GROUP BY 1
ORDER BY sum(amount) DESC
LIMIT 5;
The tricky part here is that not every event_id will be present in all three base queries. So take care that a JOIN does not lose rows completely and additions don't turn out NULL.
Use UNION ALL, not UNION. You don't want to remove identical rows, you want to add them up.
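A minimal sketch of why this matters, using made-up literal rows in place of the real per-table counts (the values are assumptions for illustration only). Event 1 happens to get the same count from two sources:

```sql
-- Hypothetical rows standing in for the per-table counts:
-- event 1 has amount 3 from two different sources, event 2 has amount 4.
SELECT event_id, sum(amount) AS total
FROM (
   SELECT 1 AS event_id, 3 AS amount
   UNION ALL
   SELECT 1, 3
   UNION ALL
   SELECT 2, 4
) x
GROUP BY event_id
ORDER BY total DESC;
-- UNION ALL keeps both (1, 3) rows: event 1 totals 6, event 2 totals 4.
-- With plain UNION the duplicate (1, 3) row is removed before summing,
-- so event 1 would total only 3.
```

So with plain UNION, an event whose counts coincide across tables would silently be undercounted.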
x is a table alias and shorthand for AS x. A subquery in the FROM clause is required to have a name. It can be any other name here.
The SQL feature FULL OUTER JOIN is not implemented in MySQL (last time I checked), so you have to make do with UNION. FULL OUTER JOIN would join all three base queries without losing rows.
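For comparison, here is a rough sketch of what the FULL OUTER JOIN variant could look like in a database that supports it (PostgreSQL syntax is assumed here, so this is not meant for MySQL). It leaves out the per-table ORDER BY / LIMIT 1000 for brevity, and COALESCE is what keeps the addition from turning out NULL when an event_id is missing from one of the tables:

```sql
-- PostgreSQL-style sketch, assuming FULL JOIN support; not for MySQL.
SELECT event_id
     , COALESCE(p.amount, 0)
     + COALESCE(u.amount, 0)
     + COALESCE(a.amount, 0) AS total
FROM  (SELECT event_id, count(*) AS amount FROM pageview  GROUP BY event_id) p
FULL  JOIN (SELECT event_id, count(*) AS amount FROM upvote    GROUP BY event_id) u USING (event_id)
FULL  JOIN (SELECT event_id, count(*) AS amount FROM attending GROUP BY event_id) a USING (event_id)
ORDER BY total DESC
LIMIT 5;
```

The USING (event_id) clause merges the join column, so an event_id that appears in only one of the three subqueries still shows up once in the result.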
Answer to follow-up question
SELECT event_id, sum(amount) AS total
FROM (
(SELECT event_id, count(*) / 100 AS amount
FROM pageview ... )
UNION ALL
(SELECT event_id, count(*) * 5
FROM upvote ... )
UNION ALL
(SELECT event_id, count(*) * 10
FROM attending ... )
) x
GROUP BY 1
ORDER BY sum(amount) DESC
LIMIT 5;
Or, to use the base counts in multiple ways:
SELECT event_id
,sum(CASE source
WHEN 'p' THEN amount / 100
WHEN 'u' THEN amount * 5
WHEN 'a' THEN amount * 10
ELSE 0
END) AS total
FROM (
(SELECT event_id, 'p' AS source, count(*) AS amount
FROM pageview ... )
UNION ALL
(SELECT event_id, 'u', count(*)
FROM upvote ... )
UNION ALL
(SELECT event_id, 'a', count(*)
FROM attending ... )
) x
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;