Aggregate columns with additional (distinct) filters
This code works as expected, but it's long and creepy.
select p.name, p.played, w.won, l.lost from
  (select users.name, count(games.name) as played
   from users
   inner join games on games.player_1_id = users.id
   where games.winner_id > 0
   group by users.name
   union
   select users.name, count(games.name) as played
   from users
   inner join games on games.player_2_id = users.id
   where games.winner_id > 0
   group by users.name) as p
inner join
  (select users.name, count(games.name) as won
   from users
   inner join games on games.player_1_id = users.id
   where games.winner_id = users.id
   group by users.name
   union
   select users.name, count(games.name) as won
   from users
   inner join games on games.player_2_id = users.id
   where games.winner_id = users.id
   group by users.name) as w on p.name = w.name
inner join
  (select users.name, count(games.name) as lost
   from users
   inner join games on games.player_1_id = users.id
   where games.winner_id != users.id
   group by users.name
   union
   select users.name, count(games.name) as lost
   from users
   inner join games on games.player_2_id = users.id
   where games.winner_id != users.id
   group by users.name) as l on l.name = p.name
As you can see, it consists of 3 repetitive parts for retrieving:
- player name and the number of games they played
- player name and the number of games they won
- player name and the number of games they lost
And each of those also consists of 2 parts:
- player name and the number of games in which they participated as player_1
- player name and the number of games in which they participated as player_2
How could this be simplified?
The result looks like so:
   name   | played | won | lost
----------+--------+-----+------
 player_a |      5 |   2 |    3
 player_b |      3 |   2 |    1
 player_c |      2 |   1 |    1
Solution 1:
The aggregate FILTER clause in Postgres 9.4 or newer is shorter and faster:
SELECT u.name
     , count(*) FILTER (WHERE g.winner_id > 0)     AS played
     , count(*) FILTER (WHERE g.winner_id = u.id)  AS won
     , count(*) FILTER (WHERE g.winner_id <> u.id) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
GROUP  BY u.name;
- The manual
- Postgres Wiki
- Depesz blog post
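To try the FILTER query without creating tables, here is a minimal sketch that feeds it from two CTEs. The table layout and the five sample games are assumptions (the question does not show its schema or data); they are chosen only so that the output matches the result table in the question. The single join works because each game row is paired once with each of its two participants.

```sql
-- Hypothetical sample data, inlined as CTEs so nothing has to be created.
WITH users(id, name) AS (
   VALUES (1, 'player_a'), (2, 'player_b'), (3, 'player_c')
)
, games(name, player_1_id, player_2_id, winner_id) AS (
   VALUES ('g1', 1, 2, 1)
        , ('g2', 2, 1, 2)
        , ('g3', 1, 2, 2)
        , ('g4', 1, 3, 1)
        , ('g5', 3, 1, 3)
)
SELECT u.name
     , count(*) FILTER (WHERE g.winner_id > 0)     AS played  -- decided games
     , count(*) FILTER (WHERE g.winner_id = u.id)  AS won
     , count(*) FILTER (WHERE g.winner_id <> u.id) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)  -- one row per participant
GROUP  BY u.name
ORDER  BY u.name;  -- added here only to make the output order deterministic

--    name   | played | won | lost
-- ----------+--------+-----+------
--  player_a |      5 |   2 |    3
--  player_b |      3 |   2 |    1
--  player_c |      2 |   1 |    1
```

Note that grouping by u.name merges users who share a name; if names are not unique, grouping by u.id as well would keep them apart.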
In Postgres 9.3 (or any version) this is still shorter and faster than nested sub-selects or CASE expressions:
SELECT u.name
     , count(g.winner_id > 0 OR NULL)     AS played
     , count(g.winner_id = u.id OR NULL)  AS won
     , count(g.winner_id <> u.id OR NULL) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
GROUP  BY u.name;
Details:
- For absolute performance, is SUM faster or COUNT?
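The count(... OR NULL) variant relies on two facts: count(expression) only counts non-null values, and in SQL's three-valued logic only true OR NULL yields true, while false OR NULL and NULL OR NULL both yield NULL. A small stand-alone illustration (the derived table t and its column x are made up for the demo):

```sql
SELECT count(*)         AS all_rows  -- 3: counts every row
     , count(x)         AS non_null  -- 2: true and false are both non-null
     , count(x OR NULL) AS is_true   -- 1: only the true row survives
FROM  (VALUES (true), (false), (NULL::boolean)) AS t(x);
```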
Solution 2:
This is a case where correlated subqueries may simplify the logic:
select u.*, (played - won) as lost
from (select u.*,
             (select count(*)
              from games g
              where g.player_1_id = u.id or g.player_2_id = u.id
             ) as played,
             (select count(*)
              from games g
              where g.winner_id = u.id
             ) as won
      from users u
     ) u;
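This assumes that there are no ties: every game a player took part in must have a winner, otherwise played - won overstates the number of losses.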
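The effect of a tie can be seen in a small sketch. How the original schema records a tie is not stated; storing it as a NULL winner_id is an assumption made here, as are the two sample games. The extra subquery counts losses directly instead of deriving them by subtraction.

```sql
-- Hypothetical data: one decided game and one tie (winner_id IS NULL).
WITH users(id, name) AS (
   VALUES (1, 'player_a'), (2, 'player_b')
)
, games(name, player_1_id, player_2_id, winner_id) AS (
   VALUES ('g1', 1, 2, 1)
        , ('g2', 1, 2, NULL::int)
)
select u.name, played, won,
       played - won as lost_by_subtraction,
       lost         as lost_counted
from (select u.*,
             (select count(*)
              from games g
              where g.player_1_id = u.id or g.player_2_id = u.id
             ) as played,
             (select count(*)
              from games g
              where g.winner_id = u.id
             ) as won,
             (select count(*)
              from games g
              where (g.player_1_id = u.id or g.player_2_id = u.id)
                and g.winner_id <> u.id   -- NULL winner: comparison is NULL, row not counted
             ) as lost
      from users u
     ) u
order by name;

--    name   | played | won | lost_by_subtraction | lost_counted
-- ----------+--------+-----+---------------------+--------------
--  player_a |      2 |   1 |                   1 |            0
--  player_b |      2 |   0 |                   2 |            1
```

With the tie in place, subtraction charges both players with a loss they did not have, while the explicit count does not.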
Solution 3:
select users.name,
       count(case when games.winner_id > 0
                  then games.name
                  else null end) as played,
       count(case when games.winner_id = users.id
                  then games.name
                  else null end) as won,
       count(case when games.winner_id != users.id
                  then games.name
                  else null end) as lost
from users inner join games
     on games.player_1_id = users.id or games.player_2_id = users.id
group by users.name;