How can I SELECT rows with MAX(Column value), PARTITION by another column in MYSQL?
My table is:
id | home | datetime | player | resource |
---|---|---|---|---|
1 | 10 | 04/03/2009 | john | 399 |
2 | 11 | 04/03/2009 | juliet | 244 |
5 | 12 | 04/03/2009 | borat | 555 |
3 | 10 | 03/03/2009 | john | 300 |
4 | 11 | 03/03/2009 | juliet | 200 |
6 | 12 | 03/03/2009 | borat | 500 |
7 | 13 | 24/12/2008 | borat | 600 |
8 | 13 | 01/01/2009 | borat | 700 |
I need to select each distinct home
holding the maximum value of datetime
.
Result would be:
id | home | datetime | player | resource |
---|---|---|---|---|
1 | 10 | 04/03/2009 | john | 399 |
2 | 11 | 04/03/2009 | juliet | 244 |
5 | 12 | 04/03/2009 | borat | 555 |
8 | 13 | 01/01/2009 | borat | 700 |
I have tried:
-- 1 ..by the MySQL manual:
SELECT DISTINCT
home,
id,
datetime AS dt,
player,
resource
FROM topten t1
WHERE datetime = (SELECT
MAX(t2.datetime)
FROM topten t2
GROUP BY home)
GROUP BY datetime
ORDER BY datetime DESC
Doesn't work. Result-set has 130 rows although database holds 187.
Result includes some duplicates of home
.
-- 2 ..join
SELECT
s1.id,
s1.home,
s1.datetime,
s1.player,
s1.resource
FROM topten s1
JOIN (SELECT
id,
MAX(datetime) AS dt
FROM topten
GROUP BY id) AS s2
ON s1.id = s2.id
ORDER BY datetime
Nope. Gives all the records.
-- 3 ..something exotic:
With various results.
You are so close! All you need to do is select BOTH the home and its max date time, then join back to the topten
table on BOTH fields:
SELECT tt.*
FROM topten tt
INNER JOIN
(SELECT home, MAX(datetime) AS MaxDateTime
FROM topten
GROUP BY home) groupedtt
ON tt.home = groupedtt.home
AND tt.datetime = groupedtt.MaxDateTime
The fastest MySQL
solution, without inner queries and without GROUP BY
:
SELECT m.* -- get the row that contains the max value
FROM topten m -- "m" from "max"
LEFT JOIN topten b -- "b" from "bigger"
ON m.home = b.home -- match "max" row with "bigger" row by `home`
AND m.datetime < b.datetime -- want "bigger" than "max"
WHERE b.datetime IS NULL -- keep only if there is no bigger than max
Explanation:
Join the table with itself using the home
column. The use of LEFT JOIN
ensures all the rows from table m
appear in the result set. Those that don't have a match in table b
will have NULL
s for the columns of b
.
The other condition on the JOIN
asks to match only the rows from b
that have bigger value on the datetime
column than the row from m
.
Using the data posted in the question, the LEFT JOIN
will produce this pairs:
+------------------------------------------+--------------------------------+
| the row from `m` | the matching row from `b` |
|------------------------------------------|--------------------------------|
| id home datetime player resource | id home datetime ... |
|----|-----|------------|--------|---------|------|------|------------|-----|
| 1 | 10 | 04/03/2009 | john | 399 | NULL | NULL | NULL | ... | *
| 2 | 11 | 04/03/2009 | juliet | 244 | NULL | NULL | NULL | ... | *
| 5 | 12 | 04/03/2009 | borat | 555 | NULL | NULL | NULL | ... | *
| 3 | 10 | 03/03/2009 | john | 300 | 1 | 10 | 04/03/2009 | ... |
| 4 | 11 | 03/03/2009 | juliet | 200 | 2 | 11 | 04/03/2009 | ... |
| 6 | 12 | 03/03/2009 | borat | 500 | 5 | 12 | 04/03/2009 | ... |
| 7 | 13 | 24/12/2008 | borat | 600 | 8 | 13 | 01/01/2009 | ... |
| 8 | 13 | 01/01/2009 | borat | 700 | NULL | NULL | NULL | ... | *
+------------------------------------------+--------------------------------+
Finally, the WHERE
clause keeps only the pairs that have NULL
s in the columns of b
(they are marked with *
in the table above); this means, due to the second condition from the JOIN
clause, the row selected from m
has the biggest value in column datetime
.
Read the SQL Antipatterns: Avoiding the Pitfalls of Database Programming book for other SQL tips.
Here goes T-SQL version:
-- Test data
DECLARE @TestTable TABLE (id INT, home INT, date DATETIME,
player VARCHAR(20), resource INT)
INSERT INTO @TestTable
SELECT 1, 10, '2009-03-04', 'john', 399 UNION
SELECT 2, 11, '2009-03-04', 'juliet', 244 UNION
SELECT 5, 12, '2009-03-04', 'borat', 555 UNION
SELECT 3, 10, '2009-03-03', 'john', 300 UNION
SELECT 4, 11, '2009-03-03', 'juliet', 200 UNION
SELECT 6, 12, '2009-03-03', 'borat', 500 UNION
SELECT 7, 13, '2008-12-24', 'borat', 600 UNION
SELECT 8, 13, '2009-01-01', 'borat', 700
-- Answer
SELECT id, home, date, player, resource
FROM (SELECT id, home, date, player, resource,
RANK() OVER (PARTITION BY home ORDER BY date DESC) N
FROM @TestTable
)M WHERE N = 1
-- and if you really want only home with max date
SELECT T.id, T.home, T.date, T.player, T.resource
FROM @TestTable T
INNER JOIN
( SELECT TI.id, TI.home, TI.date,
RANK() OVER (PARTITION BY TI.home ORDER BY TI.date) N
FROM @TestTable TI
WHERE TI.date IN (SELECT MAX(TM.date) FROM @TestTable TM)
)TJ ON TJ.N = 1 AND T.id = TJ.id
EDIT
Unfortunately, there are no RANK() OVER function in MySQL.
But it can be emulated, see Emulating Analytic (AKA Ranking) Functions with MySQL.
So this is MySQL version:
SELECT id, home, date, player, resource
FROM TestTable AS t1
WHERE
(SELECT COUNT(*)
FROM TestTable AS t2
WHERE t2.home = t1.home AND t2.date > t1.date
) = 0
This will work even if you have two or more rows for each home
with equal DATETIME
's:
SELECT id, home, datetime, player, resource
FROM (
SELECT (
SELECT id
FROM topten ti
WHERE ti.home = t1.home
ORDER BY
ti.datetime DESC
LIMIT 1
) lid
FROM (
SELECT DISTINCT home
FROM topten
) t1
) ro, topten t2
WHERE t2.id = ro.lid