MySQL groupwise MAX() returns unexpected results
TABLE: LOAN
Loan_no Amount SSS_no Loan_date
7 700.00 0104849222 2010-01-03
8 200.00 0104849222 2010-02-28
9 300.00 0119611199 2010-11-18
10 150.00 3317131410 2012-11-28
11 600.00 0104849222 2011-01-03
14 175.00 3317131410 2012-12-05
15 260.00 3317131410 2013-02-08
16 230.00 0104849222 2013-03-06
17 265.00 0119611199 2011-04-30
18 455.00 3317131410 2013-03-10
DESIRED RESULTS:
I would want to retrieve the latest loan availed off by each person (identified by their SSS number). The results should be as follows:
Loan_no Amount SSS_no Loan_date
16 230.00 0104849222 2013-03-06
17 265.00 0119611199 2011-04-30
18 455.00 3317131410 2013-03-10
QUERY # 1 USED:
SELECT *
FROM loan
GROUP BY SSS_no
ORDER BY Loan_date DESC
MYSQL RESULT
Loan_no Amount SSS_no Loan_date
10 150.00 3317131410 2012-11-28
9 300.00 0119611199 2010-11-18
7 700.00 0104849222 2010-01-03
QUERY # 2 USED:
SELECT Loan_no, Amount, SSS_no, max(Loan_date)
FROM loan
GROUP BY SSS_no
MYSQL RESULT
Loan_no Amount SSS_no Loan_date
7 700.00 0104849222 2013-03-06
9 300.00 0119611199 2011-04-30
10 150.00 3317131410 2013-03-10
Can anybody help me with my problem? Thanks.
Solution 1:
Try this instead:
SELECT l1.*
FROM loan AS l1
INNER JOIN
(
SELECT SSS_no, MAX(Loan_date) LatestDate
FROM loan
GROUP BY SSS_no
) AS l2 ON l1.SSS_no = l2.SSS_no
AND l1.loan_date = l2.LatestDate;
SQL Fiddle Demo
This will give you:
| LOAN_NO | AMOUNT | SSS_NO | LOAN_DATE |
----------------------------------------------
| 16 | 230 | 104849222 | 2013-03-06 |
| 17 | 265 | 119611199 | 2011-04-30 |
| 18 | 455 | 3317131410 | 2013-03-10 |
Solution 2:
The reason why you are getting unexpected results is because you are using a GROUP BY
on only one column in the SELECT
list and you are not using any aggregate functions on all the columns.
MySQL uses an extension to the GROUP BY
function which can cause unexpected results when you do not GROUP BY
or aggregate all items in the SELECT
list. (see MySQL Extensions to GROUP BY)
From the MySQL Docs:
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. ... You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses.
The only way that you can be sure to return the correct result is to alter your query to aggregate and GROUP BY
properly.
So you can use something similar to this:
select l1.loan_no,
l1.amount,
l1.SSS_no,
l1.loan_date
from loan l1
inner join
(
select SSS_no, max(loan_date) Loan_date
from loan
group by SSS_no
) l2
on l1.SSS_no = l2.SSS_no
and l1.loan_date = l2.loan_date
See SQL Fiddle with Demo
This implements a subquery to get the max(loan_date)
for each SSS_no
. This subquery is then joined back to your table on both the SSS_no
and the max loan_date
which will guarantee that you get the correct result for each SSS_no
.