MySQL groupwise MAX() returns unexpected results

TABLE: LOAN

Loan_no     Amount        SSS_no              Loan_date

7           700.00        0104849222          2010-01-03 
8           200.00        0104849222          2010-02-28
9           300.00        0119611199          2010-11-18
10          150.00        3317131410          2012-11-28
11          600.00        0104849222          2011-01-03
14          175.00        3317131410          2012-12-05
15          260.00        3317131410          2013-02-08
16          230.00        0104849222          2013-03-06
17          265.00        0119611199          2011-04-30
18          455.00        3317131410          2013-03-10

DESIRED RESULTS:

I would want to retrieve the latest loan availed off by each person (identified by their SSS number). The results should be as follows:

Loan_no           Amount                SSS_no                Loan_date

16                230.00              0104849222              2013-03-06
17                265.00              0119611199              2011-04-30
18                455.00              3317131410              2013-03-10

QUERY # 1 USED:

SELECT * FROM loan GROUP BY SSS_no ORDER BY Loan_date DESC

MYSQL RESULT

Loan_no             Amount              SSS_no                  Loan_date

10                  150.00            3317131410                2012-11-28
9                   300.00            0119611199                2010-11-18
7                   700.00            0104849222                2010-01-03

QUERY # 2 USED:

SELECT Loan_no, Amount, SSS_no, max(Loan_date) FROM loan GROUP BY SSS_no

MYSQL RESULT

Loan_no            Amount                SSS_no                Loan_date

7                  700.00                0104849222            2013-03-06
9                  300.00                0119611199            2011-04-30
10                 150.00                3317131410            2013-03-10

Can anybody help me with my problem? Thanks.


Solution 1:

Try this instead:

SELECT l1.*
FROM loan AS l1
INNER JOIN
(
   SELECT SSS_no, MAX(Loan_date) LatestDate
   FROM loan
   GROUP BY SSS_no
) AS l2  ON l1.SSS_no    = l2.SSS_no 
        AND l1.loan_date = l2.LatestDate;

SQL Fiddle Demo

This will give you:

| LOAN_NO | AMOUNT |     SSS_NO |  LOAN_DATE |
----------------------------------------------
|      16 |    230 |  104849222 | 2013-03-06 |
|      17 |    265 |  119611199 | 2011-04-30 |
|      18 |    455 | 3317131410 | 2013-03-10 |

Solution 2:

The reason why you are getting unexpected results is because you are using a GROUP BY on only one column in the SELECT list and you are not using any aggregate functions on all the columns.

MySQL uses an extension to the GROUP BY function which can cause unexpected results when you do not GROUP BY or aggregate all items in the SELECT list. (see MySQL Extensions to GROUP BY)

From the MySQL Docs:

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. ... You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses.

The only way that you can be sure to return the correct result is to alter your query to aggregate and GROUP BY properly.

So you can use something similar to this:

select l1.loan_no,
  l1.amount,
  l1.SSS_no,
  l1.loan_date
from loan l1
inner join
(
  select SSS_no, max(loan_date) Loan_date
  from loan
  group by SSS_no
) l2
  on l1.SSS_no = l2.SSS_no
  and l1.loan_date = l2.loan_date

See SQL Fiddle with Demo

This implements a subquery to get the max(loan_date) for each SSS_no. This subquery is then joined back to your table on both the SSS_no and the max loan_date which will guarantee that you get the correct result for each SSS_no.