GROUP BY clause in MySQL and PostgreSQL: why does PostgreSQL raise an error?

Suppose I have a table named the_table with this structure:

PostgreSQL:

 create table the_table (col3 SERIAL, col2 varchar, col1 varchar, PRIMARY KEY(col3));

MySQL:

create table the_table ( col3 INT NOT NULL AUTO_INCREMENT PRIMARY KEY, col2 varchar(20), col1 varchar(20) );

Then I inserted these rows into the table:

INSERT INTO the_table (col2,col1) VALUES 
('x','a'),
('x','b'),
('y','c'),
('y','d'),
('z','e'),
('z','f');

Now the table looks like this:

col3 | col2 | col1 
------+------+------
    1 | x    | a
    2 | x    | b
    3 | y    | c
    4 | y    | d
    5 | z    | e
    6 | z    | f

When I run this query:

select * from the_table group by col2

then in MySQL I get:

1 x a
3 y c
5 z e

and in PostgreSQL I get an error:

ERROR:  column "the_table.col3" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select * from the_table group by col2;

My Questions:

What does this error mean? What is an aggregate function?

If it works in MySQL, why doesn't it work in PostgreSQL?


Solution 1:

You need to use an aggregate function:

Aggregate functions compute a single result from a set of input values.

SELECT col2, MIN(col3) AS col3, MIN(col1) AS col1
FROM the_table 
GROUP BY col2;

db<>fiddle demo
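
If the goal in PostgreSQL is to reproduce what MySQL returned here, that is, one whole row per col2 value, note that taking MIN(col3) and MIN(col1) separately can in general combine values that come from different rows. A minimal sketch using PostgreSQL's DISTINCT ON, assuming that "the row to keep" means the one with the lowest col3 in each group (that choice is my assumption, not something the question states):

```sql
-- PostgreSQL only: keep exactly one row per col2 value.
-- The ORDER BY decides which row is kept; here it is the one with the
-- lowest col3 in each group (an assumption about the intended result).
SELECT DISTINCT ON (col2) *
FROM the_table
ORDER BY col2, col3;
```

With the sample data above this returns the rows (1, x, a), (3, y, c) and (5, z, e), and every returned row is an actual row of the table.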


MySQL Handling of GROUP BY:

In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause

and:

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate

So in MySQL, without an explicit aggregate function, you may end up with nondeterministic values. I strongly suggest using an explicit aggregate function.


EDIT:

From MySQL Handling of GROUP BY:

SQL92 and earlier does not permit queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are not named in the GROUP BY clause.

SQL99 and later permits such nonaggregates per optional feature T301 if they are functionally dependent on GROUP BY columns: If such a relationship exists between name and custid, the query is legal. This would be the case, for example, were custid a primary key of customers.

Example:

SELECT o.custid, c.name, MAX(o.payment)
FROM orders AS o
JOIN customers AS c
  ON o.custid = c.custid
GROUP BY o.custid;
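
The same functional-dependence rule can be tried on the_table from the question. As a sketch, assuming PostgreSQL 9.1 or later (which, as far as I know, recognizes dependence on a primary key that is listed in GROUP BY):

```sql
-- col3 is the primary key, so col2 and col1 are functionally dependent
-- on it and may appear in the select list without an aggregate.
SELECT col3, col2, col1
FROM the_table
GROUP BY col3;

-- col2 is not a key, so col3 and col1 are not determined by it and
-- this form is still rejected by PostgreSQL:
-- SELECT * FROM the_table GROUP BY col2;
```

Grouping by the primary key leaves every row in its own group, so this mainly illustrates why the original query is rejected rather than offering a replacement for it.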

Solution 2:

An addition on the MySQL side: the original query no longer works by default from MySQL 5.7 onwards, because the ONLY_FULL_GROUP_BY SQL mode is enabled by default there (since 5.7.5).

You can use the ANY_VALUE() function, as described in the MySQL documentation.

Source: https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_any-value

Example:

SELECT MIN(col1), col2, ANY_VALUE(col3) FROM the_table GROUP BY col2
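
To check whether a given MySQL session enforces this behavior, the active SQL modes can be inspected. A rough sketch, assuming MySQL 5.7 or later (where ANY_VALUE() is available) and the_table from the question:

```sql
-- Show the SQL modes active for the current session; look for
-- ONLY_FULL_GROUP_BY in the returned list.
SELECT @@SESSION.sql_mode;

-- With ONLY_FULL_GROUP_BY active, the original query is rejected
-- (error 1055), much like in PostgreSQL:
-- SELECT * FROM the_table GROUP BY col2;

-- Wrapping each non-grouped column in ANY_VALUE() is accepted, but
-- which value is picked from each group is not determined.
SELECT col2, ANY_VALUE(col3) AS col3, ANY_VALUE(col1) AS col1
FROM the_table
GROUP BY col2;
```

ANY_VALUE() only silences the check; if a specific row or value is needed, an explicit aggregate such as MIN() is the safer choice.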