Why do we need GROUP BY with AGGREGATE FUNCTIONS?

I saw an example where there was a list (table) of employees with their respective monthly salaries. I did a sum of the salaries and saw the exact same table in the output. That was strange.

Here is what has to be done - we have to find out how much money we pay this month as employee salaries. For that, we need to sum their salary amounts in the database as shown:

SELECT EmployeeID, SUM(MonthlySalary)
FROM Employee
GROUP BY EmployeeID

I know that I get an error if I don't use GROUP BY in the above code. This is what I don't understand.

We are selecting EmployeeID from the Employee table. SUM() is being told that it has to add the MonthlySalary column, from the Employee table. So, it should directly go and add those numbers up instead of grouping them and then adding them.

That's how a person would do it: look at the employee table and add all the numbers. Why would they take the trouble to group them and then add them up?


It might be easier if you think of GROUP BY as "for each" for the sake of explanation. The query below:

SELECT empid, SUM (MonthlySalary) 
FROM Employee
GROUP BY EmpID

is saying:

"Give me the sum of the MonthlySalary values for each empid"

So if your table looked like this:

+-----+-------------+
|empid|MonthlySalary|
+-----+-------------+
|1    |200          |
+-----+-------------+
|2    |300          |
+-----+-------------+

result:

+-+---+
|1|200|
+-+---+
|2|300|
+-+---+

SUM wouldn't appear to do anything here, because the sum of a single number is that number. On the other hand, if the table looked like this:

+-----+-------------+
|empid|MonthlySalary|
+-----+-------------+
|1    |200          |
+-----+-------------+
|1    |300          |
+-----+-------------+
|2    |300          |
+-----+-------------+

result:

+-+---+
|1|500|
+-+---+
|2|300|
+-+---+

Then SUM would have a visible effect, because there are two rows with empid 1 to add together. Not sure if this explanation helps or not, but I hope it makes things a little clearer.
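If you want to try the "for each" reading yourself, here is a minimal sketch. It assumes Python's built-in sqlite3 module and an in-memory table holding the same three rows as the second table above; the Python wrapper is only there to make the query runnable and is not part of the original question.

```python
import sqlite3

# In-memory database with the three rows from the second example table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (empid INTEGER, MonthlySalary INTEGER)")
conn.executemany(
    "INSERT INTO Employee (empid, MonthlySalary) VALUES (?, ?)",
    [(1, 200), (1, 300), (2, 300)],
)

# One output row per distinct empid; SUM adds up the rows inside each group.
rows = conn.execute(
    "SELECT empid, SUM(MonthlySalary) FROM Employee GROUP BY empid ORDER BY empid"
).fetchall()
print(rows)  # [(1, 500), (2, 300)]
```

The two rows for empid 1 collapse into a single row with 500, while empid 2 has only one row and so keeps its 300.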


If you wanted to add up all the numbers you would not have a GROUP BY:


SELECT SUM(MonthlySalary) AS TotalSalary
FROM Employee

+-----------+
|TotalSalary|
+-----------+
|777400     |
+-----------+

The point of the GROUP BY is that you get a separate total for each employee.

+--------+------+
|Employee|Salary|
+--------+------+
|John    |123400|
+--------+------+
|Frank   |413000|
+--------+------+
|Bill    |241000|
+--------+------+
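To see both queries side by side, here is a rough sketch using Python's built-in sqlite3 module. The Name column and the individual rows are made up for illustration (the per-employee totals above are split into hypothetical rows so that GROUP BY has something to combine); only the totals themselves come from the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the thread does not show how names are stored.
conn.execute("CREATE TABLE Employee (Name TEXT, MonthlySalary INTEGER)")
conn.executemany(
    "INSERT INTO Employee (Name, MonthlySalary) VALUES (?, ?)",
    [
        # Made-up rows that add up to the per-employee totals shown above.
        ("John", 61700), ("John", 61700),
        ("Frank", 200000), ("Frank", 213000),
        ("Bill", 241000),
    ],
)

# No GROUP BY: all rows form a single group, so one grand total comes back.
total = conn.execute("SELECT SUM(MonthlySalary) FROM Employee").fetchone()[0]

# GROUP BY Name: one group, and therefore one total, per distinct name.
per_employee = dict(
    conn.execute("SELECT Name, SUM(MonthlySalary) FROM Employee GROUP BY Name")
)

print(total)
print(per_employee)
```

With these rows the first query gives 777400 and the second gives 123400 for John, 413000 for Frank and 241000 for Bill, matching the two result tables.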