Sum of averages vs average of sums

The average of the entries in a column is the sum of the entries in that column, divided by the number of entries. The number of entries is the number of rows. So the sum of the averages is the sum of all the entries in the table, divided by the number of rows.

The average of the row sums is the sum of all entries in the table divided by the number of rows, so you should get the same number either way.


Generally no is correct, it is only the same in specific cases.

ie. Generally

Sum(x) / Sum(y) not equal to Sum (x/y) / n

where n is the total entries x is row entries and y is column entries.

only true if all of the y's are equal

eg: (1/2 + 3/5)/2 = 11/20 (1+3)/(2+5) = 4/7

Where as if y is equal (1/7 + 4/7)/2 = 5/14 (1+4)/(7+7) = 5/14

PS Sorry about posting on dead thread just want it to be right for anyone else looking.


Sigh!! Found my problem -- it was a stupid "dupe error" in my code. I was looking for an error in the "average of sums" logic, but it was in the "sum of averages" logic -- referencing the wrong variable.

Well, anyway, we've demonstrated about 5 ways from Sunday that the sum of averages really IS equal to the average of sums, in case that's important to anyone in the future.


Actually Steve could be correct. I'll give you a simple example and then explain why intelligent people can come up with different answers because in a way, they're both "right."

First row: 5; 6; Second row: 1; 2; Third row: 3; 4;

If you do either the sum of the averages or average of the sums as Daniel asked, then you'll get 7 as the answer. If however, you remove the 1 leaving a hole in your table, then your average of the sums drops to 6 2/3 and your sum of the averages increases to 8.

If your table of data has blanks or missing data points, then the two are almost never the same. If the table of data is equally/evenly distributed without any missing points or holes in the table, then they should always be the same. Anybody can test this out with MS Excel and the =RAND() function. Generate a table with any number of rows/columns and fill in the rows and columns with random numbers or let it generate random numbers for you. Then use =AVERAGE() to average the columns and =SUM() to add up the averages. Then reverse the process and use =SUM() to add the rows and =AVERAGE() to average the sums. If the table is complete, then the two numbers will be precisely the same. If however, your data for any reason is missing entries, then it can vary by a great percentage. Just start deleting data points in the middle of the table and watch the two results greatly fluctuate.

Also of notes is if you flip the rows and columns then you get completely different results, so make sure you're consistent. If you average the rows in the above example and sum the averages, or sum the columns and average the sums, then you get 10.5 with a complete table and 11 and 10, respectively with the 1 missing.


There seems to be some confusion on the correct answer, and as this comes up on the first page of Google I thought it worth throwing in my 2 cents in an attempt to clarify.

As a number of people have already said the "sum of averages is equal to the average of sums" (Google that if you want more info). The Linear Combination of Random Variables Theory clearly shows this as well. A bit of math also shows that irrelevant to which way you calculate the final average your doing the same thing. Which is to add all the scores up and divide by how many there are.

The confusion is coming in because the multiplication of averages does not always equal the average of the multiplications, and that was one of the examples someone (other than the initial poster) used. Once you start multiplying any of the inputs by anything other than a constant you expect different answers. Or to put it another way we don't expect the sum of products to equal the product of sums.

Someone also mentioned that having blanks 'breaks' this rule. I'd agree with this too.