A loop that makes multi-conditional summations

The following code doesn't use any packages. Starting with Python 3.7, all dicts are insertion-ordered; the code relies on this so that the final result keeps the elements in order of first appearance. If for some reason your Python is older than 3.7, tell me and I'll modify the code to do the ordering explicitly instead of relying on this language feature.

df = [["john","2019","30.2"], ["john","2019","40"], ["john","2020","50.3"],
      ["amy","2019","60"], ["amy","2019","20"], ["amy","2020","40.1"]]

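r = {}
for *a, b in df:
    # All columns except the last form the grouping key; the last one is the value.
    a = tuple(a)
    if a not in r:
        r[a] = 0
    r[a] += float(b)
# Convert back to the original list-of-lists shape, with the sums as strings.
r = [list(k) + [str(v)] for k, v in r.items()]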

print(r)

Output:

[['john', '2019', '70.2'], ['john', '2020', '50.3'], ['amy', '2019', '80.0'], ['amy', '2020', '40.1']]
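
For a Python older than 3.7, here is a minimal sketch of how the ordering could be made explicit instead of relying on dict insertion order. The names `totals`, `order`, and `result` are my own, not from the code above; the idea is just to record each key in a list the first time it is seen.

```python
# Sample rows: [name, year, amount-as-string], same data as above.
df = [["john", "2019", "30.2"], ["john", "2019", "40"], ["john", "2020", "50.3"],
      ["amy", "2019", "60"], ["amy", "2019", "20"], ["amy", "2020", "40.1"]]

totals = {}   # (name, year) -> running sum
order = []    # keys in order of first appearance (illustrative helper list)

for *key, amount in df:
    key = tuple(key)
    if key not in totals:
        totals[key] = 0.0
        order.append(key)   # remember when this key was first seen
    totals[key] += float(amount)

# Walk the keys in first-appearance order rather than in dict order.
result = [list(key) + [str(totals[key])] for key in order]
print(result)
```

On the sample data this should give the same groups, in the same order, as the output above.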

Since you are using the variable name df, I assume you are familiar with pandas.

You can easily do this in pandas: just convert your list into a DataFrame.

Then group by the columns whose unique combinations you want and select the last row of each group:

df.groupby(['col_a', 'col_b'], as_index=False).last()

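You can sort the df before calling groupby if you need a custom ordering. Note that .last() keeps the last row of each group rather than adding values up; if you want totals, convert the value column to a numeric type and aggregate with .sum() instead.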

Here's a way to do it using defaultdict:

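from collections import defaultdict

# Outer key: name; inner key: year; missing sums start at 0.0.
sums = defaultdict(lambda: defaultdict(float))
for item in df:
    sums[item[0]][item[1]] += float(item[2])
# Flatten the nested dicts back into [name, year, total] rows (totals stay floats).
lst = [[key, inner_key, value] for key in sums for inner_key, value in sums[key].items()]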
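
The nested defaultdict above is tied to exactly two grouping columns. As a rough sketch of a flatter variant, you could key a single defaultdict(float) on a tuple of all columns but the last, which works for any number of grouping columns. `sum_by_keys` is a hypothetical helper name I made up for illustration, and the row order of the result assumes Python 3.7+ insertion-ordered dicts.

```python
from collections import defaultdict

def sum_by_keys(rows):
    """Sum the last column of each row, grouped by all preceding columns."""
    sums = defaultdict(float)  # missing keys start at 0.0
    for *key, value in rows:
        sums[tuple(key)] += float(value)
    # Assumes insertion-ordered dicts (Python 3.7+) for first-appearance order.
    return [[*key, total] for key, total in sums.items()]

rows = [["john", "2019", "30.2"], ["john", "2019", "40"], ["john", "2020", "50.3"],
        ["amy", "2019", "60"], ["amy", "2019", "20"], ["amy", "2020", "40.1"]]

lst = sum_by_keys(rows)
print(lst)
```

As in the snippet above, the totals come back as floats; wrap them in str() if you need the same string format as the input.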
