How to group a list of tuples/objects by a shared index/attribute in Python?

Given a list

old_list = [obj_1, obj_2, obj_3, ...]

I want to create a list:

new_list = [[obj_1, obj_2], [obj_3], ...]

where obj_1.some_attr == obj_2.some_attr.

I could throw some for loops and if checks together, but this is ugly. Is there a Pythonic way to do this? By the way, the attributes of the objects are all strings.

Alternatively, a solution for a list of tuples (all of the same length) instead of objects would be appreciated, too.


collections.defaultdict is the usual way to do this.

The for loop is still needed, but the if checks are not: a missing key automatically gets an empty list.

from collections import defaultdict


groups = defaultdict(list)

for obj in old_list:
    groups[obj.some_attr].append(obj)

# In Python 3, values() returns a view, so wrap it in list() to get a real list.
new_list = list(groups.values())
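
The same pattern covers the tuple case from the question: index into the tuple instead of reading an attribute. A minimal sketch, assuming made-up sample data in `pairs` and grouping on index 0 (both are illustrative, not from the question):

```python
from collections import defaultdict

# Hypothetical sample data: group tuples by their first element.
pairs = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]

groups = defaultdict(list)
for item in pairs:
    # A missing key is created with an empty list on first access.
    groups[item[0]].append(item)

# Dicts keep insertion order (Python 3.7+), so the groups come out in
# first-seen order and no sorting is needed.
grouped = list(groups.values())
print(grouped)
# [[('a', 1), ('a', 3)], [('b', 2), ('b', 5)], [('c', 4)]]
```

Unlike the groupby approach, this makes a single pass and does not need the keys to be sortable, only hashable.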

Here are two cases. Both require the following imports:

import itertools
import operator

You'll be using itertools.groupby and either operator.attrgetter or operator.itemgetter.

For a situation where you're grouping by obj_1.some_attr == obj_2.some_attr:

get_attr = operator.attrgetter('some_attr')
new_list = [list(g) for k, g in itertools.groupby(sorted(old_list, key=get_attr), get_attr)]

For a[some_index] == b[some_index]:

get_item = operator.itemgetter(some_index)
new_list = [list(g) for k, g in itertools.groupby(sorted(old_list, key=get_item), get_item)]
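
To make the attribute case concrete, here is a small runnable sketch. The `Pet` type and its data are invented purely for illustration; any objects with a common attribute would work the same way:

```python
import itertools
import operator
from collections import namedtuple

# Hypothetical record type and data, only for illustration.
Pet = namedtuple("Pet", ["name", "species"])
old_list = [
    Pet("Rex", "dog"),
    Pet("Tom", "cat"),
    Pet("Fido", "dog"),
    Pet("Nemo", "fish"),
]

get_species = operator.attrgetter("species")
new_list = [
    list(group)
    for _, group in itertools.groupby(sorted(old_list, key=get_species), get_species)
]

# Groups come out in sorted key order: cat, dog, fish.
names = [[pet.name for pet in group] for group in new_list]
print(names)  # [['Tom'], ['Rex', 'Fido'], ['Nemo']]
```

Because sorted() is stable, items within each group keep their original relative order.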

Note that the sorting is required because itertools.groupby only groups consecutive items: it starts a new group every time the value of the key changes.
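
The effect of skipping the sort is easy to see on a small input. A sketch with hypothetical data (`words` and `first_letter` are made up for the demonstration):

```python
import itertools

# Hypothetical sample data, deliberately not sorted by first letter.
words = ["apple", "bob", "avocado", "banana"]


def first_letter(word):
    return word[0]


# Without sorting, equal keys that are not adjacent end up in separate groups.
unsorted_groups = [list(g) for _, g in itertools.groupby(words, first_letter)]
print(unsorted_groups)
# [['apple'], ['bob'], ['avocado'], ['banana']]

# Sorting by the same key first makes equal keys adjacent.
sorted_groups = [
    list(g)
    for _, g in itertools.groupby(sorted(words, key=first_letter), first_letter)
]
print(sorted_groups)
# [['apple', 'avocado'], ['bob', 'banana']]
```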


Note that you can use this to create a dict like S.Lott's answer without collections.defaultdict, as long as each group is copied into a list before groupby moves on to the next key.

Using a dictionary comprehension (Python 2.7 and later):

groupdict = {k: list(g) for k, g in itertools.groupby(sorted_list, keyfunction)}

For earlier versions of Python:

groupdict = dict((k, list(g)) for k, g in itertools.groupby(sorted_list, keyfunction))
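
A quick run of the dict version, as a sketch only: the sample `sorted_list` and the first-letter `keyfunction` below are stand-ins chosen for illustration.

```python
import itertools

# Hypothetical input, already sorted by the grouping key (first letter).
sorted_list = ["ant", "ape", "bee", "cat", "cow"]


def keyfunction(word):
    return word[0]


# list(g) matters: the group iterators share the underlying iterable with
# groupby, so storing g itself would leave nothing usable in the dict.
groupdict = {k: list(g) for k, g in itertools.groupby(sorted_list, keyfunction)}
print(groupdict)
# {'a': ['ant', 'ape'], 'b': ['bee'], 'c': ['cat', 'cow']}
```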

I think you can also use itertools.groupby. Please note that the code below is just a sample and should be adapted to your needs:

from itertools import groupby

data = [[1, 2, 3], [3, 2, 3], [1, 1, 1], [7, 8, 9], [7, 7, 9]]

# For example, to group the data by the third element of each item:
res = [list(v) for _, v in groupby(sorted(data, key=lambda x: x[2]), key=lambda x: x[2])]
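
For reference, this is what the sample produces. The sketch below swaps the lambda for operator.itemgetter(2), which is an equivalent key here, and names it `third` for readability (that name is my own):

```python
from itertools import groupby
from operator import itemgetter

data = [[1, 2, 3], [3, 2, 3], [1, 1, 1], [7, 8, 9], [7, 7, 9]]

third = itemgetter(2)  # same key as lambda x: x[2]
res = [list(v) for _, v in groupby(sorted(data, key=third), third)]
print(res)
# [[[1, 1, 1]], [[1, 2, 3], [3, 2, 3]], [[7, 8, 9], [7, 7, 9]]]
```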