Comparing two lists of dictionaries with unique keys in Python

I have two lists that contain the same number of dictionaries. Each dictionary carries a unique identifier ('unique_id' in the example). Every dictionary in the first list has a match in the second list, that is, a dictionary with the same unique identifier. The other elements of two matching dictionaries may differ, however. For example:

list_1 = [
    {
        'unique_id': '001',
        'key1': 'AAA',
        'key2': 'BBB',
        'key3': 'EEE'
    },
    {
        'unique_id': '002',
        'key1': 'AAA',
        'key2': 'CCC',
        'key3': 'FFF'
    }
]

list_2 = [
    {
        'unique_id': '001',
        'key1': 'AAA',
        'key2': 'DDD',
        'key3': 'EEE'
    },
    {
        'unique_id': '002',
        'key1': 'AAA',
        'key2': 'CCC',
        'key3': 'FFF'
    }
]

I want to compare all elements of two matching dictionaries. If any of the elements are not equal, I want to print the non-equal elements.

Would you please help?


Solution 1:

Assuming that the dicts line up as in your example input, you can use zip() to pair up the associated dicts (wrapped in list() so the pairs can be reused, because zip() returns a one-shot iterator in Python 3), then use any() to check whether there is a difference:

>>> list_1 = [{'unique_id':'001', 'key1':'AAA', 'key2':'BBB', 'key3':'EEE'},
...           {'unique_id':'002', 'key1':'AAA', 'key2':'CCC', 'key3':'FFF'}]
>>> list_2 = [{'unique_id':'001', 'key1':'AAA', 'key2':'DDD', 'key3':'EEE'},
...           {'unique_id':'002', 'key1':'AAA', 'key2':'CCC', 'key3':'FFF'}]
>>> pairs = list(zip(list_1, list_2))
>>> any(x != y for x, y in pairs)
True

Or to get the differing pairs:

>>> [(x, y) for x, y in pairs if x != y]
[({'unique_id': '001', 'key1': 'AAA', 'key2': 'BBB', 'key3': 'EEE'}, {'unique_id': '001', 'key1': 'AAA', 'key2': 'DDD', 'key3': 'EEE'})]

You can even get the keys which don't match for each pair:

>>> [[k for k in x if x[k] != y[k]] for x, y in pairs if x != y]
[['key2']]

Possibly together with the associated values:

>>> [[(k, x[k], y[k]) for k in x if x[k] != y[k]] for x, y in pairs if x != y]
[[('key2', 'BBB', 'DDD')]]

NOTE: In case your input lists are not sorted yet, you can sort them by 'unique_id' easily as well:

>>> from operator import itemgetter
>>> list_1, list_2 = [sorted(l, key=itemgetter('unique_id'))
...                   for l in (list_1, list_2)]
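
If the two lists might be in different orders, an alternative to sorting is to index one list by its id and look up each partner directly. The following is only a minimal sketch: the function name diff_by_id and its id_key parameter are hypothetical names chosen for illustration, and it assumes that every id in the first list also appears in the second and that matching dicts share the same keys, as in the question.

```python
def diff_by_id(first, second, id_key='unique_id'):
    """Map each id to a list of (key, first_value, second_value) differences."""
    # Index the second list by id so that each partner lookup is a dict access.
    second_by_id = {d[id_key]: d for d in second}
    result = {}
    for d1 in first:
        d2 = second_by_id[d1[id_key]]  # assumption: a partner always exists
        # Assumption: d2 contains every key of d1.
        changed = [(k, v, d2[k]) for k, v in d1.items() if v != d2[k]]
        if changed:
            result[d1[id_key]] = changed
    return result


list_1 = [{'unique_id': '001', 'key1': 'AAA', 'key2': 'BBB', 'key3': 'EEE'},
          {'unique_id': '002', 'key1': 'AAA', 'key2': 'CCC', 'key3': 'FFF'}]
# Same records as the question's list_2, deliberately in reverse order.
list_2 = [{'unique_id': '002', 'key1': 'AAA', 'key2': 'CCC', 'key3': 'FFF'},
          {'unique_id': '001', 'key1': 'AAA', 'key2': 'DDD', 'key3': 'EEE'}]

differences = diff_by_id(list_1, list_2)
print(differences)  # {'001': [('key2', 'BBB', 'DDD')]}
```

Because the lookup goes through the id, neither list has to be sorted first.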

Solution 2:

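A fast way, provided every value in the dicts is hashable, is to convert each dict to a tuple and compare two sets of tuples:

set_list1 = set(tuple(sorted(d.items())) for d in list_1)
set_list2 = set(tuple(sorted(d.items())) for d in list_2)

(Sorting each dict's items gives equal dicts the same tuple regardless of key order. The lists themselves do not need to be sorted, because sets are unordered; calling sorted() on a list of dicts also raises a TypeError in Python 3.)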

Find overlapping using intersection:

set_overlapping = set_list1.intersection(set_list2)

Find the difference using symmetric_difference:

set_difference = set_list1.symmetric_difference(set_list2)

Convert the tuples back to dicts:

list_dicts_difference = []
for tuple_element in set_difference:
    list_dicts_difference.append(dict(tuple_element))
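
Put together, the steps above might look like the following sketch. It assumes all values are hashable (they are strings here). The names as_tuple, only_in_1, only_in_2 and unchanged are hypothetical. It also uses plain set difference instead of symmetric_difference, because the symmetric difference alone does not say which list a dict came from.

```python
list_1 = [{'unique_id': '001', 'key1': 'AAA', 'key2': 'BBB', 'key3': 'EEE'},
          {'unique_id': '002', 'key1': 'AAA', 'key2': 'CCC', 'key3': 'FFF'}]
list_2 = [{'unique_id': '001', 'key1': 'AAA', 'key2': 'DDD', 'key3': 'EEE'},
          {'unique_id': '002', 'key1': 'AAA', 'key2': 'CCC', 'key3': 'FFF'}]


def as_tuple(d):
    # Sorted (key, value) pairs give each dict a canonical, hashable form.
    return tuple(sorted(d.items()))


set_1 = {as_tuple(d) for d in list_1}
set_2 = {as_tuple(d) for d in list_2}

# Convert the tuples back to dicts for each group.
only_in_1 = [dict(t) for t in set_1 - set_2]   # changed or missing in list_2
only_in_2 = [dict(t) for t in set_2 - set_1]   # changed or missing in list_1
unchanged = [dict(t) for t in set_1 & set_2]   # identical in both lists

print(only_in_1)
print(only_in_2)
print(unchanged)
```

Note that this reports whole dicts that differ, not the individual keys that changed.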

Solution 3:

The following compares the dictionaries and prints the non-equal items:

for d1, d2 in zip(list_1, list_2):
    for key, value in d1.items():
        if value != d2[key]:
            print(key, value, d2[key])

Output: key2 BBB DDD. By using zip we can iterate over the two lists in parallel, one pair of dictionaries at a time. We then iterate over the items of the first dictionary and compare each value with the corresponding value in the second dictionary. If they are not equal, we print the key and both values.

Solution 4:

I have a version that does not depend on a particular key. It uses the built-in cmp(), which exists only in Python 2 (it was removed in Python 3), so the lists are either equal (zero) or they are not (non-zero):

list_1 = [{'unique_id':'001', 'key1':'AAA', 'key2':'BBB', 'key3':'EEE'}, {'unique_id':'002', 'key1':'AAA', 'key2':'CCC', 'key3':'FFF'}]
list_2 = [{'unique_id':'001', 'key1':'AAA', 'key2':'DDD', 'key3':'EEE'}, {'unique_id':'002', 'key1':'AAA', 'key2':'CCC', 'key3':'FFF'}]
list_3 = [{'Name': 'Abid', 'Age': 27},{'Name': 'Mahnaz', 'Age': 27}]
list_4 = [{'Name': 'Abid', 'Age': 27},{'Name': 'Mahnaz', 'Age': 27}]

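print("Return Value : %2d" % cmp(list_1, list_1))
print("Return Value : %2d" % cmp(list_1, list_3))
print("Return Value : %2d" % cmp(list_1, list_2))
print("Return Value : %2d" % cmp(list_2, list_1))
print("Return Value : %2d" % cmp(list_3, list_4))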

gives:

Return Value :  0
Return Value :  1
Return Value : -1
Return Value :  1
Return Value :  0
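
Since cmp() is not available in Python 3, the equal/not-equal part of this check can be written there with ==, which compares lists element by element and dicts by their key/value pairs. The ordering results (1 and -1) have no direct equivalent, because Python 3 dicts do not support < or >. A minimal sketch, with shortened sample data and a hypothetical helper named first_mismatch:

```python
def first_mismatch(first, second):
    """Return the index of the first differing pair, or None if the lists are equal."""
    for index, (d1, d2) in enumerate(zip(first, second)):
        if d1 != d2:
            return index
    # zip() stops at the shorter list, so a length difference is a mismatch too.
    if len(first) != len(second):
        return min(len(first), len(second))
    return None


list_1 = [{'unique_id': '001', 'key2': 'BBB'}, {'unique_id': '002', 'key2': 'CCC'}]
list_2 = [{'unique_id': '001', 'key2': 'DDD'}, {'unique_id': '002', 'key2': 'CCC'}]

print(list_1 == list_1)                # True
print(list_1 == list_2)                # False
print(first_mismatch(list_1, list_2))  # 0
print(first_mismatch(list_1, list_1))  # None
```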