How to compare lists of objects and keep only _new_ objects?

Solution 1:

Thanks for answering my questions; it took me a while to understand the goal. It seems like "updated" might simply be expressed as "in new but not in old"?

I think so, because the following seems to do the job.

The key is to compare the objects themselves. To avoid getting into deep-equality comparison, we can serialize each object back to a JSON string, which gives us string representations we can compare directly:

import json

# Serialize every old object to a JSON string; a set gives fast membership tests.
# sort_keys=True keeps the comparison independent of key order.
with open('old.json') as f:
    old_hashes = {json.dumps(old_obj, sort_keys=True) for old_obj in json.load(f)}

# "Updated" means "new not in old"
updated_objs = []

with open('new.json') as f:
    new_objs = json.load(f)

for new_obj in new_objs:
    new_hash = json.dumps(new_obj, sort_keys=True)
    if new_hash not in old_hashes:
        updated_objs.append(new_obj)


print(json.dumps(updated_objs, indent=2))

When I run that against your old.json and new.json, I get:

[
  {
    "name": "Mohan raj",
    "age": 23,
    "country": "INDIA"
  },
  {
    "name": "Kiruthika",
    "age": 18,
    "country": "INDIA"
  },
  {
    "name": "Munusamy",
    "age": 45,
    "country": "INDIA"
  },
  {
    "name": "Mark Smith",
    "age": 25,
    "country": "USA"
  }
]
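
One caveat with comparing serialized strings is key order. The sketch below (the two dicts are just illustrative values) shows that `json.dumps` keeps the insertion order of keys by default, so two equal objects can produce different strings unless `sort_keys=True` is passed:

```python
import json

a = {"name": "Mark Smith", "age": 25, "country": "USA"}
b = {"country": "USA", "age": 25, "name": "Mark Smith"}  # same data, different key order

# As dicts, the two objects are equal.
dicts_equal = a == b

# Default serialization keeps insertion order, so the strings differ.
default_equal = json.dumps(a) == json.dumps(b)

# sort_keys=True gives a canonical string that ignores key order.
sorted_equal = json.dumps(a, sort_keys=True) == json.dumps(b, sort_keys=True)

print(dicts_equal, default_equal, sorted_equal)  # True False True
```

If both files are always written by the same program with the same key order, this probably never bites, but sorting the keys is cheap insurance.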

Solution 2:

You can achieve this as shown below: keep track of the records from the old JSON, then keep only the new records that are not among them.

import json

# read json file
with open('new.json') as f:
    new_data = json.load(f)
with open('old.json') as f:
    old_data = json.load(f)

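# Build one tuple per old record. Tuples keep each value tied to its field;
# a set of values would ignore order and collapse duplicate values.
old_records = {
    (elem["name"], elem["age"], elem["country"]) for elem in old_data}

updated_list = []
for elem in new_data:
    elm = (elem["name"], elem["age"], elem["country"])
    if elm not in old_records:
        updated_list.append(elem)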

with open('updated.json', 'w') as f:
    json.dump(updated_list, f)
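
For reuse, the same idea can be wrapped in a small function. This is only a sketch: `find_new_records`, its `fields` parameter, and the sample records are made up for illustration and are not from the question's files. It assumes every record has all the listed fields and that their values are hashable (strings and numbers are).

```python
def find_new_records(old_data, new_data, fields=("name", "age", "country")):
    """Return records from new_data whose field values do not appear in old_data."""
    # Tuples keep each value in its field position; a set makes lookups fast.
    seen = {tuple(rec[f] for f in fields) for rec in old_data}
    return [rec for rec in new_data if tuple(rec[f] for f in fields) not in seen]


# Made-up sample data, for illustration only.
old = [{"name": "Alice", "age": 30, "country": "USA"}]
new = [
    {"name": "Alice", "age": 30, "country": "USA"},   # unchanged
    {"name": "Alice", "age": 31, "country": "USA"},   # changed age
    {"name": "Bob", "age": 22, "country": "INDIA"},   # not in old at all
]

result = find_new_records(old, new)
print(result)
```

Here the unchanged record is dropped, while the record with the changed age and the brand-new record are both kept.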