How to compare lists of objects and keep only _new_ objects?
Solution 1:
It took me a while to get this, thanks for answering my questions. It seems like "updated" can simply be expressed as "new not in old".
I think so, because the following seems to do the job.
The key is to compare the objects themselves. Rather than getting into object comparison (deep-equal), serializing each object back to JSON gives us string representations we can compare. Strictly speaking this is serialization rather than hashing, and two objects only produce the same string if their keys appear in the same order:
import json

# Serialize each old object to a JSON string so it can be compared directly
old_hashes = []
with open('old.json') as f:
    old_objs = json.load(f)
for old_obj in old_objs:
    old_hash = json.dumps(old_obj)
    old_hashes.append(old_hash)

# "Updated" means "new not in old"
updated_objs = []
with open('new.json') as f:
    new_objs = json.load(f)
for new_obj in new_objs:
    new_hash = json.dumps(new_obj)
    if new_hash not in old_hashes:
        updated_objs.append(new_obj)

print(json.dumps(updated_objs, indent=2))
When I run that against your old.json and new.json, I get:
[
  {
    "name": "Mohan raj",
    "age": 23,
    "country": "INDIA"
  },
  {
    "name": "Kiruthika",
    "age": 18,
    "country": "INDIA"
  },
  {
    "name": "Munusamy",
    "age": 45,
    "country": "INDIA"
  },
  {
    "name": "Mark Smith",
    "age": 25,
    "country": "USA"
  }
]
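If the two files might write the same object with its keys in a different order, the string comparison above would treat them as different. Here is a minimal sketch of one way around that, assuming every value is JSON-serializable: pass sort_keys=True to json.dumps so the string no longer depends on key order, and keep the old strings in a set for faster lookups. The helper names canonical and new_not_in_old, and the sample records, are made up for illustration and are not from the question.

```python
import json

def canonical(obj):
    # sort_keys=True makes the resulting string independent of key order
    return json.dumps(obj, sort_keys=True)

def new_not_in_old(old_objs, new_objs):
    # A set gives constant-time membership tests on the serialized strings
    old_keys = {canonical(o) for o in old_objs}
    return [n for n in new_objs if canonical(n) not in old_keys]

# Made-up sample data (not from old.json / new.json)
old = [{"name": "Ann", "age": 30, "country": "USA"}]
new = [
    {"age": 30, "country": "USA", "name": "Ann"},   # same object, keys reordered
    {"name": "Ann", "age": 31, "country": "USA"},   # age changed
]

print(new_not_in_old(old, new))
```

With this sample data only the second record should be reported, since the first differs from the old one only in key order.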
Solution 2:
You can achieve it with the code below: keep track of the data from the old JSON, then keep every record from the new JSON that is not in it.
import json

# Read both JSON files
with open('new.json') as f:
    new_data = json.load(f)
with open('old.json') as f:
    old_data = json.load(f)

# Build one tuple key per old record; a tuple keeps the fields in a
# fixed order, so swapped or repeated values stay distinguishable
old_json_list = [
    (elem["name"], elem["age"], elem["country"]) for elem in old_data]

updated_list = []
for elem in new_data:
    elm = (elem["name"], elem["age"], elem["country"])
    if elm not in old_json_list:
        updated_list.append(elem)

# Write the records that are new or changed
with open('updated.json', 'w') as f:
    json.dump(updated_list, f)
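The field-by-field comparison above can also be written as a small reusable helper. This is only a sketch under some assumptions: every record has the three fields used above, the field values are hashable (strings and numbers are), and each record is identified by a tuple of its values in a fixed order. The names FIELDS, record_key and updated_records are hypothetical, and the sample records are made up.

```python
# Field names taken from the sample records; adjust for other data
FIELDS = ("name", "age", "country")

def record_key(record, fields=FIELDS):
    # A tuple keeps each value in its position, so ("A", 1, "B")
    # and ("B", 1, "A") are different keys
    return tuple(record[f] for f in fields)

def updated_records(old_data, new_data, fields=FIELDS):
    # Collect the keys of all old records once, then filter the new ones
    seen = {record_key(r, fields) for r in old_data}
    return [r for r in new_data if record_key(r, fields) not in seen]

# Made-up sample data
old_data = [{"name": "A", "age": 1, "country": "B"}]
new_data = [
    {"name": "B", "age": 1, "country": "A"},  # same values, swapped fields
    {"name": "A", "age": 1, "country": "B"},  # unchanged
]

print(updated_records(old_data, new_data))
```

Here only the first new record should be kept, because its name and country are swapped relative to the old record.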