Python - compare two strings by words using difflib and print only the differences
Python newbie here. I have the following code to compare two strings using the difflib library. The output is prefixed with '+' or '-' for words that differ. How can I print only the differences, without any prefix?
The expected output for the code below is:
Not in first string: Nvdia
Not in first string: IBM
Not in second string: Microsoft
Not in second string: Google
Not in second string: Oracle
or just Nvdia, IBM, Microsoft, Google, Oracle
import difflib
original = "Apple Microsoft Google Oracle"
edited = "Apple Nvdia IBM"
# initiate the Differ object
d = difflib.Differ()
# calculate the difference between the two texts
diff = d.compare(original.split(), edited.split())
# output the result
print('\n'.join(diff))
Thanks!
If you don't have to use difflib, you could use a set and string splitting!
>>> original = "Apple Microsoft Google Oracle"
>>> edited = "Apple Nvdia IBM"
>>> set(original.split()).symmetric_difference(set(edited.split()))
{'IBM', 'Google', 'Oracle', 'Microsoft', 'Nvdia'}
You can also get the shared members with .intersection():
>>> set(original.split()).intersection(set(edited.split()))
{'Apple'}
Wikipedia has a good section on basic set operations, with accompanying Venn diagrams:
https://en.wikipedia.org/wiki/Set_(mathematics)#Basic_operations
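One caveat with symmetric_difference: a set is unordered, and the result does not say which string each word came from. If you want the "Not in first string" / "Not in second string" labels from the question, a minimal sketch could look like the following. The names only_in_edited and only_in_original are my own, not from the question.

```python
original = "Apple Microsoft Google Oracle"
edited = "Apple Nvdia IBM"

original_words = original.split()
edited_words = edited.split()

# Sets give fast membership tests; the lists keep the original word order.
original_set = set(original_words)
edited_set = set(edited_words)

# Words that appear on one side only, in the order they were written.
only_in_edited = [w for w in edited_words if w not in original_set]
only_in_original = [w for w in original_words if w not in edited_set]

for word in only_in_edited:
    print("Not in first string:", word)
for word in only_in_original:
    print("Not in second string:", word)
```

Like the set version, this ignores word position and only checks membership, so a word that occurs in both strings is never reported, no matter how often it is repeated.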
However, if you have to use difflib (some strange environment or assignment), you can also just find every member with a + or - prefix and slice off the prefixes:
>>> diff = d.compare(original.split(), edited.split())
>>> list(a[2:] for a in diff if a.startswith(("+", "-")))
['Nvdia', 'IBM', 'Microsoft', 'Google', 'Oracle']
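If you also want the labels from the question rather than bare words, the prefix itself tells you which side a word came from: "+ " marks a word only in the second sequence, "- " a word only in the first. Here is a rough sketch of that idea; the labels dict and the report list are illustrative names of mine.

```python
import difflib

original = "Apple Microsoft Google Oracle"
edited = "Apple Nvdia IBM"

# Map Differ's two-character prefixes to the labels used in the question.
labels = {"+ ": "Not in first string", "- ": "Not in second string"}

report = []
for line in difflib.Differ().compare(original.split(), edited.split()):
    prefix, word = line[:2], line[2:]
    # Lines starting with "  " (common to both) or "? " (hints) are skipped.
    if prefix in labels:
        report.append(f"{labels[prefix]}: {word}")

print("\n".join(report))
```

Unlike the set approach, difflib compares the words as sequences, so the result can differ from a pure membership check when words are repeated or reordered.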
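All of these operations result in an iterable of strings, so you can .join() 'em together or similar to get a single result, as you do in your question. Here result is the set from the first example; sets are unordered, so the order of the lines may differ between runs.
>>> result = set(original.split()).symmetric_difference(set(edited.split()))
>>> print("\n".join(result))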
IBM
Google
Oracle
Microsoft
Nvdia