Text difference algorithm
Solution 1:
I recommend taking a look at Neil Fraser's code and articles:
google-diff-match-patch
Currently available in Java, JavaScript, C++ and Python. Regardless of language, each library features the same API and the same functionality. All versions also have comprehensive test harnesses.
Neil Fraser: Diff Strategies - for theory and implementation notes
Solution 2:
In Python, there is difflib, as others have also suggested.
difflib offers the SequenceMatcher class, which can give you a similarity ratio between 0 and 1. Example function:
import difflib
def text_compare(text1, text2, isjunk=None):
    # ratio() returns a float in [0, 1]; 1.0 means the two texts are identical
    return difflib.SequenceMatcher(isjunk, text1, text2).ratio()
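If you need the actual differences rather than just a score, the same class can probably get you there too. Below is a minimal sketch, assuming character-level comparison is acceptable; the function name describe_changes is made up for illustration and is not part of difflib. It relies on SequenceMatcher.get_opcodes(), which yields tuples tagged 'equal', 'replace', 'delete' or 'insert'.

```python
import difflib


def describe_changes(text1, text2):
    """Return (tag, old_fragment, new_fragment) for every span that differs.

    describe_changes is a hypothetical helper name, not a difflib function.
    """
    matcher = difflib.SequenceMatcher(None, text1, text2)
    changes = []
    # Each opcode says how text1[i1:i2] maps onto text2[j1:j2].
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # skip the unchanged spans
            changes.append((tag, text1[i1:i2], text2[j1:j2]))
    return changes
```

Identical inputs give an empty list. For line-oriented output, passing text1.splitlines() and text2.splitlines() instead of the raw strings should work the same way, since SequenceMatcher accepts any sequences of hashable items.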