The Levenshtein distance is the algorithm I would recommend. It calculates the minimum number of operations you must do to change 1 string into another. The fewer changes means the strings are more similar...


It seems you are needing some kind of fuzzy matching. Here is java implementation of some set of similarity metrics http://www.dcs.shef.ac.uk/~sam/stringmetrics.html. Here is more detailed explanation of string metrics http://www.cs.cmu.edu/~wcohen/postscript/ijcai-ws-2003.pdf it depends on how fuzzy and how fast your implementation must be.