What is a simple way to check whether two words overlap at all in Python, using indices?
I am trying to determine whether or not two words overlap inside of a string by using their indices (start and end positions) and not the word.
For example:
str = "testme"
start_word_1 = 0
end_word_1 = 4
start_word_2 = 4
end_word_2 = 6
In this example str[0:4] is "test" and str[4:6] is "me". These do not overlap, even though the end of word 1 is the same as the start of word 2, so this case is okay. I feel like I'm making it too complicated and that there must be simpler code that covers both words that entirely overlap and words that only partially overlap. Thank you!
To clarify overlap: I mean in the string slice str[0:4] ("test") and str[4:6] ("me") do not overlap. They are okay.
But str[0:5] ("testm") does overlap with str[4:6] ("me").
Also, str[0:6] ("testme") does overlap with str[1:4] ("est"); here "est" is completely inside "testme".
This is going to be used for text highlighting where I do not want anything to be conflicting highlights.
I think something like this is pretty simple and meets your (admittedly a little odd) requirements. Note this is using strings that look like slices, corresponding to your use of indices rather than the words themselves:
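In [1]: def words_overlap(slice1, slice2):
   ...:     """Take two strings representing slices (e.g. 'x:y') and
   ...:     return a boolean indicating whether they overlap"""
   ...:     # Parse to integers so that multi-digit indices like '10:12' work
   ...:     start1, end1 = map(int, slice1.split(':'))
   ...:     start2, end2 = map(int, slice2.split(':'))
   ...:     if start1 < start2:  # slice1 is leftmost
   ...:         return start2 < end1  # slice2 starts before slice1 ends
   ...:     else:
   ...:         return start1 < end2  # slice1 starts before slice2 ends
   ...: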
In [2]: words_overlap('1:3', '2:4')
Out[2]: True
In [3]: words_overlap('2:4', '1:3')
Out[3]: True
In [4]: words_overlap('2:3', '5:7')
Out[4]: False
In [5]: words_overlap('0:4', '4:6')
Out[5]: False
All it does is detect which slice is leftmost with a simple less-than test, and then tell you whether the rightmost slice starts before the leftmost one ends.
The check itself should be quite efficient, since it only involves two comparisons.
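If the positions are already integers, as in the question, the same leftmost-first idea can be sketched without any string parsing. This is only an illustration: the name spans_overlap_ordered and the (start, end) tuple representation are assumptions, and it presumes non-empty spans with exclusive ends, like Python slices.

```python
def spans_overlap_ordered(span1, span2):
    """Return True if two half-open (start, end) spans share at least one index."""
    # Sort so that `first` is the span that starts leftmost.
    first, second = sorted([span1, span2])
    # They overlap if the rightmost span starts before the leftmost one ends.
    return second[0] < first[1]


print(spans_overlap_ordered((0, 4), (4, 6)))  # False: "test" and "me" only touch
print(spans_overlap_ordered((0, 5), (4, 6)))  # True: "testm" and "me"
print(spans_overlap_ordered((0, 6), (1, 4)))  # True: "est" inside "testme"
```

Sorting the two tuples just replaces the explicit if/else; the order in which the spans are passed does not matter.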
Maybe like that?
start1 < end2 and start2 < end1
This means: each of the two words begins before the other one ends, so they overlap.
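Applied to the examples from the question, this check might look like the sketch below. The function name overlaps and the helper has_conflict (for testing a new highlight against existing ones) are hypothetical, and end positions are assumed to be exclusive, as in Python slices.

```python
def overlaps(start1, end1, start2, end2):
    """Half-open spans overlap when each one starts before the other ends."""
    return start1 < end2 and start2 < end1


def has_conflict(new_span, existing_spans):
    """Return True if new_span overlaps any (start, end) pair in existing_spans."""
    return any(overlaps(*new_span, *span) for span in existing_spans)


print(overlaps(0, 4, 4, 6))  # False: "test" and "me" only touch
print(overlaps(0, 5, 4, 6))  # True: "testm" and "me" share index 4
print(overlaps(0, 6, 1, 4))  # True: "est" lies inside "testme"

highlights = [(0, 4)]
print(has_conflict((4, 6), highlights))  # False: safe to add
print(has_conflict((3, 6), highlights))  # True: would conflict
```

Because the comparisons are strict, two spans that merely touch (one ends where the other starts) are not reported as overlapping, which matches the "testme" example.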
Case 1: Word 2 is a substring of (or matches) Word 1 (or vice versa)
word1 in word2 or word2 in word1
ex. "StackExchange" and "tack"
Case 2: Word 2 overlaps at Word 1's end
any(word1.endswith(word2[:-i]) for i in range(1, len(word2)))
ex. "stackoverflow" and "flowing"
Case 3: Word 2 overlaps at Word 1's beginning
any(word1.startswith(word2[i:]) for i in range(1, len(word2)))
ex. "badges" and "terribad"
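The three cases could be combined into one helper, roughly as sketched below. The name words_share_text is made up for illustration. Note that this compares the characters of the words rather than their positions in a string, so it answers a different question from the index-based checks.

```python
def words_share_text(word1, word2):
    """Return True if one word contains the other, a prefix of word2 ends
    word1, or a suffix of word2 starts word1."""
    # Case 1: one word is a substring of (or equal to) the other.
    if word1 in word2 or word2 in word1:
        return True
    for i in range(1, len(word2)):
        # Case 2: a proper prefix of word2 matches the end of word1.
        if word1.endswith(word2[:-i]):
            return True
        # Case 3: a proper suffix of word2 matches the start of word1.
        if word1.startswith(word2[i:]):
            return True
    return False


print(words_share_text("StackExchange", "tack"))    # True (case 1)
print(words_share_text("stackoverflow", "flowing"))  # True (case 2, "flow")
print(words_share_text("badges", "terribad"))        # True (case 3, "bad")
print(words_share_text("abc", "xyz"))                # False
```

Even a single shared character at a boundary counts as an overlap here; a minimum overlap length would need an extra condition.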