Merging Overlapping Intervals
Currently, I have intervals of:
temp_tuple = [[-25, -14], [-21, -16], [-20, -15], [-10, -7], [-8, -5], [-6, -3], [2, 4], [2, 3], [3, 6], [12, 15], [13, 18], [14, 17], [22, 27], [25, 30], [26, 29]]
in an ascending order by the lower bound. My task is to merge overlapping intervals so that the outcome comes out to be:
[-25, -14]
[-10, -3]
[2, 6]
[12, 18]
[22, 30]
My first attempt involved deleting intervals that are completely within previous intervals, like [-21, -16] which falls within [-25, -14]. But deleting objects within a list kept interfering with the loop condition. My second attempt at deleting unnecessary intervals was:
i = 0
j = 1
while i < len(temp_tuples):
while j < len(temp_tuples):
if temp_tuples[i][1] > temp_tuples[j][1]:
del temp_tuples[j]
j += 1
i += 1
but this doesn't delete all the unnecessary intervals for some reason. What should I do?
It makes it a bit easier to process (as in think about) if you instead setup a new list. You additionally also get to keep your original data.
temp_tuple.sort(key=lambda interval: interval[0])
merged = [temp_tuple[0]]
for current in temp_tuple:
previous = merged[-1]
if current[0] <= previous[1]:
previous[1] = max(previous[1], current[1])
else:
merged.append(current)
If you now print(merged)
it would output:
[[-25, -14], [-10, -3], [2, 6], [12, 18], [22, 30]]
This is a numpy solution I came up with:
import numpy as np
def merge_intervals(intervals):
starts = intervals[:,0]
ends = np.maximum.accumulate(intervals[:,1])
valid = np.zeros(len(intervals) + 1, dtype=np.bool)
valid[0] = True
valid[-1] = True
valid[1:-1] = starts[1:] >= ends[:-1]
return np.vstack((starts[:][valid[:-1]], ends[:][valid[1:]])).T
#example
a=[]
a.append([1,3])
a.append([4,10])
a.append([5,12])
a.append([6,8])
a.append([20,33])
a.append([30,35])
b = np.array(a)
print("intervals")
print(b)
print()
print("merged intervals")
print(merge_intervals(b))
Output:
intervals
[[ 1 3]
[ 4 10]
[ 5 12]
[ 6 8]
[20 33]
[30 35]]
merged intervals
[[ 1 3]
[ 4 12]
[20 35]]
Please note that the input array must be sorted by start positions.