Length of the intersections between a list an list of list
Since y
contains disjoint ranges and the union of them is also a range, a very fast solution is to first perform a binary search on y
and then count the resulting indices and only return the ones that appear at least 10 times. The complexity of this algorithm is O(Nx log Ny)
with Nx
and Ny
the number of items in respectively x
and y
. This algorithm is nearly optimal (since x
needs to be read entirely).
Actual implementation
First of all, you need to transform your current y
to a Numpy array containing the beginning value of all ranges (in an increasing order) with N
as the last value (assuming N
is excluded for the ranges of y
, or N+1
otherwise). This part can be assumed as free since y
can be computed at compile time in your case. Here is an example:
import numpy as np
y = np.array([1, 4, 8, 10, 13, ..., N])
Then, you need to perform the binary search and check that the values fits in the range of y:
indices = np.searchsorted(y, x, 'right')
# The `0 < indices < len(y)` check should not be needed regarding the input.
# If so, you can use only `indices -= 1`.
indices = indices[(0 < indices) & (indices < len(y))] - 1
Then you need to count the indices and filter the ones with at least :
uniqueIndices, counts = np.unique(indices, return_counts=True)
result = uniqueIndices[counts >= 10]
Here is an example based on your:
x = np.array([1, 2, 3, 4, 5, 6])
# [[1, 2, 3], [4], [5, 6], [7], [8, 9, 10, 11]]
y = np.array([1, 4, 5, 7, 8, 12])
# Actual simplified version of the above algorithm
indices = np.searchsorted(y, x, 'right') - 1
uniqueIndices, counts = np.unique(indices, return_counts=True)
result = uniqueIndices[counts >= 2]
# [0, 2]
print(result.tolist())
It runs in less than 0.1 ms on my machine on a random input based on your input constraints.