fill vector values between specified indices

Here is a method that uses np.diff on the indices to obtain the block sizes, then np.repeat on the values to create the blocks.

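import numpy as np

def fill(a, ind, f=lambda x: x + 10, default=-50):
    # Block lengths: before ind[0], between successive indices, and after the last index
    sizes = np.diff(ind, prepend=0, append=len(a))
    # One value per block: default for the leading block, then f of each indexed element
    values = np.concatenate([[default], f(a[ind])])
    return values.repeat(sizes)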
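
To make the block construction concrete, here is a small sketch that runs the same three steps inline on a toy input. The array a, the indices ind, and the hard-coded default of -50 and offset of +10 are illustrative choices, not part of the original question.

```python
import numpy as np

a = np.arange(10)            # [0 1 2 3 4 5 6 7 8 9]
ind = np.array([2, 5, 7])    # assumed sorted and unique

# Gaps between 0, the indices, and len(a) give the block lengths
sizes = np.diff(ind, prepend=0, append=len(a))
# sizes -> [2 3 2 3]

# Default for the leading block, then the transformed indexed values
values = np.concatenate([[-50], a[ind] + 10])
# values -> [-50 12 15 17]

result = values.repeat(sizes)
# result -> [-50 -50 12 12 12 15 15 17 17 17]
```

If ind[0] is 0, the first block size is 0 and the default value is simply repeated zero times.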

While np.repeat is clearly the way to go here, np.cumsum is also an option. The only thing you need to calculate is the difference between successive block values. Given that np.diff is basically the inverse of np.cumsum, and zero elements don't affect the cumulative sum, you can do something like this:

def fill_cumsum(a, ind, f=lambda x: x + 10, default=-50):
    vals = f(a[ind])    # Evaluate before a is replaced by zeros
    a = np.zeros_like(a)
    a[0] = default      # Do this first
    # Jump from default to the first value; if ind[0] == 0 this overwrites default
    a[ind[0]] = vals[0] - np.sign(ind[0]) * default
    a[ind[1:]] = np.diff(vals)   # Overwrite zero automatically
    return a.cumsum()
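
The identity this approach relies on can be checked in isolation. The sketch below uses made-up block values (targets) and positions, which are assumptions for illustration only: it shows that cumsum undoes diff, and that zeros placed between the differences just carry the running total forward.

```python
import numpy as np

targets = np.array([12, 15, 17])       # hypothetical block values
deltas = np.diff(targets, prepend=0)   # [12 3 2]

# cumsum reverses diff when the first element is kept via prepend=0
roundtrip = np.cumsum(deltas)

# Scatter the deltas at the block starts; zeros elsewhere leave the sum unchanged
sparse = np.zeros(8, dtype=targets.dtype)
sparse[[0, 3, 6]] = deltas
filled = sparse.cumsum()
# filled -> [12 12 12 15 15 15 17 17]
```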

If you want to do the same thing in-place, just change a = np.zeros_like(a) to a[:] = 0 and pass out=a to the final cumsum call.
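
A minimal sketch of that in-place variant, assuming a is a one-dimensional integer array and ind is sorted and unique. The name fill_cumsum_inplace is hypothetical and chosen only for this example.

```python
import numpy as np

def fill_cumsum_inplace(a, ind, f=lambda x: x + 10, default=-50):
    vals = f(a[ind])     # read the indexed values before a is overwritten
    a[:] = 0             # reuse the caller's buffer instead of allocating
    a[0] = default
    a[ind[0]] = vals[0] - np.sign(ind[0]) * default
    a[ind[1:]] = np.diff(vals)
    return a.cumsum(out=a)   # write the running sum back into a

a = np.arange(10)
fill_cumsum_inplace(a, np.array([2, 5, 7]))
# a -> [-50 -50 12 12 12 15 15 17 17 17]
```

The caller's array is destroyed in the process, so this only makes sense when the original contents of a are no longer needed.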

The two answers are almost the same speed:

a = np.random.randint(1000, size=10000)
ind = np.unique(np.random.randint(10000, size=100))
%timeit fill(a, ind)
43 µs ± 659 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit fill_cumsum(a, ind)
35.6 µs ± 367 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

a = np.random.randint(1000, size=100000)
ind = np.unique(np.random.randint(100000, size=100))
%timeit fill(a, ind)
237 µs ± 592 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit fill_cumsum(a, ind)
245 µs ± 521 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

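The cumsum approach works great for integers, but for floating-point data np.repeat introduces less roundoff error: it copies each value directly, whereas the cumsum approach rebuilds the values from differences computed by np.diff.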
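
The roundoff difference is easiest to see with block values of very different magnitude. The values below are a deliberately extreme, made-up case and assume IEEE 754 double precision; typical data will show far smaller errors.

```python
import numpy as np

vals = np.array([1e16, 1.0])   # hypothetical block values
sizes = np.array([2, 2])

# repeat copies each value verbatim, so nothing is rounded
by_repeat = vals.repeat(sizes)

# diff then cumsum: 1.0 - 1e16 is not representable and rounds to -1e16
deltas = np.zeros(4)
deltas[[0, 2]] = np.diff(vals, prepend=0.0)
by_cumsum = deltas.cumsum()

# by_repeat ends in 1.0, while by_cumsum ends in 0.0 for this input
print(by_repeat[-1], by_cumsum[-1])
```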
