Repeat each value of an array a different number of times
Suppose a = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6] and s = [3, 3, 9, 3, 6, 3]. I'm looking for the best way to repeat a[i] exactly s[i] times and then get a flattened array of the form b = [0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3, 0.3, ... ].
I want to do this as fast as possible since I have to do it many times. I'm using Python and NumPy, and the arrays are numpy.ndarray objects. I searched around and found repeat, tile and column_stack, which work nicely for repeating each element n times, but I want to repeat each element a different number of times.
One way to do this is:
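import numpy as np
parts = np.hsplit(a, 6)  # list of six one-element arrays
for i in range(len(parts)):
    parts[i] = np.repeat(parts[i], s[i])
b = np.concatenate(parts)  # a list has no .flatten(), so join the pieces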
I am wondering if there is a better way to do it.
That's exactly what numpy.repeat does:
>>> a = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
>>> s = np.array([3, 3, 9, 3, 6, 3])
>>> np.repeat(a, s)
array([ 0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3, 0.3, 0.3,
0.3, 0.3, 0.3, 0.3, 0.4, 0.4, 0.4, 0.5, 0.5, 0.5, 0.5,
0.5, 0.5, 0.6, 0.6, 0.6])
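As a quick sanity check, the sketch below compares np.repeat with per-element counts against an explicit loop. The helper name repeat_loop is hypothetical and exists only for this illustration; it is not part of NumPy.

```python
import numpy as np

def repeat_loop(values, counts):
    """Hypothetical reference version: repeat values[i] counts[i] times."""
    pieces = [np.full(n, v) for v, n in zip(values, counts)]
    return np.concatenate(pieces)

a = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])
s = np.array([3, 3, 9, 3, 6, 3])

b = np.repeat(a, s)  # counts are given per element
same = np.array_equal(b, repeat_loop(a, s))
```

The result has s.sum() elements, so the output length is known before the call.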
In pure Python you can do something like this (Python 2 here, since itertools.imap was removed in Python 3):
>>> from itertools import repeat, chain, imap
>>> list(chain.from_iterable(imap(repeat, a, s)))
[0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.4, 0.4, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.6, 0.6, 0.6]
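On Python 3 the same idea can be sketched with the built-in map, which is already lazy there. This is an adaptation of the snippet above, assuming a and s are plain lists:

```python
from itertools import chain, repeat

a = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
s = [3, 3, 9, 3, 6, 3]

# map(repeat, a, s) lazily yields repeat(a[i], s[i]) for each pair,
# and chain.from_iterable flattens those iterators into one stream.
b = list(chain.from_iterable(map(repeat, a, s)))
```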
But of course it is going to be way slower than its NumPy equivalent (about 6x in the timings below):
>>> s = [3, 3, 9, 3, 6, 3]*1000
>>> a = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]*1000
>>> %timeit list(chain.from_iterable(imap(repeat, a, s)))
1000 loops, best of 3: 1.21 ms per loop
>>> %timeit np.repeat(a_a, s_a)  # a_a and s_a are NumPy arrays of the same size as a and s
10000 loops, best of 3: 202 µs per loop
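To rerun the comparison outside IPython, something like the following sketch with the standard timeit module should work. It assumes Python 3 (so map replaces imap) and assumes a_a and s_a are built with np.array, which the timings above do not show; absolute numbers will vary by machine.

```python
import timeit
from itertools import chain, repeat

import numpy as np

a = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6] * 1000
s = [3, 3, 9, 3, 6, 3] * 1000
a_a = np.array(a)  # assumed conversion, not shown in the original timings
s_a = np.array(s)

def pure_python():
    return list(chain.from_iterable(map(repeat, a, s)))

def with_numpy():
    return np.repeat(a_a, s_a)

# Best of 3 runs, 100 calls each, reported per call.
t_py = min(timeit.repeat(pure_python, number=100, repeat=3)) / 100
t_np = min(timeit.repeat(with_numpy, number=100, repeat=3)) / 100
print("pure Python: %.1f us per call" % (t_py * 1e6))
print("NumPy:       %.1f us per call" % (t_np * 1e6))
```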