R function rep() in Python (replicates elements of a list/vector)

The R function rep() replicates the elements of a vector:

> rep(c("A","B"), times=2)
[1] "A" "B" "A" "B"

This is like list multiplication in Python:

>>> ["A","B"]*2
['A', 'B', 'A', 'B']

But with R's rep() it is also possible to specify the number of repeats for each element of the vector:

> rep(c("A","B"), times=c(2,3))
[1] "A" "A" "B" "B" "B"

Is there such a function available in Python? If not, how could one define it? I'm also interested in a similar function for duplicating the rows of an array.


Solution 1:

Use numpy arrays and the numpy.repeat function:

import numpy as np

x = np.array(["A", "B"])
# Repeat "A" twice and "B" three times.
print(np.repeat(x, [2, 3], axis=0))

['A' 'A' 'B' 'B' 'B']
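
Since the question also asks about duplicating rows of an array, here is a minimal sketch of how the same call might be applied to a 2-D array. The array a and the variable names are made up for illustration. np.tile is shown as a rough counterpart to rep(x, times=2), which repeats the whole vector.

```python
import numpy as np

# Hypothetical 2-D array, used only for illustration.
a = np.array([[1, 2],
              [3, 4]])

# Repeat row 0 twice and row 1 three times along axis 0.
rows = np.repeat(a, [2, 3], axis=0)
print(rows)

# Repeat the whole vector twice, similar to rep(c("A","B"), times=2) in R.
x = np.array(["A", "B"])
whole = np.tile(x, 2)
print(whole)
```

Passing a single integer instead of a list, as in np.repeat(a, 2, axis=0), repeats every row the same number of times.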

Solution 2:

Not sure if there's a built-in available for this, but you can try something like this:

>>> lis = ["A", "B"]
>>> times = (2, 3)
>>> sum(([x]*y for x,y in zip(lis, times)),[])
['A', 'A', 'B', 'B', 'B']

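Note that sum() on lists runs in quadratic time, because every addition copies the list accumulated so far, so it's not the recommended way. A linear-time alternative with itertools: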

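>>> from itertools import chain, starmap
>>> from operator import mul
>>> # zip is lazy on Python 3; on Python 2 use itertools.izip instead
>>> list(chain.from_iterable(starmap(mul, zip(lis, times))))
['A', 'A', 'B', 'B', 'B']

This works only because the elements are single-character strings: mul("A", 2) gives "AA", which chain then splits back into characters. For longer strings or other objects, use itertools.repeat in place of mul.

Timing comparisons (figures measured on Python 2, with izip in place of zip):

>>> lis = ["A", "B"] * 1000
>>> times = (2, 3) * 1000
>>> %timeit list(chain.from_iterable(starmap(mul, zip(lis, times))))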
1000 loops, best of 3: 713 µs per loop
>>> %timeit sum(([x]*y for x,y in zip(lis, times)),[])
100 loops, best of 3: 15.4 ms per loop
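
The pieces above could be wrapped into a reusable function. The following is only a sketch: the name rep and its signature are hypothetical, chosen to resemble the R call, and it uses itertools.repeat so that elements of any type (not just single-character strings) are handled.

```python
from itertools import chain, repeat


def rep(items, times):
    """Hypothetical helper loosely mimicking R's rep(x, times=...)."""
    items = list(items)
    if isinstance(times, int):
        # A single integer repeats the whole sequence, like rep(x, times=2).
        return items * times
    times = list(times)
    if len(items) != len(times):
        raise ValueError("items and times must have the same length")
    # repeat(x, n) yields x exactly n times; chain.from_iterable
    # concatenates the pieces in linear time.
    return list(chain.from_iterable(repeat(x, n) for x, n in zip(items, times)))


print(rep(["A", "B"], 2))
print(rep(["A", "B"], [2, 3]))
```

A nested list comprehension such as [x for x, n in zip(items, times) for _ in range(n)] would give the same result without any imports.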