Split list into separate but overlapping chunks
Let's say I have a list A
A = [1,2,3,4,5,6,7,8,9,10]
I would like to create a new list (say B) using the above list in the following order.
B = [[1,2,3], [3,4,5], [5,6,7], [7,8,9], [9,10,]]
That is, the first 3 numbers are A[0], A[1], A[2], the second 3 numbers are A[2], A[3], A[4], and so on.
I believe there is a function in numpy for this kind of operation.
Simply use Python's built-in list comprehension with list-slicing to do this:
>>> A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> size = 3
>>> step = 2
>>> A = [A[i : i + size] for i in range(0, len(A), step)]
This gives you what you're looking for:
>>> A
[[1, 2, 3], [3, 4, 5], [5, 6, 7], [7, 8, 9], [9, 10]]
But you'll have to write a couple of extra lines to make sure that your code doesn't break for unexpected values of size/step (for example, a step of 0 makes range raise a ValueError).
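A minimal sketch of what those extra lines might look like, wrapped in a helper. The name overlapping_chunks and the full_only flag are made up for illustration; they are not part of the standard library:

```python
def overlapping_chunks(seq, size, step, full_only=False):
    """Return slices of `seq` of length `size`, starting every `step` items.

    Hypothetical helper: full_only=True drops trailing chunks shorter than `size`.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive integers")
    # When only full-length chunks are wanted, stop before a slice would run short.
    stop = len(seq) - size + 1 if full_only else len(seq)
    return [seq[i:i + size] for i in range(0, stop, step)]


A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
chunks = overlapping_chunks(A, size=3, step=2)
full_chunks = overlapping_chunks(A, size=3, step=2, full_only=True)
```

With the default, the short trailing chunk [9, 10] is kept, as in the output above; with full_only=True it is dropped.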
The 'duplicate' Partition array into N chunks with Numpy suggests np.split; that's fine for non-overlapping splits. The example (added after the close?) overlaps, one element across each subarray. Plus it pads with a 0.
How do you split a list into evenly sized chunks? has some good list answers, with various forms of generator or list comprehension, but at first glance I didn't see any that allow for overlaps, though with a clever use of iterators (such as itertools.tee) that should be possible.
We can blame this on poor question wording, but it is not a duplicate.
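For what it's worth, here is a rough sketch of how the itertools.tee idea might look. The function name sliding_windows is my own, and unlike the padded example it only yields full windows (as tuples):

```python
from itertools import islice, tee


def sliding_windows(iterable, size, step=1):
    """Return an iterator of `size`-tuples whose start advances by `step` items."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive integers")
    copies = tee(iterable, size)
    # Advance the k-th copy by k items so that zip lines up consecutive elements.
    for offset, it in enumerate(copies):
        for _ in range(offset):
            next(it, None)
    # zip yields every window of consecutive items; keep every `step`-th one.
    return islice(zip(*copies), 0, None, step)


windows = list(sliding_windows([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], size=3, step=2))
```

This works on any iterable, not just lists, at the cost of building the intermediate windows that the step then skips.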
Working from the example and the comment:
Here my window size is 3, i.e. each split list should have 3 elements, so the first split is [1,2,3]. The step size is 2, so the second split should start from the 3rd element, which makes the 2nd split [3,4,5].
Here is an advanced solution using as_strided:
In [64]: ast=np.lib.stride_tricks.as_strided # shorthand
In [65]: A=np.arange(1,12)
In [66]: ast(A,shape=[5,3],strides=(8,4))
Out[66]:
array([[ 1, 2, 3],
[ 3, 4, 5],
[ 5, 6, 7],
[ 7, 8, 9],
[ 9, 10, 11]])
I increased the range of A because I didn't want to deal with the 0 pad.
Choosing the target shape is easy, 5 sets of 3. Choosing the strides requires more knowledge about striding.
In [69]: A.strides
Out[69]: (4,)
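The 1d striding, or stepping from one element to the next, is 4 bytes (the length of one element; this array has a 4-byte integer dtype). The step from one row to the next is 2 elements of the original, or 2*4 bytes. With an 8-byte dtype such as int64 the strides would be (16,8) instead.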
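The hard-coded (8,4) strides only hold for 4-byte elements. A sketch that derives the shape and strides from the array itself, so it doesn't depend on the dtype (the names size, step and n_rows are just for illustration):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

A = np.arange(1, 12)
size, step = 3, 2

# Byte distance between neighbouring elements of the 1d array.
(item_stride,) = A.strides

# Number of full windows that fit without reading past the end of the buffer.
n_rows = (A.size - size) // step + 1

windows = as_strided(
    A,
    shape=(n_rows, size),
    strides=(step * item_stride, item_stride),
)
```

Computing n_rows this way keeps every row inside the original buffer.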
as_strided produces a view. Thus changing an element in it will affect the original, and may change overlapping values. Add .copy() to make a copy; math with the strided array will also produce a copy.
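A small sketch of that view behaviour, with the strides taken from the array so it runs regardless of the integer size (variable names are illustrative):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

A = np.arange(1, 12)
s = A.strides[0]
view = as_strided(A, shape=(5, 3), strides=(2 * s, s))

# An independent copy, taken before the view is modified.
safe = view.copy()

# Writes through to A[2], which row 1 of the view also uses as its first element.
view[0, 2] = 99
```

After the assignment, A[2] and view[1, 0] both read 99, while safe still holds the original 3.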
Changing the strides can give non-overlapping rows, but be careful about the shape: it is possible to access values outside of the original data buffer (the 17 in the first example below is such a value).
In [82]: ast(A,shape=[4,3],strides=(12,4))
Out[82]:
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 17]])
In [84]: ast(A,shape=[3,3],strides=(16,4))
Out[84]:
array([[ 1, 2, 3],
[ 5, 6, 7],
[ 9, 10, 11]])
edit
A new function gives a safer version of as_strided:
np.lib.stride_tricks.sliding_window_view(np.arange(1,10),3)[::2]
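Spelled out for the 10-element list from the question, as a sketch assuming a NumPy version that includes sliding_window_view (it was added in 1.20):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

A = np.arange(1, 11)
size, step = 3, 2

# All windows of `size` consecutive elements, then every `step`-th window.
windows = sliding_window_view(A, size)[::step]
```

Only full windows are produced, so the trailing [9, 10] is left out, and the result is a read-only view by default.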