Why does using an array as an index change the shape of a multidimensional ndarray?
I have a 4-D NumPy array with axes, say, x, y, z, t. I want to take the slice corresponding to t=0 and permute the order along the y axis.
I have the following:
import numpy as np
a = np.arange(120).reshape(4,5,3,2)
b = a[:,[1,2,3,4,0],:,0]
b.shape
I get (5, 4, 3) instead of (4,5,3).
When, instead, I enter
aa = a[:,:,:,0]
bb = aa[:,[1,2,3,4,0],:]
bb.shape
I get the expected (4,5,3). Can someone explain why the first version swaps the first two dimensions?
Solution 1:
As @hpaulj mentioned in the comments, this behaviour is because of mixing basic slicing and advanced indexing:
a = np.arange(120).reshape(4,5,3,2)
b = a[:,[1,2,3,4,0],:,0]
In the above code snippet, what happens is the following:
- Within this single __getitem__ call, the integer index 0 along the last dimension picks a single element, so that dimension is gone (i.e. no singleton dimension). Because it appears together with an array index, NumPy treats this integer as an advanced index too, not as a basic slice.
- [1,2,3,4,0] returns 5 slices from the second dimension. Since the two advanced indices (the list and the integer 0) are separated by a slice, there are two possibilities for where to put this dimension of length 5 in the returned array: either at the first or at the last position. NumPy decided to put it at the first position. This is why you get 5 in the first position of the returned shape tuple, (5, ...). Jaime explained this in one of the PyCon talks, if I recall correctly.
- Along the first and third dimensions, since you slice everything using :, the original length along those dimensions is retained.
Putting all these together, NumPy returns the shape tuple as: (5, 4, 3)
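The rule can be checked directly. Below is a small sketch using the same array as in the question; the names idx, c, d and bb are only illustrative. It compares the case where the advanced indices are separated by a slice with cases where they are not, and confirms that b holds the same values as the two-step result with the first two axes swapped.

```python
import numpy as np

a = np.arange(120).reshape(4, 5, 3, 2)
idx = [1, 2, 3, 4, 0]

# Array index and integer index separated by a slice:
# the advanced-index dimension (length 5) is moved to the front.
b = a[:, idx, :, 0]
print(b.shape)  # (5, 4, 3)

# Only one advanced index (last axis kept with a length-1 slice):
# the length-5 dimension stays where the array index was.
c = a[:, idx, :, 0:1]
print(c.shape)  # (4, 5, 3, 1)

# Advanced indices next to each other: the resulting dimension
# replaces them in place instead of moving to the front.
d = a[:, :, [0, 1, 2], 0]
print(d.shape)  # (4, 5, 3)

# Same values as the two-step version, only the first two axes are swapped.
bb = a[:, :, :, 0][:, idx, :]
print(np.array_equal(b, bb.transpose(1, 0, 2)))  # True
```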
You can read more about it at numpy-indexing-ambiguity-in-3d-arrays and arrays.indexing#combining-advanced-and-basic-indexing
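If the goal is to get the (4, 5, 3) result in one expression, a few alternatives seem to work. This is only a sketch, not the one recommended way; the names perm, expected, opt1, opt2 and opt3 are made up for illustration.

```python
import numpy as np

a = np.arange(120).reshape(4, 5, 3, 2)
perm = [1, 2, 3, 4, 0]

# Two-step version from the question, used as the reference result.
expected = a[:, :, :, 0][:, perm, :]

# Option 1: keep the last axis with a length-1 slice (so the list is the
# only advanced index), then drop the singleton axis.
opt1 = a[:, perm, :, 0:1].squeeze(axis=3)

# Option 2: permute along axis 1 with np.take, then use a basic integer index.
opt2 = np.take(a, perm, axis=1)[..., 0]

# Option 3: index as in the question and move the leading axis back.
opt3 = np.moveaxis(a[:, perm, :, 0], 0, 1)

for result in (opt1, opt2, opt3):
    print(result.shape, np.array_equal(result, expected))
```

The two-step indexing from the question is probably the most readable of these; the others mainly show that the axis order, not the data, is what differs.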