Why does using an array as an index change the shape of a multidimensional ndarray?

I have a 4-D NumPy array with axes, say, x, y, z, t. I want to take the slice corresponding to t=0 and permute the order along the y axis.

I have the following

import numpy as np
a = np.arange(120).reshape(4,5,3,2)
b = a[:,[1,2,3,4,0],:,0]
b.shape

I get (5, 4, 3) instead of (4,5,3).

When, instead, I enter

aa = a[:,:,:,0]
bb = aa[:,[1,2,3,4,0],:]
bb.shape

I get the expected (4,5,3). Can someone explain why the first version swaps the first two dimensions?


Solution 1:

As @hpaulj mentioned in the comments, this behaviour is because of mixing basic slicing and advanced indexing:

a = np.arange(120).reshape(4,5,3,2)
b = a[:,[1,2,3,4,0],:,0]

In the above code snippet, what happens is the following:

  • The integer 0 on the last axis selects a single entry, so that axis is gone (i.e. no singleton dimension). Because the expression also contains an index array, NumPy treats this scalar as an advanced index as well and broadcasts it together with [1,2,3,4,0].
  • [1,2,3,4,0] picks 5 entries along the second axis, so the two advanced indices broadcast to shape (5,). They are separated by the slice : on the third axis, so there is no unambiguous place for this dimension in the result. In that case NumPy puts the advanced-indexing dimension first. This is why you get 5 in the first position of the returned shape tuple. Jaime explained this in one of the PyCon talks, if I recall correctly.

  • Along the first and third axes, since you slice everything using :, the original lengths (4 and 3) are retained, and they follow the advanced-indexing dimension in their original order.

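Putting all these together, NumPy returns the shape tuple as: (5, 4, 3). In your second version, aa[:,[1,2,3,4,0],:] contains only one advanced index, so its dimension stays where it was and you get (4, 5, 3).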
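The placement rule can be checked directly. Here is a minimal sketch reusing the array from the question; the names perm, separated, adjacent and single are only illustrative and not from the original post:

```python
import numpy as np

a = np.arange(120).reshape(4, 5, 3, 2)
perm = [1, 2, 3, 4, 0]

# Two advanced indices (the list and the integer 0) separated by a slice:
# the broadcast dimension of length 5 is moved to the front.
separated = a[:, perm, :, 0]
print(separated.shape)  # (5, 4, 3)

# Two advanced indices next to each other: the broadcast dimension
# stays where the indices were.
adjacent = a[:, perm, 0, :]
print(adjacent.shape)  # (4, 5, 2)

# One advanced index per indexing step: nothing has to be moved.
single = a[:, :, :, 0][:, perm, :]
print(single.shape)  # (4, 5, 3)

# Same data in both cases, only the first two axes are swapped.
print(np.array_equal(separated, single.transpose(1, 0, 2)))  # True
```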

You can read more about it in the question numpy-indexing-ambiguity-in-3d-arrays and in the NumPy documentation at arrays.indexing#combining-advanced-and-basic-indexing.
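If you want the (4, 5, 3) layout in a single expression, a few options should work. This is a sketch rather than the one recommended way; the variable names are illustrative, and which option reads best is a matter of taste:

```python
import numpy as np

a = np.arange(120).reshape(4, 5, 3, 2)
perm = [1, 2, 3, 4, 0]

# Reference result: the two-step version from the question.
expected = a[:, :, :, 0][:, perm, :]

# Option 1: drop the last axis first, then permute the second axis.
opt1 = a[..., 0][:, perm]

# Option 2: np.take along an explicit axis, which avoids mixing index kinds.
opt2 = np.take(a[..., 0], perm, axis=1)

# Option 3: use a length-1 slice for t so it stays a basic index,
# then remove the singleton axis afterwards.
opt3 = a[:, perm, :, 0:1].squeeze(axis=-1)

# Option 4: keep the mixed index and swap the first two axes back.
opt4 = a[:, perm, :, 0].transpose(1, 0, 2)

for result in (opt1, opt2, opt3, opt4):
    print(result.shape, np.array_equal(result, expected))
```

Each line printed should show the shape (4, 5, 3) and confirm that the result matches the two-step version.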