Converting subarray index into original array index with Numpy
In this code snippet:
import numpy as np

def f(b, i):
    # calculate original index
    return j

a = np.random.rand(N, M)
b = a[:, m]
j = f(b, i)
assert b[i] == a[j]
I would like the function f to find an index that satisfies the assertion on the last line. Indexing with j doesn't have to use the a[j] syntax.
Solution 1:
Consider a sample array:
In [213]: a = np.arange(12).reshape(3,4)
In [214]: a
Out[214]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
And b taken from that:
In [215]: b = a[:,2]
In [216]: b
Out[216]: array([ 2, 6, 10])
In [217]: b[1]
Out[217]: 6
Everything numpy knows about a is in:
In [218]: a.__array_interface__
Out[218]:
{'data': (51087664, False),
'strides': None, # a.strides is (32,8)
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (3, 4),
'version': 3}
b is a view of a, with the corresponding:
In [219]: b.__array_interface__
Out[219]:
{'data': (51087680, False),
'strides': (32,),
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (3,),
'version': 3}
The base for b is the original arange:
In [221]: b.base
Out[221]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
In [222]: b.base.__array_interface__
Out[222]:
{'data': (51087664, False),
'strides': None,
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (12,),
'version': 3}
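The relationship between a, b and the base can be checked directly. This is a small sketch using the same sample array as above; np.shares_memory is a standard numpy function, and nothing here depends on the particular memory addresses shown in the session.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # a is itself a view of the 1-D arange
b = a[:, 2]                       # b is a view of a

# b.base points at the array that owns the memory, not at a.
base_is_a = b.base is a
base_is_owner = b.base is a.base
shared = np.shares_memory(a, b)

print(base_is_a, base_is_owner, shared)
```

So b.base tells you which buffer b lives in, but not which intermediate array (here a) it was sliced from.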
Your f could compare the data address of b with that of its base to get the offset at which b "starts". Dividing the byte difference by the item size (8) gives the start in elements:
In [223]: (51087680-51087664)/8
Out[223]: 2.0
In [224]: b[0]
Out[224]: 2
In [225]: b.base[2]
Out[225]: 2
In [226]: a.ravel()[2]
Out[226]: 2
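The same offset can be computed without copying addresses by hand. This is a minimal sketch, assuming b is a view with the same dtype as its base; the variable names are only for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = a[:, 2]

# Address of the first element of each array's data.
b_addr = b.__array_interface__['data'][0]
base_addr = b.base.__array_interface__['data'][0]

# Byte distance from the start of the base to the start of b,
# converted to a number of elements.
offset = (b_addr - base_addr) // b.itemsize

print(offset)           # 2
print(b.base[offset])   # 2, the same value as b[0]
```

Integer division is used so that the result can be used directly as an index.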
Since b.strides is (32,) and the dtype is i8 (8 bytes per element), we can guess/deduce that the other stride of a is 8. b[1] will then be 32/8 = 4 elements beyond the start of b:
In [227]: b[1]
Out[227]: 6
In [228]: b.base[2+4]
Out[228]: 6
If we also deduce that the shape of a is (3,4) (the 4 follows from the base length of 12 and the b shape of (3,)):
In [229]: np.unravel_index(6,(3,4))
Out[229]: (1, 2)
In [230]: a[1,2]
Out[230]: 6
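Put together, the deductions above look roughly like this. It is a sketch for this specific example only: it assumes b is a single column of a 2-D, C-contiguous array that spans the whole base, which is what makes the "12 elements / 3 rows = 4 columns" guess valid.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = a[:, 2]
i = 1

itemsize = b.itemsize
# Where b starts inside its base, in elements.
start = (b.__array_interface__['data'][0]
         - b.base.__array_interface__['data'][0]) // itemsize   # 2
# How many base elements lie between consecutive items of b.
step = b.strides[0] // itemsize                                  # 4
flat = start + i * step                                          # 6

# Deduced shape: the base has 12 elements and b has 3, so 4 columns.
ncols = b.base.size // b.shape[0]
j = np.unravel_index(flat, (b.shape[0], ncols))

assert b[i] == a[j]
```

If a were itself a slice of a larger array, the column count deduced this way would be wrong, which is why these steps are not robust in general.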
I'll let you clean things up and decide for yourself whether these deductions and calculations are robust enough for your purposes.
alt a
If a is its own base (not a view of something else), b.base will be a itself, and we don't have to make deductions about its strides and shape:
In [231]: a = np.arange(12).reshape(3,4).copy()
In [232]: a.__array_interface__
Out[232]:
{'data': (51778608, False),
'strides': None,
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (3, 4),
'version': 3}
In [233]: b = a[:,2]
In [234]: b.__array_interface__
Out[234]:
{'data': (51778624, False),
'strides': (32,),
'descr': [('', '<i8')],
'typestr': '<i8',
'shape': (3,),
'version': 3}
In [235]: b.base
Out[235]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
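In this case the f from the question can be sketched as follows. It assumes that b is a 1-D view, that b.base is C-contiguous, and that both have the same dtype; the returned index refers to b.base, which is a only when a owns its data.

```python
import numpy as np

def f(b, i):
    """Index into b.base of the element b[i], for a 1-D view b."""
    a = b.base
    if a is None:
        raise ValueError("b owns its data, so it is not a view")
    if not a.flags.c_contiguous:
        raise ValueError("this sketch assumes a C-contiguous base")
    # Byte distance from the start of the base to element b[i].
    byte_offset = (b.__array_interface__['data'][0]
                   - a.__array_interface__['data'][0]
                   + i * b.strides[0])
    # Convert bytes to a flat element position, then to a tuple index.
    return np.unravel_index(byte_offset // a.itemsize, a.shape)

a = np.arange(12).reshape(3, 4).copy()   # a owns its data
b = a[:, 2]
for i in range(len(b)):
    j = f(b, i)
    assert b[i] == a[j]
```

Because the calculation uses the actual stride of b, it should also cope with stepped or reversed slices such as a[::-1, 2].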
There was a similar question recently, and someone went to the trouble of packaging these calculations in a function. I don't have a link to that, but it shouldn't be hard to find. In any case, there isn't a simple numpy function call that will do this for you.
Here's the previous SO question: "given the index of an item in a view of a numpy array, find its index in the base array"