Replacing Numpy elements if condition is met
I have a large numpy array that I need to manipulate so that each element is changed to either a 1 or 0 if a condition is met (will be used as a pixel mask later). There are about 8 million elements in the array and my current method takes too long for the reduction pipeline:
for (y,x), value in numpy.ndenumerate(mask_data):
if mask_data[y,x]<3: #Good Pixel
mask_data[y,x]=1
elif mask_data[y,x]>3: #Bad Pixel
mask_data[y,x]=0
Is there a numpy function that would speed this up?
Solution 1:
>>> import numpy as np
>>> a = np.random.randint(0, 5, size=(5, 4))
>>> a
array([[4, 2, 1, 1],
[3, 0, 1, 2],
[2, 0, 1, 1],
[4, 0, 2, 3],
[0, 0, 0, 2]])
>>> b = a < 3
>>> b
array([[False, True, True, True],
[False, True, True, True],
[ True, True, True, True],
[False, True, True, False],
[ True, True, True, True]], dtype=bool)
>>>
>>> c = b.astype(int)
>>> c
array([[0, 1, 1, 1],
[0, 1, 1, 1],
[1, 1, 1, 1],
[0, 1, 1, 0],
[1, 1, 1, 1]])
You can shorten this with:
>>> c = (a < 3).astype(int)
Solution 2:
>>> a = np.random.randint(0, 5, size=(5, 4))
>>> a
array([[0, 3, 3, 2],
[4, 1, 1, 2],
[3, 4, 2, 4],
[2, 4, 3, 0],
[1, 2, 3, 4]])
>>>
>>> a[a > 3] = -101
>>> a
array([[ 0, 3, 3, 2],
[-101, 1, 1, 2],
[ 3, -101, 2, -101],
[ 2, -101, 3, 0],
[ 1, 2, 3, -101]])
>>>
See, eg, Indexing with boolean arrays.
Solution 3:
The quickest (and most flexible) way is to use np.where, which chooses between two arrays according to a mask(array of true and false values):
import numpy as np
a = np.random.randint(0, 5, size=(5, 4))
b = np.where(a<3,0,1)
print('a:',a)
print()
print('b:',b)
which will produce:
a: [[1 4 0 1]
[1 3 2 4]
[1 0 2 1]
[3 1 0 0]
[1 4 0 1]]
b: [[0 1 0 0]
[0 1 0 1]
[0 0 0 0]
[1 0 0 0]
[0 1 0 0]]
Solution 4:
You can create your mask array in one step like this
mask_data = input_mask_data < 3
This creates a boolean array which can then be used as a pixel mask. Note that we haven't changed the input array (as in your code) but have created a new array to hold the mask data - I would recommend doing it this way.
>>> input_mask_data = np.random.randint(0, 5, (3, 4))
>>> input_mask_data
array([[1, 3, 4, 0],
[4, 1, 2, 2],
[1, 2, 3, 0]])
>>> mask_data = input_mask_data < 3
>>> mask_data
array([[ True, False, False, True],
[False, True, True, True],
[ True, True, False, True]], dtype=bool)
>>>