How to count values in a certain range in a NumPy array?
If your array is called a, the number of elements fulfilling 25 < x < 100 is
((25 < a) & (a < 100)).sum()
The expression (25 < a) & (a < 100) results in a Boolean array with the same shape as a, with the value True for all elements that satisfy the condition. Summing over this Boolean array treats True values as 1 and False values as 0.
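As a minimal sketch of the masking-and-summing idea, here is the same expression applied to a small hand-picked array (the array contents are an assumption chosen so the result is easy to verify by eye):

```python
import numpy as np

# Made-up sample data; 25 and 100 sit exactly on the boundaries.
a = np.array([10, 25, 26, 50, 99, 100, 150])

# Element-wise comparison gives a Boolean array with the same shape as a.
mask = (25 < a) & (a < 100)

# Summing counts the True entries, since True is treated as 1 and False as 0.
count = int(mask.sum())

print(mask)   # [False False  True  True  True False False]
print(count)  # 3
```

Only 26, 50 and 99 lie strictly between the bounds, so the boundary values 25 and 100 are not counted.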
You could use numpy.histogram. Here's a basic usage example:
>>> import numpy
>>> a = numpy.random.random(size=100) * 100
>>> numpy.histogram(a, bins=(0.0, 7.3, 22.4, 55.5, 77, 79, 98, 100))
(array([ 8, 14, 34, 31, 0, 12, 1]),
array([ 0. , 7.3, 22.4, 55.5, 77. , 79. , 98. , 100. ]))
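In your particular case, it would look something like this (note that a single bin includes both of its edges, so this counts 25 <= x <= 100 rather than 25 < x < 100):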
>>> numpy.histogram(a, bins=(25, 100))
(array([73]), array([ 25, 100]))
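To see how the edge handling plays out, here is a minimal sketch on a small made-up array with values sitting exactly on 25 and 100. The array contents are an assumption for illustration; with continuous random floats, as in the session above, the difference will rarely show up.

```python
import numpy as np

# Made-up sample data with values exactly on both bin edges.
a = np.array([10, 25, 26, 50, 99, 100, 150])

# A single bin from 25 to 100; the last (here: only) bin is closed on both ends.
hist_counts, edges = np.histogram(a, bins=(25, 100))

# Strict comparison for reference.
strict_count = int(((25 < a) & (a < 100)).sum())

print(hist_counts[0])  # 5 -> counts 25, 26, 50, 99, 100
print(strict_count)    # 3 -> counts 26, 50, 99
```

If the strict inequality matters for your data, the Boolean-mask expression is the safer choice.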
Additionally, when you have a list of strings, you have to explicitly specify the type, so that numpy knows to produce an array of floats instead of an array of strings.
>>> strings = [str(i) for i in range(10)]
>>> numpy.array(strings)
array(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'],
dtype='|S1')
>>> numpy.array(strings, dtype=float)
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
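Putting the conversion and the counting together might look like the following sketch. The bounds 2 and 7 are arbitrary values picked for this illustration, not something from the question.

```python
import numpy as np

# Numbers that arrive as strings, e.g. read from a text file.
strings = [str(i) for i in range(10)]

# Convert explicitly to floats so that numeric comparisons work.
a = np.array(strings, dtype=float)

# Count the values strictly between 2 and 7 (illustrative bounds).
count = int(((2 < a) & (a < 7)).sum())

print(count)  # 4 -> the values 3, 4, 5, 6
```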
Building on Sven's good approach, you can also do the slightly more explicit:
numpy.count_nonzero((25 < a) & (a < 100))
This first creates an array of booleans with one boolean for each input number in array a, and then counts the number of non-False (i.e. True) values (which gives the number of matching numbers).
Note, however, that this approach was twice as slow as Sven's .sum() approach when I measured it on an array of 100k numbers (NumPy 1.6.1, Python 2.7.3): about 300 µs versus 150 µs. Timings like these depend on the NumPy version, so measure on your own setup.
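If you want to compare the two approaches on your own machine, a rough timing sketch could look like this. The helper function names, the seed, and the repeat count are assumptions made for this example, and no particular result is implied.

```python
import timeit

import numpy as np

# 100k random floats in [0, 200); the seed is fixed only for repeatability.
rng = np.random.default_rng(0)
a = rng.random(100_000) * 200


def count_with_sum():
    # Sum over the Boolean mask.
    return ((25 < a) & (a < 100)).sum()


def count_with_count_nonzero():
    # Count the non-False entries of the same mask.
    return np.count_nonzero((25 < a) & (a < 100))


# Both approaches must agree before their speed is worth comparing.
assert count_with_sum() == count_with_count_nonzero()

repeats = 200
for fn in (count_with_sum, count_with_count_nonzero):
    seconds = timeit.timeit(fn, number=repeats) / repeats
    print(f"{fn.__name__}: {seconds * 1e6:.0f} µs per call")
```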
Sven's answer is the way to do it if you don't wish to further process matching values.
The following two examples return copies with only the matching values:
np.compress((25 < a) & (a < 100), a).size
Or:
a[(25 < a) & (a < 100)].size
Example interpreter session:
>>> import numpy as np
>>> a = np.random.randint(200,size=100)
>>> a
array([194, 131, 10, 100, 199, 123, 36, 14, 52, 195, 114, 181, 138,
144, 70, 185, 127, 52, 41, 126, 159, 39, 68, 118, 124, 119,
45, 161, 66, 29, 179, 194, 145, 163, 190, 150, 186, 25, 61,
187, 0, 69, 87, 20, 192, 18, 147, 53, 40, 113, 193, 178,
104, 170, 133, 69, 61, 48, 84, 121, 13, 49, 11, 29, 136,
141, 64, 22, 111, 162, 107, 33, 130, 11, 22, 167, 157, 99,
59, 12, 70, 154, 44, 45, 110, 180, 116, 56, 136, 54, 139,
26, 77, 128, 55, 143, 133, 137, 3, 83])
>>> np.compress((25 < a) & (a < 100),a).size
34
>>> a[(25 < a) & (a < 100)].size
34
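When the matching values themselves are needed afterwards, keeping the filtered copy around avoids building the mask twice. A minimal sketch on a small made-up array (the data and the follow-up mean calculation are assumptions for illustration):

```python
import numpy as np

# Made-up sample data.
a = np.array([10, 25, 26, 50, 99, 100, 150])

mask = (25 < a) & (a < 100)

# Both forms return a new array holding only the matching values.
matching = a[mask]
matching_compress = np.compress(mask, a)

print(matching)       # [26 50 99]
print(matching.size)  # 3

# Further processing on the matches, e.g. their mean.
print(matching.mean())

# The result is a copy: changing it leaves the original array untouched.
matching[0] = -1
print(a[2])  # 26
```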
The above examples use a "bit-wise and" (&) to combine, element by element, the two boolean arrays created by the comparisons.
Another way to write Sven's excellent answer, for example, is:
np.bitwise_and(25 < a, a < 100).sum()
The boolean arrays contain True values when the condition matches, and False when it doesn't.
A bonus aspect of boolean values is that True is equivalent to 1 and False to 0.
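The different spellings can be checked against each other in a short sketch. The sample array is made up for illustration, and np.logical_and is included as one more equivalent for boolean inputs. The last part shows why the parentheses around each comparison matter when using the & operator.

```python
import numpy as np

# Made-up sample data.
a = np.array([10, 25, 26, 50, 99, 100, 150])

# Three equivalent ways to combine the two boolean arrays.
n_operator = int(((25 < a) & (a < 100)).sum())
n_bitwise = int(np.bitwise_and(25 < a, a < 100).sum())
n_logical = int(np.logical_and(25 < a, a < 100).sum())

print(n_operator, n_bitwise, n_logical)  # 3 3 3

# True behaves like 1 and False like 0 in arithmetic.
print(True + True)  # 2

# Without parentheses, & binds tighter than <, so Python reads
# 25 < (a & a) < 100, a chained comparison that needs a single
# truth value for a whole array and therefore raises ValueError.
try:
    25 < a & a < 100
    raised = False
except ValueError:
    raised = True

print(raised)  # True
```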