Find integer index of rows with NaN in pandas dataframe
I have a pandas DataFrame like this:
a b
2011-01-01 00:00:00 1.883381 -0.416629
2011-01-01 01:00:00 0.149948 -1.782170
2011-01-01 02:00:00 -0.407604 0.314168
2011-01-01 03:00:00 1.452354 NaN
2011-01-01 04:00:00 -1.224869 -0.947457
2011-01-01 05:00:00 0.498326 0.070416
2011-01-01 06:00:00 0.401665 NaN
2011-01-01 07:00:00 -0.019766 0.533641
2011-01-01 08:00:00 -1.101303 -1.408561
2011-01-01 09:00:00 1.671795 -0.764629
Is there an efficient way to find the "integer" index of rows with NaNs? In this case the desired output should be [3, 6]
.
Here is a simpler solution:
inds = pd.isnull(df).any(1).nonzero()[0]
In [9]: df
Out[9]:
0 1
0 0.450319 0.062595
1 -0.673058 0.156073
2 -0.871179 -0.118575
3 0.594188 NaN
4 -1.017903 -0.484744
5 0.860375 0.239265
6 -0.640070 NaN
7 -0.535802 1.632932
8 0.876523 -0.153634
9 -0.686914 0.131185
In [10]: pd.isnull(df).any(1).nonzero()[0]
Out[10]: array([3, 6])
For DataFrame df
:
import numpy as np
index = df['b'].index[df['b'].apply(np.isnan)]
will give you back the MultiIndex
that you can use to index back into df
, e.g.:
df['a'].ix[index[0]]
>>> 1.452354
For the integer index:
df_index = df.index.values.tolist()
[df_index.index(i) for i in index]
>>> [3, 6]