How to use a conditional statement based on DataFrame boolean value in pandas

Now I know how to check the dataframe for specific values across multiple columns. However, I cant seem to work out how to carry out an if statement based on a boolean response.

For example:

Walk directories using os.walk and read in a specific file into a dataframe.

for root, dirs, files in os.walk(main):
        filters = '*specificfile.csv'
        for filename in fnmatch.filter(files, filters):
        df = pd.read_csv(os.path.join(root, filename),error_bad_lines=False)

Now checking that dataframe across multiple columns. The first value being the column name (column1), the next value is the specific value I am looking for in that column(banana). I am then checking another column (column2) for a specific value (green). If both of these are true I want to carry out a specific task. However if it is false I want to do something else.

so something like:

if (df['column1']=='banana') & (df['colour']=='green'):
    do something
else: 
    do something

Solution 1:

If you want to check if any row of the DataFrame meets your conditions you can use .any() along with your condition . Example -

if ((df['column1']=='banana') & (df['colour']=='green')).any():

Example -

In [16]: df
Out[16]:
   A  B
0  1  2
1  3  4
2  5  6

In [17]: ((df['A']==1) & (df['B'] == 2)).any()
Out[17]: True

This is because your condition - ((df['column1']=='banana') & (df['colour']=='green')) - returns a Series of True/False values.

This is because in pandas when you compare a series against a scalar value, it returns the result of comparing each row of that series against the scalar value and the result is a series of True/False values indicating the result of comparison of that row with the scalar value. Example -

In [19]: (df['A']==1)
Out[19]:
0     True
1    False
2    False
Name: A, dtype: bool

In [20]: (df['B'] == 2)
Out[20]:
0     True
1    False
2    False
Name: B, dtype: bool

And the & does row-wise and for the two series. Example -

In [18]: ((df['A']==1) & (df['B'] == 2))
Out[18]:
0     True
1    False
2    False
dtype: bool

Now to check if any of the values from this series is True, you can use .any() , to check if all the values in the series are True, you can use .all() .