Str attribute error when using .apply() on a pandas series

I have a df as follows:

import pandas as pd

data = {'Internal':  ['unpaid second interator 4,000 USD and $50 third', '35 $ unpaid', "all good"],
        'Name': ['Charlie', 'Rodolf', 'samuel']}

df = pd.DataFrame(data)

print(df)

I would like to run this function with .apply():

def unspent(row):
    if row['Internal'].str.contains('unspent',case=False)==True:
        val="unspent in text"
    else:
        val="unspent is NOT in text"
    return val

and get a table with an additional column:

df['Unspent']=df.apply(unspent,axis=1)

but I get an error instead:

AttributeError: 'str' object has no attribute 'str'

I have tried omitting .str in the unspent function, but then I get another error:

AttributeError: 'str' object has no attribute 'contains'

Solution 1:

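The problem is that str.contains is a pandas Series method, but you are calling it on a plain Python string: with axis=1, apply passes one row at a time, so row['Internal'] is a single str, which has neither a .str accessor nor a .contains method.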
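A quick way to see the difference, as a minimal sketch on a cut-down version of the frame from the question (the variable names column and cell_types are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Internal': ['35 $ unpaid', 'all good'],
                   'Name': ['Rodolf', 'samuel']})

# Selecting the whole column gives a Series, which has the .str accessor.
column = df['Internal']
print(type(column).__name__)   # Series
print(hasattr(column, 'str'))  # True

# With axis=1, apply hands the function one row at a time;
# indexing that row by column name yields a single Python value.
cell_types = df.apply(lambda row: type(row['Internal']).__name__, axis=1)
print(cell_types.tolist())     # ['str', 'str']
```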

What you can do is either replace

if row['Internal'].str.contains('unspent',case=False)==True:

with

if 'unspent' in row['Internal'].lower():

in your function, or call str.contains on the whole df['Internal'] column to get a boolean Series and use np.where (with numpy imported as np) to select the values:

df['Unspent'] = np.where(df['Internal'].str.contains('unspent', case=False), "unspent in text", "unspent is NOT in text")
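For reference, the first option written out as a complete function might look like the sketch below. The .lower() call is there to mimic case=False, since the in operator is case-sensitive. The last row of the data ('Unspent budget' / 'Ana') is a hypothetical addition, not part of the question, included only so that one row actually matches.

```python
import numpy as np
import pandas as pd

data = {'Internal': ['unpaid second interator 4,000 USD and $50 third',
                     '35 $ unpaid', 'all good',
                     'Unspent budget'],                   # hypothetical extra row
        'Name': ['Charlie', 'Rodolf', 'samuel', 'Ana']}   # 'Ana' is hypothetical too
df = pd.DataFrame(data)

def unspent(row):
    # row['Internal'] is a plain str here, so use the `in` operator;
    # lower() makes the check case-insensitive, like case=False.
    if 'unspent' in row['Internal'].lower():
        return "unspent in text"
    return "unspent is NOT in text"

df['Unspent'] = df.apply(unspent, axis=1)

# Vectorised equivalent on the whole column; both should agree row for row.
vectorised = np.where(df['Internal'].str.contains('unspent', case=False),
                      "unspent in text", "unspent is NOT in text")
print((df['Unspent'] == vectorised).all())
```

Both versions give the same labels here. The row-wise apply calls a Python function once per row, so on large frames the vectorised version is usually the faster of the two.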

Output:

                                          Internal     Name  \
0  unpaid second interator 4,000 USD and $50 third  Charlie   
1                                      35 $ unpaid   Rodolf   
2                                         all good   samuel   

                  Unspent  
0  unspent is NOT in text  
1  unspent is NOT in text  
2  unspent is NOT in text