Convert Pandas series containing string to boolean

Solution 1:

You can just use map:

In [7]: df = pd.DataFrame({'Status':['Delivered', 'Delivered', 'Undelivered',
                                     'SomethingElse']})

In [8]: df
Out[8]:
          Status
0      Delivered
1      Delivered
2    Undelivered
3  SomethingElse

In [9]: d = {'Delivered': True, 'Undelivered': False}

In [10]: df['Status'].map(d)
Out[10]:
0     True
1     True
2    False
3      NaN
Name: Status, dtype: object

Solution 2:

An example of using the replace method to replace values only in the specified column C2. The result is returned as a new DataFrame:

import pandas as pd
df = pd.DataFrame({'C1':['X', 'Y', 'X', 'Y'], 'C2':['Y', 'Y', 'X', 'X']})

  C1 C2
0  X  Y
1  Y  Y
2  X  X
3  Y  X

df.replace({'C2': {'X': True, 'Y': False}})

  C1     C2
0  X  False
1  Y  False
2  X   True
3  Y   True
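
Depending on the pandas version, the replaced column may come back with object dtype rather than bool. The following is a minimal sketch, reusing the same C1/C2 frame, that makes the dtype explicit; the variable name out is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"C1": ["X", "Y", "X", "Y"], "C2": ["Y", "Y", "X", "X"]})

# replace returns a new DataFrame; df itself is left unchanged.
out = df.replace({"C2": {"X": True, "Y": False}})

# Make the boolean dtype explicit instead of relying on type inference.
# This is safe here because every value in C2 has a key in the mapping.
out["C2"] = out["C2"].astype(bool)

print(out.dtypes)
```

Only use astype(bool) when every value was replaced; any leftover non-empty string would silently become True.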

Solution 3:

You've got everything you need. You'll be happy to discover replace:

df.replace(d)

Solution 4:

Expanding on the previous answers:

Map method explained:

  • Pandas will look up each row's value in the dictionary d, replacing any found keys with the corresponding values from d.
  • Values without keys in d will be set to NaN. This can be corrected with fillna().
  • Does not work on multiple columns at once, since the dictionary lookup is a pd.Series method; apply it to each column separately.
  • Documentation: pd.Series.map
d = {'Delivered': True, 'Undelivered': False}
df["Status"].map(d)
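
As a rough sketch of the fillna() point above: the mapped result has object dtype because of the NaN, so it usually needs one more step to become a real boolean column. Treating unmatched statuses as False is an assumption made for this example, not something the data dictates.

```python
import pandas as pd

df = pd.DataFrame(
    {"Status": ["Delivered", "Delivered", "Undelivered", "SomethingElse"]}
)
d = {"Delivered": True, "Undelivered": False}

# "SomethingElse" has no key in d, so it becomes NaN.
mapped = df["Status"].map(d)

# Option A: keep the gap as a missing value, using the nullable boolean dtype.
nullable = mapped.astype("boolean")

# Option B (assumption): treat every unmatched status as False.
strict = nullable.fillna(False).astype(bool)

print(nullable)
print(strict)
```

Option A preserves the information that a value was unrecognised; Option B gives a plain bool column at the cost of hiding it.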

Replace method explained:

  • Pandas will look up each row's value in the dictionary d, and attempt to replace any found keys with the corresponding values from d.
  • Values without keys in d will be retained.
  • Works with single and multiple columns (pd.Series or pd.DataFrame objects).
  • Documentation: pd.DataFrame.replace
d = {'Delivered': True, 'Undelivered': False}
df["Status"].replace(d)
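
To illustrate the points above, here is a small sketch of replace applied to a whole DataFrame. The second column, Previous, is hypothetical and was added only to show the multi-column behaviour.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Status": ["Delivered", "Undelivered", "SomethingElse"],
        "Previous": ["Undelivered", "Delivered", "Delivered"],  # hypothetical column
    }
)
d = {"Delivered": True, "Undelivered": False}

# The same dictionary is applied to every column of the frame.
whole = df.replace(d)

# "SomethingElse" has no key in d, so it is kept as-is instead of becoming NaN.
print(whole["Status"].tolist())
print(whole["Previous"].tolist())
```

Because unmatched strings survive, a column such as Status ends up holding a mix of booleans and strings, so check for leftovers before casting it to bool.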

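Overall, the replace method is the more flexible of the two: it works on whole DataFrames as well as single columns, and it keeps unmatched values instead of turning them into NaN, so you decide how missing or unrecognised values are handled.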