How to convert "bytes" objects into ordinary strings in a pandas DataFrame (Python 3.x)?
Solution 1:
You can use vectorised str.decode
to decode byte strings into ordinary strings:
df['COLUMN1'] = df['COLUMN1'].str.decode("utf-8")  # str.decode returns a new Series, so assign it back
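To do this for multiple columns, first select just the object-dtype columns, which is where bytes values are stored (use the built-in object here, since np.object was removed in NumPy 1.24):
str_df = df.select_dtypes([object])
Then convert all of them: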
str_df = str_df.stack().str.decode('utf-8').unstack()
You can then replace the original df columns with the converted ones:
for col in str_df:
    df[col] = str_df[col]
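Putting the steps of Solution 1 together, a minimal end-to-end sketch might look like the following. The sample DataFrame and its column names are made up for illustration, and it assumes every object-dtype column holds only bytes values.

```python
import pandas as pd

# Hypothetical sample data: two bytes columns and one numeric column.
df = pd.DataFrame({
    "COLUMN1": [b"alpha", b"beta"],
    "COLUMN2": [b"x", b"y"],
    "COLUMN3": [1, 2],
})

# Select the object-dtype columns (bytes values are stored as object).
str_df = df.select_dtypes([object])

# Stack into one long Series, decode it in a single vectorised call,
# then unstack back into the original column layout.
str_df = str_df.stack().str.decode("utf-8").unstack()

# Write the decoded columns back into the original DataFrame.
for col in str_df:
    df[col] = str_df[col]

print(df)
```

The numeric column is never selected, so it passes through unchanged.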
Solution 2:
Combining the answers by @EdChum and @Yu Zhou, a simpler solution would be:
for col, dtype in df.dtypes.items():
    if dtype == object:  # Object-dtype columns only; assumes every value in them is bytes.
        df[col] = df[col].apply(lambda x: x.decode("utf-8"))
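One caveat with the loop above: an object column can also hold ordinary str values or missing values, and calling .decode on those raises AttributeError. A hedged variant that skips anything that is not bytes could look like this; the helper name decode_bytes_columns and the sample data are hypothetical.

```python
import pandas as pd


def decode_bytes_columns(df, encoding="utf-8"):
    """Return a copy of df with bytes values in object columns decoded."""
    out = df.copy()
    for col, dtype in out.dtypes.items():
        if dtype == object:
            # Decode only bytes values; leave str, None, etc. untouched.
            out[col] = out[col].apply(
                lambda x: x.decode(encoding) if isinstance(x, bytes) else x
            )
    return out


# Hypothetical sample data mixing bytes, a missing value, str, and ints.
df = pd.DataFrame({
    "name": [b"alice", b"bob", None],
    "city": ["Paris", "Rome", "Oslo"],
    "n": [1, 2, 3],
})
decoded = decode_bytes_columns(df)
print(decoded)
```

Working on a copy keeps the original DataFrame intact, which makes it easy to compare before and after.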