How to replace a dataframe text column with only the first occurring word (the text before the first comma)
The dataframe looks like this:
Name | UID | search_text |
---|---|---|
B | 14 | kj |
S | 2 | hsa,isd |
D | 10 | sa,ad,ad |
E | 99 | pid, pd,dd,ef |
G | 8 | dd |
I want each search_text value to be stripped and replaced with the first word before the comma (I don't want to map and replace the values manually). The result would look like this:
Name | UID | search_text |
---|---|---|
B | 14 | kj |
S | 2 | hsa |
D | 10 | sa |
E | 99 | pid |
G | 8 | dd |
Is there any convenient way to do that?
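Extract the leading run of word characters (letters, digits, underscore) from each string. The raw string avoids an invalid-escape warning for \w, and expand=False returns a Series instead of a one-column DataFrame:
df['search_text'] = df['search_text'].str.extract(r'^(\w+)', expand=False)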
Name UID search_text
0 B 14 kj
1 S 2 hsa
2 D 10 sa
3 E 99 pid
4 G 8 dd
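Note that `\w+` stops at the first non-word character, so a value such as `foo bar,baz` would yield `foo` rather than `foo bar`. If everything before the first comma should be kept, a pattern like `^([^,]+)` is one possible alternative. Below is a minimal self-contained sketch; the sample frame is rebuilt from the question, and the variable name `first` is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Name": ["B", "S", "D", "E", "G"],
    "UID": [14, 2, 10, 99, 8],
    "search_text": ["kj", "hsa,isd", "sa,ad,ad", "pid, pd,dd,ef", "dd"],
})

# Capture everything up to (but not including) the first comma.
# expand=False returns a Series rather than a one-column DataFrame.
first = df["search_text"].str.extract(r"^([^,]+)", expand=False)

# Trim any whitespace around the captured text before writing it back.
df["search_text"] = first.str.strip()
print(df)
```

Rows that are missing, empty, or start with a comma do not match this pattern and come back as NaN, so it may be worth checking for those before overwriting the column.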
Use Series.str.split and take the first element of each resulting list:
df['search_text'] = df['search_text'].str.split(',').str[0]
print(df)
Name UID search_text
0 B 14 kj
1 S 2 hsa
2 D 10 sa
3 E 99 pid
4 G 8 dd
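For reference, here is a self-contained sketch of the split approach that can be run as-is. It assumes the sample data from the question, and adds two optional tweaks that are not part of the answer above: `n=1` so the string is split only at the first comma, and a trailing `.str.strip()` to drop whitespace around the kept token.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Name": ["B", "S", "D", "E", "G"],
    "UID": [14, 2, 10, 99, 8],
    "search_text": ["kj", "hsa,isd", "sa,ad,ad", "pid, pd,dd,ef", "dd"],
})

# n=1 splits only at the first comma, so the remainder is left as one piece.
# .str[0] picks the part before that comma; .str.strip() trims whitespace.
df["search_text"] = (
    df["search_text"].str.split(",", n=1).str[0].str.strip()
)
print(df)
```

Values without a comma are returned unchanged, and missing values stay NaN.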