Extract int from string in Pandas
Let's say I have a dataframe df like this:
A B
1 V2
3 W42
1 S03
2 T02
3 U71
I want a new column (either appended to the end of df or replacing column B, it doesn't matter) that contains only the int extracted from column B. That is, I want column C to look like
C
2
42
3
2
71
So if there is a 0 in front of the number, as in 03, I want to return 3, not 03.
How can I do this?
Solution 1:
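You can extract the digits with a regular expression and convert the result to an integer; the conversion also drops any leading zero. Use a raw string for the pattern, and pass expand=False so that the result is a Series rather than a one-column DataFrame.
df['B'].str.extract(r'(\d+)', expand=False).astype(int)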
Solution 2:
Assuming there is always exactly one leading letter, slice it off and convert the rest:
df['B'] = df['B'].str[1:].astype(int)
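A minimal sketch of the slicing approach on the sample data from the question. The `has_one_letter` guard is an addition for illustration, not part of the answer above; it assumes pandas 1.1 or later, where `Series.str.fullmatch` is available.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 3, 1, 2, 3],
                   'B': ['V2', 'W42', 'S03', 'T02', 'U71']})

# Guard (added for illustration): every value is one letter followed only by digits.
has_one_letter = df['B'].str.fullmatch(r'[A-Za-z]\d+')
assert has_one_letter.all(), "slicing assumes exactly one leading letter"

# Drop the first character and cast; integer parsing discards leading zeros.
df['C'] = df['B'].str[1:].astype(int)
print(df['C'].tolist())  # [2, 42, 3, 2, 71]
```

If the guard fails, the regular-expression approach is the safer choice, since it does not depend on the prefix length.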
Solution 3:
First set up the data:
import pandas as pd
df = pd.DataFrame({'A': [1, 3, 1, 2, 3], 'B': ['V2', 'W42', 'S03', 'T02', 'U71']})
df.head()
Then extract the digits and cast them to ints:
df['C'] = df['B'].str.extract(r'(\d+)', expand=False).astype(int)
df.head()
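One caveat with the regex approach: if some value in B contains no digits, the extraction yields a missing value and astype(int) raises an error. The sketch below shows one possible way around that, assuming a nullable integer column is acceptable. The 'none' entry is a made-up value added only to trigger the case; it is not in the question's data.

```python
import pandas as pd

# 'none' is a hypothetical value with no digits, added for illustration.
s = pd.Series(['V2', 'W42', 'S03', 'none', 'U71'])

# Rows without a match come back as missing values.
digits = s.str.extract(r'(\d+)', expand=False)

# to_numeric parses the digit strings; the nullable 'Int64' dtype
# keeps the missing entry instead of failing on it.
result = pd.to_numeric(digits).astype('Int64')
print(result)
```

Whether a missing value, a default such as 0, or dropping the row is the right outcome depends on the data, so treat this as a starting point rather than the definitive fix.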