How to sum up the values in a column with Python when the column has multiple data types
I want to add all the numbers in a specific column together and my current code throws an error because the column also contains non-numeric data. For example, the data looks like this:
Values
NaN
NaN
0
NoData
无
1200
NaN
2300
In this case, I just need the sum of 1200 and 2300, but the current code generates an error because the column also contains NaN and strings, in both English and non-English characters. How do I update the code to resolve the issue?
item_sum = 0
for i in range(c+1, b-2):
    if not pd.isna(df.loc[i]['Values']) or not isinstance(df.loc[i]['Values'], str):
        item_sum += df.loc[i]['Values']
The current data type of the Values column is "object".
Thanks.
You can use this without looping:
pd.to_numeric(df['Values'], errors='coerce').sum()
Output:
3500.0
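For anyone who wants to reproduce this, here is a minimal, self-contained sketch. The DataFrame is rebuilt by hand from the sample in the question, so its exact contents are an assumption about how the real data is stored (numbers as ints, missing cells as NaN, the rest as strings).

```python
import numpy as np
import pandas as pd

# Sample column rebuilt from the question (assumed layout: object dtype,
# mixing NaN, integers, and strings).
df = pd.DataFrame(
    {"Values": [np.nan, np.nan, 0, "NoData", "无", 1200, np.nan, 2300]}
)

# errors="coerce" turns anything that cannot be parsed as a number into NaN.
numeric = pd.to_numeric(df["Values"], errors="coerce")

# Series.sum() skips NaN by default, so only 0, 1200 and 2300 are added.
total = numeric.sum()
print(total)  # 3500.0
```

The result is a float because the coerced column contains NaN and is therefore float64.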
If the series contains 'INF', what do you want to happen? To interpret 'INF' as infinity, do the same:
pd.to_numeric(df['Values'], errors='coerce').sum()
Output:
inf
Or, to ignore 'INF', use replace as @BrendanA suggests (this needs import numpy as np):
pd.to_numeric(df['Values'].replace('INF', np.nan), errors='coerce').sum()
Output:
3500.0
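If you would rather not list every spelling of infinity to replace, one possible alternative (not part of the original suggestion) is to convert first and then keep only finite values. This is a sketch on made-up sample data; the values Series below is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample that includes an 'INF' string next to real numbers.
values = pd.Series([np.nan, 0, "NoData", "INF", 1200, 2300], dtype="object")

# Unparseable strings become NaN; 'INF' may be parsed as infinity.
numeric = pd.to_numeric(values, errors="coerce")

# np.isfinite is False for both NaN and +/-inf, so the mask keeps
# only ordinary numbers before summing.
finite_total = numeric[np.isfinite(numeric)].sum()
print(finite_total)  # 3500.0
```

This drops positive and negative infinity alike, which may or may not be what you want.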
Your logic is wrong. You are checking if the value is not NaN or the value is not a str, which will always be true because there is no value that is both NaN and a str. Switch to "and".
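A sketch of the loop with that change applied. It iterates over the whole column rather than range(c+1, b-2), since c and b are not defined in the question, and the sample DataFrame is an assumed reconstruction of the data shown there.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's column.
df = pd.DataFrame(
    {"Values": [np.nan, np.nan, 0, "NoData", "无", 1200, np.nan, 2300]}
)

item_sum = 0
for value in df["Values"]:
    # Add the value only if it is neither missing nor a string.
    if not pd.isna(value) and not isinstance(value, str):
        item_sum += value

print(item_sum)  # 3500
```

Note that this skips every string, so numbers stored as text (for example '1200') would be ignored; pd.to_numeric with errors='coerce' would parse those.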