How to add the total of a specific set of rows of a column, and then add this to another dataset?
Solution 1:
You can achieve this with the dplyr library in R.
First, group the data by the location variable, then summarise the new_cases column.
The code looks like this:
library(dplyr)

# Store the summary under a new name so the original df stays available
totals <- df %>%
  group_by(location) %>%
  summarise(totalCases = sum(new_cases))
totals
The output will look like this:
# A tibble: 238 x 2
   location            totalCases
   <chr>                    <dbl>
 1 Afghanistan             158602
 2 Africa                10230722
 3 Albania                     NA
 4 Algeria                 224383
 5 Andorra                     NA
 6 Angola                   92581
 7 Anguilla                    NA
 8 Antigua and Barbuda         NA
 9 Argentina                   NA
10 Armenia                     NA
# ... with 228 more rows
Note: This gives you totalCases for every location.
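The NA entries in the output above arise because sum() returns NA as soon as any new_cases value in a group is missing. A minimal sketch of one way around this, assuming missing values should simply be skipped, is to pass na.rm = TRUE. The toy data frame below is invented for illustration and is not the real dataset.

```r
library(dplyr)

# Hypothetical toy data standing in for the real df (values are invented)
toy <- data.frame(
  location  = c("A", "A", "B", "B"),
  new_cases = c(10, NA, 5, 7)
)

# Without na.rm, any NA in a group makes that group's total NA
with_na <- toy %>%
  group_by(location) %>%
  summarise(totalCases = sum(new_cases))

# With na.rm = TRUE, missing values are dropped before summing
without_na <- toy %>%
  group_by(location) %>%
  summarise(totalCases = sum(new_cases, na.rm = TRUE))
```

Be aware that with na.rm = TRUE a location whose values are all missing gets a total of 0 rather than NA.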
To get it for a specific location, you can use filter.
df2 <- df %>%
  filter(location == "Afghanistan") %>%
  group_by(location) %>%
  summarise(totalCases = sum(new_cases))
df2
Output:
# A tibble: 1 x 2
  location    totalCases
  <chr>            <dbl>
1 Afghanistan     158602
Since the result is stored in a new data frame called df2, you can merge it with another data frame of your choice.
You can find the documentation here.
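As a rough sketch of the merge step, assuming the other dataset is a data frame with a matching location column, dplyr's left_join can attach totalCases to it. The data frame other_df and its some_value column are hypothetical placeholders, and df2 is rebuilt by hand here (using the value from the output above) so the snippet runs on its own.

```r
library(dplyr)

# df2 as produced above, recreated by hand so this snippet is self-contained
df2 <- data.frame(location = "Afghanistan", totalCases = 158602)

# Hypothetical second dataset; column names and values are placeholders
other_df <- data.frame(
  location   = c("Afghanistan", "Algeria"),
  some_value = c(1, 2)
)

# Keep every row of other_df and add totalCases where the location matches;
# rows with no match in df2 get NA in totalCases
merged <- other_df %>%
  left_join(df2, by = "location")
```

Base R's merge(other_df, df2, by = "location", all.x = TRUE) should give a comparable result if you prefer not to use dplyr for this step.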