Download a .csv file from GitHub using an httr GET request
I am trying to set up an automatic pull in R, using the GET function from the httr package, for a CSV file hosted on GitHub.
Here is the table I am trying to download.
https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv
I can make the connection to the file using the following GET request:
library(httr)
x <- httr::GET("https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv")
However, I am unsure how to then convert the response into a data frame that matches the table on GitHub.
Any assistance would be much appreciated.
Solution 1:
I am new to R, but here is my solution.
You need to request the raw version of the CSV file (served from raw.githubusercontent.com), not the github.com blob URL, which returns the HTML page that displays the file.
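The URL rewrite can also be done in code. Here is a minimal sketch, assuming the link follows the usual https://github.com/<owner>/<repo>/blob/<branch>/<path> layout. The helper name github_raw_url is my own, not part of httr or any package:

```r
# Hypothetical helper: turn a github.com "blob" URL into its
# raw.githubusercontent.com equivalent. URLs that do not match
# the expected layout are returned unchanged.
github_raw_url <- function(url) {
  sub("^https://github\\.com/([^/]+)/([^/]+)/blob/",
      "https://raw.githubusercontent.com/\\1/\\2/",
      url)
}

blob_url <- "https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv"
raw_url <- github_raw_url(blob_url)
print(raw_url)
```

The resulting raw_url is the same address that is passed to GET in this answer.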
library(httr)
x <- httr::GET("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv")
# Save to file
bin <- content(x, "raw")
writeBin(bin, "data.csv")
# Read as csv
dat = read.csv("data.csv", header = TRUE)
# read.csv prefixes the date columns with "X" (e.g. X1.22.20); strip that prefix
colnames(dat) = sub("^X", "", colnames(dat))
# Group by country name (to sum regions)
# Skip the first four columns, which contain metadata
countries = aggregate(dat[, 5:ncol(dat)], by=list(Country.Region=dat$Country.Region), FUN=sum)
# Here is the table of the most recent total confirmed cases
countries_total = countries[, c(1, ncol(countries))]
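If you do not need a copy of the file on disk, the response body can probably be parsed in memory instead. This is a sketch, not a tested drop-in replacement: parse_csv_text and fetch_csv are names I made up, and it assumes the file is UTF-8 encoded.

```r
# Hypothetical helper: parse CSV text that is already in memory.
parse_csv_text <- function(txt) {
  read.csv(text = txt, header = TRUE, stringsAsFactors = FALSE)
}

# Hypothetical helper: fetch a CSV over HTTP and return a data frame.
# Requires the httr package; nothing is downloaded until it is called.
fetch_csv <- function(url) {
  resp <- httr::GET(url)
  httr::stop_for_status(resp)  # raise an R error on HTTP errors such as 404
  txt <- httr::content(resp, as = "text", encoding = "UTF-8")
  parse_csv_text(txt)
}
```

Calling fetch_csv() with the raw URL should then give the same data frame as the read.csv("data.csv") step, without the intermediate writeBin().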
The output graph
How I got this to work:
- How to sum a variable by group
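To see what the aggregate() step does, here is the same call on a small made-up table. The values and the day1/day2 column names are invented for illustration and are not the real case counts:

```r
# Made-up toy data: country A is split into two regions, country B has one.
toy <- data.frame(
  Country.Region = c("A", "A", "B"),
  day1 = c(1, 2, 5),
  day2 = c(3, 4, 6)
)

# Sum each day column within each country, as in the aggregate() call above.
totals <- aggregate(toy[, c("day1", "day2")],
                    by = list(Country.Region = toy$Country.Region),
                    FUN = sum)
print(totals)
```

The two rows for A collapse into one row with day1 = 3 and day2 = 7, while B keeps 5 and 6.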