How to geocode a simple address using Data Science Toolkit
Like this:
library(httr)
library(rjson)
# build the JSON array with toJSON so quotes inside an address are escaped
data <- toJSON(as.list(as.character(dff$address)))
url <- "http://www.datasciencetoolkit.org/street2coordinates"
response <- POST(url,body=data)
json <- fromJSON(content(response,type="text"))
# lapply (not sapply) always returns a list, which is what do.call(rbind, ...) needs
geocode <- do.call(rbind,lapply(json,
function(x) c(long=x$longitude,lat=x$latitude)))
geocode
# long lat
# San Francisco, California, United States -117.88536 35.18713
# Mobile, Alabama, United States -88.10318 30.70114
# La Jolla, California, United States -117.87645 33.85751
# Duarte, California, United States -118.29866 33.78659
# Little Rock, Arkansas, United States -91.20736 33.60892
# Tucson, Arizona, United States -110.97087 32.21798
# Redwood City, California, United States -117.88536 35.18713
# New Haven, Connecticut, United States -72.92751 41.36571
# Berkeley, California, United States -122.29673 37.86058
# Hartford, Connecticut, United States -72.76356 41.78516
# Sacramento, California, United States -121.55541 38.38046
# Encinitas, California, United States -116.84605 33.01693
# Birmingham, Alabama, United States -86.80190 33.45641
# Stanford, California, United States -122.16750 37.42509
# Orange, California, United States -117.85311 33.78780
# Los Angeles, California, United States -117.88536 35.18713
This takes advantage of the POST interface to the street2coordinates API (documented here), which returns all the results in a single request rather than requiring one GET request per address.
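One side effect of batching is that any address the service cannot resolve comes back as null and silently drops out of the rbind() result, so the output can have fewer rows than the input. A minimal sketch of one way to keep a row per requested address, assuming the parsed response is a named list keyed by address (as the row names in the output above suggest). The helper name `coords_or_na` and the hand-built `fake` response are hypothetical and only for illustration:

```r
# Keep one row per requested address, with NA where the service returned null.
# `json` is assumed to be the named list produced by fromJSON() on a
# street2coordinates response; `coords_or_na` is a hypothetical helper.
coords_or_na <- function(json, addresses) {
  rows <- lapply(addresses, function(a) {
    x <- json[[a]]  # NULL if the address is missing or came back as null
    if (is.null(x) || is.null(x$longitude) || is.null(x$latitude)) {
      c(long = NA_real_, lat = NA_real_)
    } else {
      c(long = x$longitude, lat = x$latitude)
    }
  })
  data.frame(address = addresses, do.call(rbind, rows),
             stringsAsFactors = FALSE)
}

# Hand-built stand-in for a parsed response (no network access needed).
fake <- list(
  "Mobile, Alabama, United States" =
    list(longitude = -88.10318, latitude = 30.70114)
)
out <- coords_or_na(fake, c("Mobile, Alabama, United States",
                            "Phoenix, Arizona, United States"))
```

With this shape, unresolved addresses show up as NA rows and can be retried with a different geocoder instead of going unnoticed.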
The absence of Phoenix seems to be a bug in the street2coordinates API. If you go to the API demo page and try "Phoenix, Arizona, United States", you get a null response. However, as your example shows, their "Google-style Geocoder" does give a result for Phoenix. So here's a solution using repeated GET requests. Note that this runs much slower.
geo.dsk <- function(addr){ # geocode a single address with the Data Science Toolkit
require(httr)
require(rjson)
url <- "http://www.datasciencetoolkit.org/maps/api/geocode/json"
response <- GET(url,query=list(sensor="FALSE",address=addr))
json <- fromJSON(content(response,type="text"))
loc <- json$results[[1]]$geometry$location
return(c(address=addr,long=loc$lng, lat= loc$lat))
}
result <- do.call(rbind,lapply(as.character(dff$address),geo.dsk))
result <- data.frame(result)
result
# address long lat
# 1 Birmingham, Alabama, United States -86.801904 33.456412
# 2 Mobile, Alabama, United States -88.103184 30.701142
# 3 Phoenix, Arizona, United States -112.0733333 33.4483333
# 4 Tucson, Arizona, United States -110.970869 32.217975
# 5 Little Rock, Arkansas, United States -91.207356 33.608922
# 6 Berkeley, California, United States -122.29673 37.860576
# 7 Duarte, California, United States -118.298662 33.786594
# 8 Encinitas, California, United States -116.846046 33.016928
# 9 La Jolla, California, United States -117.876447 33.857515
# 10 Los Angeles, California, United States -117.885359 35.187133
# 11 Orange, California, United States -117.853112 33.787795
# 12 Redwood City, California, United States -117.885359 35.187133
# 13 Sacramento, California, United States -121.555406 38.380456
# 14 San Francisco, California, United States -117.885359 35.187133
# 15 Stanford, California, United States -122.1675 37.42509
# 16 Hartford, Connecticut, United States -72.763564 41.78516
# 17 New Haven, Connecticut, United States -72.927507 41.365709
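Note that geo.dsk() combines the address and the coordinates with c(), so the coordinates are coerced to character strings, and it assumes every response contains at least one result. A minimal sketch of a variant that keeps the coordinates numeric and returns NA when nothing is found. The helper name `extract_location` is hypothetical, and an empty `results` list for a failed lookup is an assumption about the response format:

```r
# Pull longitude/latitude out of a parsed Google-style geocoder response.
# `extract_location` is a hypothetical helper; `json` is assumed to be the
# list returned by fromJSON() for a single request.
extract_location <- function(json, addr) {
  results <- json$results
  if (length(results) == 0) {
    # No match: keep the address but mark the coordinates as missing.
    return(data.frame(address = addr, long = NA_real_, lat = NA_real_,
                      stringsAsFactors = FALSE))
  }
  loc <- results[[1]]$geometry$location
  data.frame(address = addr, long = loc$lng, lat = loc$lat,
             stringsAsFactors = FALSE)
}

# Hand-built responses mimicking the structure geo.dsk() reads.
fake <- list(results = list(list(geometry = list(
  location = list(lng = -112.0733333, lat = 33.4483333)))))
hit  <- extract_location(fake, "Phoenix, Arizona, United States")
miss <- extract_location(list(results = list()), "Nowhere")
both <- rbind(hit, miss)
```

Because each call returns a one-row data frame, the rows can be stacked with do.call(rbind, ...) just as before, and the long and lat columns stay numeric.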
The ggmap package includes support for geocoding using either Google or Data Science Toolkit, the latter with their "Google-style geocoder". This is quite slow for multiple addresses, as noted in the earlier answer.
library(ggmap)
result <- geocode(as.character(dff[[1]]), source = "dsk")
print(cbind(dff, result))
# address lon lat
# 1 Birmingham, Alabama, United States -86.80190 33.45641
# 2 Mobile, Alabama, United States -88.10318 30.70114
# 3 Phoenix, Arizona, United States -112.07404 33.44838
# 4 Tucson, Arizona, United States -110.97087 32.21798
# 5 Little Rock, Arkansas, United States -91.20736 33.60892
# 6 Berkeley, California, United States -122.29673 37.86058
# 7 Duarte, California, United States -118.29866 33.78659
# 8 Encinitas, California, United States -116.84605 33.01693
# 9 La Jolla, California, United States -117.87645 33.85751
# 10 Los Angeles, California, United States -117.88536 35.18713
# 11 Orange, California, United States -117.85311 33.78780
# 12 Redwood City, California, United States -117.88536 35.18713
# 13 Sacramento, California, United States -121.55541 38.38046
# 14 San Francisco, California, United States -117.88536 35.18713
# 15 Stanford, California, United States -122.16750 37.42509
# 16 Hartford, Connecticut, United States -72.76356 41.78516
# 17 New Haven, Connecticut, United States -72.92751 41.36571