Improve centering county names ggplot & maps

I worked this out last night over at Talk Stats (link). It turns out to be pretty easy (says the person who spent hours on it into the early morning!) if you use the R spatial package (sp). I tried some of its functions for building a SpatialPolygons object, on which you can call coordinates() to get a polygon centroid. I only checked one county, but the label point of a Polygon (S4) object matched that centroid. Assuming this holds in general, the label points of Polygon objects are centroids. The short process below builds a data frame of those label points and plots them on a map.

library(ggplot2)  # For map_data. It's just a wrapper; should just use maps.
library(sp)
library(maps)
# Returns the label point (long, lat) of a single county's polygon
getLabelPoint <- function(county) {
  Polygon(county[c('long', 'lat')])@labpt
}

df <- map_data('county', 'new york')                 # NY region county data
centroids <- by(df, df$subregion, getLabelPoint)     # Returns list
centroids <- do.call("rbind.data.frame", centroids)  # Convert to Data Frame
names(centroids) <- c('long', 'lat')                 # Appropriate Header

map('county', 'new york')
text(centroids$long, centroids$lat, rownames(centroids), offset=0, cex=0.4)
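
Since the approach rests on the assumption that a label point is the polygon centroid, it may help to see what a centroid computed by hand looks like. The following is only a sketch in base R: `polygonCentroid` is a hypothetical helper (not part of sp or maps) that applies the standard shoelace formula to a single, non-self-intersecting ring, and the L-shaped polygon is made-up test data.

```r
# Area-weighted centroid of a simple polygon via the shoelace formula.
# `x` and `y` are the vertex coordinates in order; the ring is closed
# automatically if the last vertex does not repeat the first.
polygonCentroid <- function(x, y) {
  n <- length(x)
  if (x[1] != x[n] || y[1] != y[n]) {
    x <- c(x, x[1]); y <- c(y, y[1]); n <- n + 1
  }
  i <- seq_len(n - 1)
  cross <- x[i] * y[i + 1] - x[i + 1] * y[i]
  area <- sum(cross) / 2                      # signed area of the ring
  cx <- sum((x[i] + x[i + 1]) * cross) / (6 * area)
  cy <- sum((y[i] + y[i + 1]) * cross) / (6 * area)
  c(long = cx, lat = cy)
}

# Hypothetical test shape: an L-shaped polygon
lx <- c(0, 2, 2, 1, 1, 0)
ly <- c(0, 0, 1, 1, 2, 2)
lCentroid <- polygonCentroid(lx, ly)
```

For a county made of a single ring, comparing `polygonCentroid(county$long, county$lat)` with the `@labpt` value would be one way to test the assumption on more than one county.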

This will not work well for every polygon. Labeling and annotation in GIS very often require manual adjustment for the peculiar cases that do not fit whatever automatic (systematic) approach you want to use. Repeatedly coding, looking at the result and recoding is not a good fit for this. It is better to include a check that a label of a given size, for the given plot, will fit within the polygon. If it does not, remove it from the set of text labels and insert it manually later to suit the situation: for example, add a leader line and annotate to the side of the polygon, or turn the label sideways as was shown elsewhere.
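
A minimal sketch of such a check, in base R, might look like the following. Everything here is an assumption made for illustration: `labelFits` is a hypothetical helper, `charWidth` (map units per character) is a made-up tuning value that depends on font size and plot scale, and the test only compares the label width with the polygon's bounding box, which is weaker than true containment.

```r
# Rough fit test: is the polygon's bounding box at least as wide as the
# estimated label? `charWidth` is a hypothetical map-units-per-character value.
labelFits <- function(poly, label, charWidth = 0.03) {
  labelWidth <- nchar(as.character(label)) * charWidth
  diff(range(poly$long)) >= labelWidth
}

# Toy stand-in for map_data() output: one wide and one very narrow polygon
toy <- data.frame(
  subregion = rep(c("wide", "narrowcounty"), each = 4),
  long = c(0, 1, 1, 0,   2, 2.1, 2.1, 2),
  lat  = c(0, 0, 1, 1,   0, 0,   1,   1),
  stringsAsFactors = FALSE
)

fits <- sapply(split(toy, toy$subregion),
               function(p) labelFits(p, p$subregion[1]))
autoLabels   <- names(fits)[fits]    # safe to place automatically
manualLabels <- names(fits)[!fits]   # set aside for manual placement
```

The names in `manualLabels` are the ones to drop from the automatic text layer and annotate by hand.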


This was a very helpful discussion. For the benefit of those who grew up with dplyr, here is a minor tweak, using pipes in place of aggregate:

library(maps); library(dplyr); library(ggplot2)
ny <- map_data('county', 'new york') 

cnames1 <- aggregate(cbind(long, lat) ~ subregion, data=ny, 
                     FUN=function(x)mean(range(x)))
cnames2 <- ny %>% group_by(subregion) %>%
    summarize_at(vars(long, lat), ~ mean(range(.)))

all.equal(cnames1, as.data.frame(cnames2))
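
Both versions use mean(range(x)), the midpoint of the bounding box, rather than mean(x). A small base R illustration of why that choice can matter, using made-up coordinates: a plain mean of the vertices is pulled toward any stretch of boundary that happens to have many points.

```r
# Hypothetical boundary longitudes: the western edge (long = 0) is digitised
# with 50 vertices, the eastern edge (long = 10) with only 2.
long <- c(rep(0, 50), 10, 10)

vertexMean <- mean(long)          # dragged toward the densely sampled edge
boxMidpoint <- mean(range(long))  # midpoint of the bounding box
```

Here `vertexMean` lands close to the western edge while `boxMidpoint` sits halfway across the shape.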

I think the easiest answer to this question is that Andrie has already done most of the hard work. The rest has to be finished with some good ol' adjust-and-see. When you look at the plot produced by Andrie's suggestion, most of the labels are decent; the exceptions are a few pesky placements that could be improved with a lat/long change or a rotation. My example uses suffolk (bottom right) and herkimer (center): suffolk's placement can be improved with a lat/long adjustment and herkimer's with a rotation.

Before: [image]

library(ggplot2); library(maps)
ny <- map_data('county', 'new york')

cnames <- aggregate(cbind(long, lat) ~ subregion, data=ny, 
                    FUN=function(x) mean(range(x)))  # Andrie's code

cnames[52, 2:3] <- c(-73, 40.855)     # adjust the long and lat of a poorly centered name (suffolk)
cnames$angle <- rep(0, nrow(cnames))  # create an angle column
cnames[22, 4] <- -90                  # rotate the name of an atypically shaped county (herkimer)

ggplot(ny, aes(long, lat)) +  
    geom_polygon(aes(group=group), colour='black', fill=NA) +
    geom_text(data=cnames, aes(long, lat, label=subregion, angle=angle), 
              size=3) + 
    coord_map()

This gives us: [image]
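
If more than a couple of labels need this kind of adjust-and-see treatment, one option is to keep the manual fixes in a small table keyed by county name instead of by row number. This is only a sketch: the `fixes` table and its NA-means-unchanged convention are assumptions for illustration, and the starting coordinates in this stand-in `cnames` are made up so the block runs without the map data.

```r
# Stand-in for the aggregate() output; these coordinates are made up.
cnames <- data.frame(
  subregion = c("albany", "herkimer", "suffolk"),
  long = c(-73.9, -74.9, -72.6),
  lat  = c(42.6, 43.4, 40.9),
  stringsAsFactors = FALSE
)
cnames$angle <- 0

# Hypothetical table of manual fixes; NA means "leave the computed value"
fixes <- data.frame(
  subregion = c("suffolk", "herkimer"),
  long  = c(-73, NA),
  lat   = c(40.855, NA),
  angle = c(NA, -90),
  stringsAsFactors = FALSE
)

idx <- match(fixes$subregion, cnames$subregion)
stopifnot(!anyNA(idx))  # every fix must name an existing county

# Overwrite only the values that were given explicitly
for (col in c("long", "lat", "angle")) {
  keep <- !is.na(fixes[[col]])
  cnames[idx[keep], col] <- fixes[[col]][keep]
}
```

Keying on the name keeps the adjustments valid if the row order of the label table ever changes.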

Unless someone has a better way I will mark this answer as correct.