How to resolve GLMM residuals pattern , spatial correlation and zero distance error?
tl;dr the spatial pattern can be well explained by a spatial trend (x-y plane), you don't need an autocorrelation model. (If you did you might also try fitting this model in mgcv
.)
Packages:
library(glmmTMB)
library(DHARMa)
library(ggplot2); theme_set(theme_bw())
library(tidyverse)
library(colorspace)
I plotted the data (not reproduced here, it didn't show anything amazing)
ggplot(f, aes(intensity, y, shape=YEAR, fill = SITES)) +
geom_boxplot() + facet_wrap(~YEAR)
There are 15 sites, thus only 15 unique lat/long positions. Let's plot the average values of the residuals by site, in space (there are other ways to do this, but augment
is convenient)
a <- broom.mixed::augment(model, data = f)
aa <- (a
%>% group_by(long, lat)
%>% summarize(across(.resid, mean), .groups = "drop")
)
ggplot(aa, aes(long, lat, colour = .resid),
size = abs(.resid))) +
geom_point() +
scale_color_continuous_diverging() +
scale_size(range = c(3, 8))
Plotting with sign(.resid)
instead makes the pattern even clearer:
almost all the negative residuals are in the SW, almost all the positive residuals are in the NE.
Solution: add lat
and long
to the model as predictors.
model2 <- update(model, . ~. + lat + long)
res2 <- simulateResiduals(model2)
res3 <- recalculateResiduals(res2, group = f$SITES)
locs <- f %>% group_by(SITES) %>% summarise(across(c(lat, long), mean))
testSpatialAutocorrelation(res3, x = locs$long, y = locs$lat)
This takes care of the significant autocorrelation, and makes the overall diagnostics look better.