Is there a difference between the R functions fitted() and predict()?

Yes, there is. If there is a link function relating the linear predictor to the expected value of the response (such as log for Poisson regression or logit for logistic regression), predict returns the fitted values before the inverse of the link function is applied (to return the data to the same scale as the response variable), and fitted shows it after it is applied.

For example:

x = rnorm(10)
y = rpois(10, exp(x))
m = glm(y ~ x, family="poisson")

print(fitted(m))
#         1         2         3         4         5         6         7         8 
# 0.3668989 0.6083009 0.4677463 0.8685777 0.8047078 0.6116263 0.5688551 0.4909217 
#         9        10 
# 0.5583372 0.6540281 
print(predict(m))
#          1          2          3          4          5          6          7 
# -1.0026690 -0.4970857 -0.7598292 -0.1408982 -0.2172761 -0.4916338 -0.5641295 
#          8          9         10 
# -0.7114706 -0.5827923 -0.4246050 
print(all.equal(log(fitted(m)), predict(m)))
# [1] TRUE

This does mean that for models created by linear regression (lm), there is no difference between fitted and predict.

In practical terms, this means that if you want to compare the fit to the original data, you should use fitted.

The fitted function returns the y-hat values associated with the data used to fit the model. The predict function returns predictions for a new set of predictor variables. If you don't specify a new set of predictor variables then it will use the original data by default giving the same results as fitted for some models, but if you want to predict for a new set of values then you need predict. The predict function often also has options for which type of prediction to return, the linear predictor, the prediction transformed to the response scale, the most likely category, the contribution of each term in the model, etc.

Is there a difference between the R functions fitted() and predict()?

Related

Recent Posts