Object not found error when passing model formula to another function

Solution 1:

When you created the formula for lm.cars, it was assigned its own environment. This environment stays with the formula unless you explicitly change it, so when you extract the formula with the formula function, the model's original environment comes along with it.
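To make the point concrete, here is a small sketch showing that a formula keeps the environment it was created in. The names make_formula, local_var, f_global and f_local are made up for illustration; dist ~ speed is only assumed to match the cars model from the question.

```r
# A formula remembers the environment in which it was created.
make_formula <- function() {
  local_var <- 1   # exists only inside this function call
  dist ~ speed     # the formula created here captures this function's frame
}

f_global <- dist ~ speed     # created wherever this script is run
f_local  <- make_formula()   # created inside make_formula()

# The two formulas look the same but carry different environments
identical(environment(f_global), environment(f_local))               # FALSE

# local_var is reachable through the formula's own environment
exists("local_var", envir = environment(f_local), inherits = FALSE)  # TRUE
```

Modelling functions look up variables that are not in data through this attached environment, which is why the place where the formula was created matters.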

I don't know if I'm using the correct terminology here, but I think you need to explicitly change the environment for the formula inside your function:

cv.step <- function(linmod,k=10,direction="both"){
  response <- linmod$y
  dmatrix <- linmod$x
  n <- length(response)
  datas <- linmod$model
  .env <- environment() ## identify the environment of cv.step

  ## extract the formula in the environment of cv.step
  form <- as.formula(linmod$call, env = .env) 

  ## The rest of your function follows
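
As a self-contained illustration of the same idea (not the original cv.step), the sketch below refits a model on a subset of rows using weights that exist only inside the function. The helper refit_subset, its rows argument and the local w are hypothetical, and lm.cars is assumed to have been fit on the built-in cars data.

```r
# Hypothetical helper: refit a model's formula on some rows of its model
# frame, with weights that live only inside this function.
refit_subset <- function(linmod, rows) {
  datas <- linmod$model[rows, , drop = FALSE]
  w <- rep(1, nrow(datas))            # local object, not a column of datas
  form <- formula(linmod)             # still carries the original environment
  environment(form) <- environment()  # re-home it so lm() can find w
  lm(form, data = datas, weights = w)
}

lm.cars <- lm(dist ~ speed, data = cars)   # assumed setup
fit <- refit_subset(lm.cars, 1:25)
coef(fit)
```

Without the environment(form) line, lm would look for w in the environment where the original formula was created and would normally fail to find it.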

Solution 2:

Another cause of this error is passing a character string to lm instead of a formula. Character vectors have no environment, so when lm converts the string to a formula, the result apparently does not pick up the environment of the calling function the way a formula written there would. If one then passes as weights an object that is not a column of the data argument's data.frame but exists only inside the calling function, one gets a not found error. This behavior is not easy to understand, and it looks like a bug.

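Here's a minimal reproducible example. Both functions take a data.frame, two variable names and a vector of weights to use (the formula itself is hard-coded to keep the example short). The only difference is that the first passes the formula to lm as a string, while the second converts it with as.formula first.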

residualizer = function(data, x, y, wtds) {
  #the formula to use
  f = "x ~ y" 

  #residualize
  resid(lm(formula = f, data = data, weights = wtds))
}

residualizer2 = function(data, x, y, wtds) {
  #the formula to use
  f = as.formula("x ~ y")

  #residualize
  resid(lm(formula = f, data = data, weights = wtds))
}

d_example = data.frame(x = rnorm(10), y = rnorm(10))
weightsvar = runif(10)

And test:

> residualizer(data = d_example, x = "x", y = "y", wtds = weightsvar)
Error in eval(expr, envir, enclos) : object 'wtds' not found

> residualizer2(data = d_example, x = "x", y = "y", wtds = weightsvar)
         1          2          3          4          5          6          7          8          9         10 
 0.8986584 -1.1218003  0.6215950 -0.1106144  0.1042559  0.9997725 -1.1634717  0.4540855 -0.4207622 -0.8774290 

It is a very subtle bug. If one goes into the function environment with browser, one can see the weights vector just fine, but it somehow is not found in the lm call!
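
When the variable names really do arrive as strings, one possible workaround is to build a proper formula from them and attach the function's own environment explicitly. This is only a sketch: residualizer3 is a hypothetical third variant, and it assumes x names the response and y the predictor, as in "x ~ y" above.

```r
# Hypothetical variant: build the formula from the variable names with
# reformulate(), then make the function's own frame its environment.
residualizer3 <- function(data, x, y, wtds) {
  # x is the response name, y the predictor name, matching "x ~ y"
  f <- reformulate(y, response = x)
  environment(f) <- environment()   # so lm() can see wtds
  resid(lm(formula = f, data = data, weights = wtds))
}

d_example <- data.frame(x = rnorm(10), y = rnorm(10))
weightsvar <- runif(10)
res <- residualizer3(data = d_example, x = "x", y = "y", wtds = weightsvar)
length(res)   # 10, one residual per row
```

Setting the environment by hand is redundant if the formula is already created inside the function, but it makes the intent explicit.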

The bug becomes even harder to debug if one uses the name weights for the weights variable. In this case, since lm can't find the local weights object, it picks up the function weights() from the stats package instead, which throws an even stranger error:

Error in model.frame.default(formula = f, data = data, weights = weights,  : 
  invalid type (closure) for variable '(weights)'
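
A quick way to see where that closure comes from is to check what the name weights resolves to when no variable of that name is visible. The sketch below assumes a fresh session in which nothing has masked weights.

```r
# With no variable called weights in scope, the name resolves to a function
is.function(weights)                     # TRUE

# That function is defined in the stats namespace
environmentName(environment(weights))    # "stats"
```

Picking a name that no attached package already uses (wtds in the example above) at least turns the failure into a plain not found error.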

Don't ask me how many hours it took me to figure this out.