Caret train function for multiple data frames as a function

Solution 1:

By writing predictor_iris <- "Species", you store a character string in predictor_iris, not a reference to the Species column. Thus, when you run lda_ex, I guess you get an error from the formula in train(): the left-hand side of the formula is a single string, while the covariates on the right-hand side are full-length vectors.

Indeed, I tried the following toy example:

X = rnorm(1000)
Y = runif(1000)

predictor = "Y"

lm(predictor ~ X)

which gives an error about the variable lengths differing: predictor is a character vector of length one, while X has 1000 elements.
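To see what is going on, it can help to inspect the objects and capture the error rather than let it stop the script. The following is only a sketch of that check in base R; the names err and fit are mine, not from your code.

```r
set.seed(1)
X <- rnorm(1000)
Y <- runif(1000)
predictor <- "Y"

# The left-hand side evaluates to a length-one character vector, not to Y.
class(predictor)
length(predictor)
length(X)

# Capture the error message instead of stopping the script.
err <- tryCatch(lm(predictor ~ X), error = function(e) conditionMessage(e))
err

# Using the variable itself on the left-hand side works.
fit <- lm(Y ~ X)
```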

Let me modify your function:

lda_ex <- function(data, formula){
  # Fit a single LDA model (no resampling) on centered and scaled covariates
  model <- train(formula, data = data,
                 method = "lda",
                 trControl = trainControl(method = "none"),
                 preProcess = c("center", "scale"))
  return(model)
}

The key difference is that we now pass in the whole formula, instead of only the name of the outcome variable as a string. That way, we avoid the string-related problem.
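If you would rather keep passing the outcome name as a string, one option is to build the formula from that string first. Here is a minimal sketch using reformulate() from base R; the helper make_formula is a hypothetical name of mine, and lm() is used only to show that the resulting formula is usable.

```r
# Hypothetical helper: turn an outcome name into the formula "<outcome> ~ ."
make_formula <- function(response, terms = ".") {
  reformulate(terms, response = response)
}

formula_iris <- make_formula("Species")
formula_iris
class(formula_iris)

# The "." is resolved against the data when a model is fit.
fit <- lm(make_formula("Sepal.Length"), data = iris)
```

The object returned by make_formula("Species") can then be passed as the formula argument of a function like lda_ex.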

library(caret) # Remember to load the packages needed to reproduce your examples!

data_iris <- iris
formula_iris <- Species ~ . # Key difference!
iris_res <- lda_ex(data = data_iris, formula = formula_iris)
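Since the question is about several data frames, here is a rough sketch of how the same function could be applied to a list of them with Map(). It assumes caret is installed. The second data frame (flowers) and the vectors datasets and responses are made up for illustration, and lda_ex is repeated only so the snippet runs on its own.

```r
library(caret)

# Same helper as above, repeated so that this snippet is self-contained.
lda_ex <- function(data, formula) {
  train(formula, data = data,
        method = "lda",
        trControl = trainControl(method = "none"),
        preProcess = c("center", "scale"))
}

# Hypothetical second data frame: iris with the outcome column renamed.
flowers <- iris
names(flowers)[names(flowers) == "Species"] <- "Kind"

datasets  <- list(iris = iris, flowers = flowers)
responses <- c(iris = "Species", flowers = "Kind")

# Build one formula per data frame from its outcome name, then fit.
models <- Map(function(d, y) lda_ex(d, reformulate(".", response = y)),
              datasets, responses)
names(models)
```

Each element of models is a fitted train object, so for instance predict(models$iris, newdata = iris) should work as usual.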