Pass a vector of variables into lm() formula

I was trying to automate a piece of my code so that programming becomes less tedious.

Basically, I was trying to do a stepwise selection of variables using fastbw() in the rms package. I would like to pass the variables selected by fastbw() into a formula such as y ~ x1 + x2 + x3, where "x1", "x2" and "x3" are the variables selected by fastbw().

Here is the code I tried, which did not work:

olsOAW0.r060 <- ols(roll_pct~byoy+trans_YoY+change18m, 
                    subset= helper=="POPNOAW0_r060", 
                    na.action = na.exclude, 
                    data = modelready)

OAW0 <- fastbw(olsOAW0.r060, rule="p", type="residual", sls= 0.05)

vec <- as.vector(OAW0$names.kept, mode="any")

b <- paste(vec, sep ="+") ##I even tried b <- paste(OAW0$names.kept, sep="+")

bestp.OAW0.r060 <- lm(roll_pct ~ b , 
                      data = modelready, 
                      subset = helper =="POPNOAW0_r060",    
                      na.action = na.exclude)

I am new to R and still haven't scaled the steep learning curve, so I apologize for any obvious programming blunders.


You're almost there. You just have to paste the entire formula together, something like this:

paste("roll_pct ~ ",b,sep = "")

Then coerce it to an actual formula using as.formula() and pass that to lm(). Technically, I think lm() may coerce a character string itself, but coercing it yourself is generally safer. (Some functions that expect formulas will do the coercion for you; others won't.)
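
As a minimal sketch of the whole sequence, here is the same idea on the built-in mtcars data, which stands in for modelready. The names in `kept` are assumptions for illustration and play the role of whatever fastbw() returns in names.kept:

```r
# Stand-in for OAW0$names.kept; these column names are illustrative only.
kept <- c("wt", "hp", "qsec")

# Join the predictor names into a single string, then prepend the response.
rhs <- paste(kept, collapse = " + ")
frm <- as.formula(paste("mpg ~", rhs))

# Pass the formula object to lm() as usual.
fit <- lm(frm, data = mtcars, na.action = na.exclude)
print(frm)
```

The other arguments to lm() (subset, na.action) can stay exactly as in the question; only the formula argument changes.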


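You would actually need to use collapse instead of sep when defining b. sep only goes between the separate arguments passed to paste(), whereas collapse joins the elements of a single vector into one string.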

b <- paste(OAW0$names.kept, collapse="+")
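
To see the difference, here is a small sketch with a hypothetical vector of variable names:

```r
vec <- c("x1", "x2", "x3")   # hypothetical variable names

# sep goes between the separate arguments of paste(); with a single
# vector argument there is nothing to separate, so the vector is unchanged.
with_sep <- paste(vec, sep = "+")

# collapse joins the elements of the result into a single string.
with_collapse <- paste(vec, collapse = "+")

length(with_sep)   # 3
with_collapse      # "x1+x2+x3"
```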

Then you can use it in joran's answer:

paste("roll_pct ~ ",b,sep = "")

or just use:

paste("roll_pct ~ ",paste(OAW0$names.kept, collapse="+"),sep = "")
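
As an alternative to pasting by hand, the stats package provides reformulate(), which builds a formula from a character vector of term labels and a response. This is a different route from the paste() approach above, sketched here with mtcars columns as stand-ins for the variables kept by fastbw():

```r
# Stand-in for OAW0$names.kept; these column names are illustrative only.
kept <- c("wt", "hp", "qsec")

# reformulate() joins the term labels with "+" and attaches the response.
frm <- reformulate(kept, response = "mpg")

fit <- lm(frm, data = mtcars)
print(frm)
```

The result is already a formula object, so no as.formula() call is needed.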

I ran into a similar issue today. If you want to make it even more generic, so that you don't have to hard-code any column names, you can use:

frmla <- as.formula(paste(colnames(modelready)[1],
                          paste(colnames(modelready)[2:ncol(modelready)],
                                collapse = " + "),
                          sep = " ~ "))

This assumes that the class variable (the dependent variable) is in the first column, but the indexing can easily be switched to the last column:

frmla <- as.formula(paste(colnames(modelready)[ncol(modelready)],
                          paste(colnames(modelready)[1:(ncol(modelready)-1)],
                                collapse = " + "),
                          sep = " ~ "))

Then continue with lm using:

bestp.OAW0.r060 <- lm(frmla , data = modelready, ... )

If you're looking for something less verbose:

fm <- as.formula( paste( colnames(df)[i], ".", sep=" ~ ")) 
                                      # i is the index of the outcome column
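
A quick sketch of what the dot does, using mtcars as a stand-in data frame with the outcome assumed to be in column 1:

```r
i <- 1   # mpg is the first column of mtcars
fm <- as.formula(paste(colnames(mtcars)[i], ".", sep = " ~ "))

# The dot is expanded against `data`, so every other column becomes a predictor.
fit <- lm(fm, data = mtcars)
length(coef(fit))   # 11: the intercept plus the 10 remaining columns
```

Note that the dot pulls in every remaining column, so a data frame that also holds bookkeeping columns (such as a grouping column used only for subsetting) would need those dropped first.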

Here it is in a function:

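getFormula <- function(target, df) {
  # Exact match on the column name; grep() would also match partial
  # names and could return several indices.
  i <- match(target, colnames(df))
  stopifnot(!is.na(i))
  as.formula(paste(colnames(df)[i], ".", sep = " ~ "))
}

library(rpart)  # rpart() comes from the rpart package
fm <- getFormula("myOutcomeColumnName", myDataFrame)
rp <- rpart(fm, data = myDataFrame) # Use the formula to build a model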