Adding a new column to each element in a list of tables or data frames

I have a list of files (data frames). I also have a vector of "names", which I extract from the actual filenames with substr(). I would like to add a new column to each of the data frames in the list. This column should contain the corresponding element of "names", repeated once for each row of that data frame.

For example:

df1 <- data.frame(x = 1:3, y=letters[1:3])
df2 <- data.frame(x = 4:6, y=letters[4:6])
filelist <- list(df1,df2)
ID <- c("1A","IB")

Pseudocode

  for( i in length(filelist)){

       filelist[i]$SampleID <- rep(ID[i],nrow(filelist[i])

  }

// basically, create a new column in each of the data frames in filelist and fill it with repeated values of the corresponding element of ID

my output should be like:

filelist[1] should be:

   x y SampleID
 1 1 a       1A
 2 2 b       1A
 3 3 c       1A

filelist[2] should be:

   x y SampleID
 1 4 d       IB
 2 5 e       IB
 3 6 f       IB

and so on.....

Any idea how this could be done?


An alternative solution is to use cbind and take advantage of the fact that R recycles the values of a shorter vector.

For example:

x <- df2  # from above
cbind(x, NewColumn="Singleton")
 #    x y NewColumn
 #  1 4 d Singleton
 #  2 5 e Singleton
 #  3 6 f Singleton

There is no need to use rep; R recycles the value for you.

Therefore, you could put filelist[[i]] <- cbind(filelist[[i]], SampleID = ID[i]) in your for loop or, as @Sven pointed out, use the cleaner mapply:

filelist <- mapply(cbind, filelist, "SampleID"=ID, SIMPLIFY=FALSE)
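The loop variant mentioned above could look like the following minimal sketch. It assumes the df1, df2 and ID objects from the question; the column name is supplied as the argument name in the cbind call.

```r
# Data from the question
df1 <- data.frame(x = 1:3, y = letters[1:3])
df2 <- data.frame(x = 4:6, y = letters[4:6])
filelist <- list(df1, df2)
ID <- c("1A", "IB")

# cbind() recycles the single ID value over all rows of each data frame,
# so rep() is not needed
for (i in seq_along(filelist)) {
  filelist[[i]] <- cbind(filelist[[i]], SampleID = ID[i])
}

names(filelist[[1]])   # "x" "y" "SampleID"
```

The result has to be assigned back to filelist[[i]]; calling cbind alone would only print the new data frame and leave the list unchanged.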

This is a corrected version of your loop:

for( i in seq_along(filelist)){

  filelist[[i]]$SampleID <- rep(ID[i],nrow(filelist[[i]]))

}

There were 3 problems:

  • A final ) was missing after the command in the body.
  • Elements of lists are accessed by [[, not by [. [ returns a list of length one. [[ returns the element only.
  • length(filelist) is just one value, so the loop runs for the last element of the list only. I replaced it with seq_along(filelist).
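The second and third points can be checked directly at the console. This is a small illustration built on the filelist structure from the question, not part of the original code.

```r
df1 <- data.frame(x = 1:3, y = letters[1:3])
df2 <- data.frame(x = 4:6, y = letters[4:6])
filelist <- list(df1, df2)

# `[` keeps the list wrapper; `[[` extracts the data frame itself
class(filelist[1])      # "list"
class(filelist[[1]])    # "data.frame"
nrow(filelist[1])       # NULL, because a plain list has no dim attribute
nrow(filelist[[1]])     # 3

# length() yields a single number; seq_along() yields every index
length(filelist)        # 2
seq_along(filelist)     # 1 2
```

Because nrow(filelist[1]) is NULL, the rep() call in the original loop would have had nothing to repeat over even with the parenthesis fixed.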

A more concise approach is to use mapply for the task:

mapply(function(x, y) "[<-"(x, "SampleID", value = y) ,
       filelist, ID, SIMPLIFY = FALSE)

The purrr way, using map2:

library(dplyr)
library(purrr)

map2(filelist, ID, ~cbind(.x, SampleID = .y))

#[[1]]
#  x y SampleID
#1 1 a       1A
#2 2 b       1A
#3 3 c       1A

#[[2]]
#  x y SampleID
#1 4 d       IB
#2 5 e       IB
#3 6 f       IB

Alternatively, you can use mutate:

map2(filelist, ID, ~.x %>% mutate(SampleID = .y))

If you name the list, you can use imap and add the new column based on each element's name.

names(filelist) <- c("1A","IB")
imap(filelist, ~cbind(.x, SampleID = .y))
# or
# imap(filelist, ~.x %>% mutate(SampleID = .y))

This is similar to using Map:

Map(cbind, filelist, SampleID = names(filelist))