Same function over multiple data frames in R

I am new to R, and this is a very simple question. I've found a lot of similar things to what I want but not exactly it. Basically I have multiple data frames and I simply want to run the same function across all of them. A for-loop could work but I'm not sure how to set it up properly to call data frames. It also seems most prefer the lapply approach with R. I've played with the get function as well to no avail. I apologize if this is a duplicated question. Any help would be greatly appreciated!

Here's my over simplified example: 2 data frames: df1, df2

df1
start stop ID
0     10   x
10    20   y
20    30   z

df2
start stop ID
0     10   a
10    20   b
20    30   c

what I want is a 4th column with the average of start and stop for both dfs

df1
start stop ID  Avg
0     10   x    5 
10    20   y    15
20    30   z    25

I can do this one data frame at a time with:

df1$Avg <- rowMeans(subset(df1, select = c(start, stop)), na.rm = TRUE)

but I want to run it on all of the dataframes.

Make a list of data frames then use lapply to apply the function to them all.

df.list <- list(df1,df2,...)
res <- lapply(df.list, function(x) rowMeans(subset(x, select = c(start, stop)), na.rm = TRUE))
# to keep the original data.frame also
res <- lapply(df.list, function(x) cbind(x,"rowmean"=rowMeans(subset(x, select = c(start, stop)), na.rm = TRUE)))

The lapply will then feed in each data frame as x sequentially.

Put them into a list and then run rowMeans over the list.

df1 <- data.frame(x = rep(3, 5), y = seq(1, 5, 1), ID = letters[1:5])
df2 <- data.frame(x = rep(5, 5), y = seq(2, 6, 1), ID = letters[6:10])

lapply(list(df1, df2), function(w) { w$Avg <- rowMeans(w[1:2]); w })

 [[1]]
   x y ID Avg
 1 3 1  a 2.0
 2 3 2  b 2.5
 3 3 3  c 3.0
 4 3 4  d 3.5
 5 3 5  e 4.0

 [[2]]
   x y ID Avg
 1 5 2  f 3.5
 2 5 3  g 4.0
 3 5 4  h 4.5
 4 5 5  i 5.0
 5 5 6  j 5.5

In case you want all the outputs in the same file this may help.

 df1 <- data.frame(x = rep(3, 5), y = seq(1, 5, 1), ID = letters[1:5])
 df2 <- data.frame(x = rep(5, 5), y = seq(2, 6, 1), ID = letters[6:10])

 z=list(df1,df2)
 df=NULL
 for (i in z) {
 i$Avg=(i$x+i$y)/2
 df<-rbind(df,i)
 print (df)
 }

 > df
   x y ID Avg
1  3 1  a 2.0
2  3 2  b 2.5
3  3 3  c 3.0
4  3 4  d 3.5
5  3 5  e 4.0
6  5 2  f 3.5
7  5 3  g 4.0
8  5 4  h 4.5
9  5 5  i 5.0
10 5 6  j 5.5

Same function over multiple data frames in R

Related

Recent Posts