How to extract certain columns from a list of data frames
Here's an example. (This is the kind of thing you should have put in your question: you will get near-instantaneous help if you structure it around a clear, copy/pasteable, reproducible example like this.)
Problem:
# list of data frames:
l = list(mtcars, mtcars)
# vector of column names I would like to extract
my_names = c("mpg", "wt", "am")
# these columns might be at different positions in the data frames
Solution:
result = lapply(l, "[", , my_names)
# look at the top 6 rows of each to verify that it worked:
lapply(result, head)
# [[1]]
# mpg wt am
# Mazda RX4 21.0 2.620 1
# Mazda RX4 Wag 21.0 2.875 1
# Datsun 710 22.8 2.320 1
# Hornet 4 Drive 21.4 3.215 0
# Hornet Sportabout 18.7 3.440 0
# Valiant 18.1 3.460 0
#
# [[2]]
# mpg wt am
# Mazda RX4 21.0 2.620 1
# Mazda RX4 Wag 21.0 2.875 1
# Datsun 710 22.8 2.320 1
# Hornet 4 Drive 21.4 3.215 0
# Hornet Sportabout 18.7 3.440 0
# Valiant 18.1 3.460 0
Explanation: You essentially want to do l[[1]][, my_names], l[[2]][, my_names], and so on. lapply applies a function to every list element. In this case, the function is [, called on each data frame with the row index left blank (meaning all rows) and my_names as the column index. It returns the results in a list.
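If the blank argument in the call above looks cryptic, the same thing can be written with an explicit function. This is only a sketch of an equivalent form; the names result2 and one_col are made up for illustration. Adding drop = FALSE is optional here, but it keeps the result a data frame even if only one column is requested.

```r
l <- list(mtcars, mtcars)
my_names <- c("mpg", "wt", "am")

# Equivalent to lapply(l, "[", , my_names), spelled out with an anonymous function
result2 <- lapply(l, function(df) df[, my_names, drop = FALSE])

# With a single column, drop = FALSE returns a one-column data frame
# rather than a plain vector
one_col <- lapply(l, function(df) df[, "mpg", drop = FALSE])
```

Because the columns are selected by name, their position in each data frame does not matter.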
You can also use dplyr; it is easy to use and the syntax is clear:
library(dplyr)
l <- list(mtcars, mtcars) # a list of 2 data frames
new_list <- lapply(l, function(x) x %>% select(mpg, wt, am))
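If the column names are stored in a character vector, as in the question, select can take them through all_of() instead of having them typed out. A minimal sketch, assuming a reasonably recent dplyr (all_of() comes from tidyselect and is re-exported by dplyr); new_list2 is just an illustrative name:

```r
library(dplyr)

l <- list(mtcars, mtcars)
my_names <- c("mpg", "wt", "am")

# all_of() tells select() to treat my_names as a vector of column names;
# it raises an error if any of the names is missing from a data frame
new_list2 <- lapply(l, function(x) select(x, all_of(my_names)))
```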
Ciao!