Selecting a subset of columns in a data.table
I'd like to print all the columns of a data.table dt except one named V3, and I want to refer to that column by name rather than by number. This is the code that I have:
dt = data.table(matrix(sample(c(0,1),5,rep=T),50,10))
dt[,-3,with=FALSE] # Is this the only way to not print column "V3"?
With a data frame, one could do this with:
df = data.frame(matrix(sample(c(0,1),5,rep=T),50,10))
df[,!(colnames(df)%in% c("X3"))]
So, my question is: is there another way to exclude one column from a data.table without having to refer to it by number? I'd like something similar to the data frame syntax I used above, but for a data.table.
Solution 1:
Use syntax very similar to that for a data.frame, but add the argument with=FALSE:
dt[, setdiff(colnames(dt),"V9"), with=FALSE]
V1 V2 V3 V4 V5 V6 V7 V8 V10
1: 1 1 1 1 1 1 1 1 1
2: 0 0 0 0 0 0 0 0 0
3: 1 1 1 1 1 1 1 1 1
4: 0 0 0 0 0 0 0 0 0
5: 0 0 0 0 0 0 0 0 0
6: 1 1 1 1 1 1 1 1 1
The use of with=FALSE is nicely explained in the documentation for the j argument in ?data.table:
j: A single column name, single expression of column names, list() of expressions of column names, an expression or function call that evaluates to list (including data.frame and data.table which are lists, too), or (when with=FALSE) same as j in [.data.frame.
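To make the difference concrete, here is a minimal sketch on a small made-up table (the table and the names as_expr and as_cols are mine, not from the question). As far as I know, without with=FALSE a function call in j is evaluated as an ordinary expression, so you just get the vector of names back:

```r
library(data.table)

# Hypothetical toy table, not the one from the question
dt <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9)

# Default: j is evaluated as an expression, so the result is
# simply the character vector that setdiff() returns
as_expr <- dt[, setdiff(names(dt), "V3")]

# with = FALSE: the value of j is treated as column names,
# the same way [.data.frame would treat it
as_cols <- dt[, setdiff(names(dt), "V3"), with = FALSE]

print(as_expr)
print(as_cols)
```

The first call gives back column names only; the second gives a data.table containing those columns.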
From v1.10.2 onwards it is also possible to do this as follows:
keep <- setdiff(names(dt), "V9")
dt[, ..keep]
Prefixing a symbol with .. makes data.table look it up in the calling scope (e.g. the global environment when run at top level) and take its value to be column names or numbers.
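The calling scope does not have to be the global environment. Here is a small sketch, assuming data.table 1.10.2 or later; drop_columns is a hypothetical helper I made up for illustration, and the toy table is not the one from the question:

```r
library(data.table)

# Hypothetical toy table
dt <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9)

# Hypothetical helper: `keep` exists only inside the function,
# so ..keep is resolved in the function's frame, not globally
drop_columns <- function(x, drop) {
  keep <- setdiff(names(x), drop)
  x[, ..keep]
}

res <- drop_columns(dt, "V3")
print(names(res))
```

Since keep is never defined at top level, this only works because .. looks in the scope from which [ was called.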
Solution 2:
Edit 2019-09-27 with a more modern approach
You can do this with patterns as mentioned above; or, you can do it with ! if there's a vector of names already:
dt[ , !'V3']
# or
drop_cols = 'V3'
dt[ , !..drop_cols]
.. means "look up one level"
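For completeness, here is a rough sketch of what the patterns route could look like through .SDcols. I'm assuming a reasonably recent data.table where .SDcols accepts patterns() and a leading ! for negation; the toy table (including the extra V30 column) is made up to show why the regex anchors matter:

```r
library(data.table)

# Hypothetical toy table; V30 is there to show the effect of anchoring
dt <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9, V30 = 10:12)

# Drop a column by exact name
by_name <- dt[, .SD, .SDcols = !"V3"]

# Drop columns matching an anchored regex: only V3 goes
by_pattern <- dt[, .SD, .SDcols = !patterns("^V3$")]

# Without the trailing $, V30 matches as well
by_prefix <- dt[, .SD, .SDcols = !patterns("^V3")]

print(names(by_name))
print(names(by_pattern))
print(names(by_prefix))
```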
Older version using with=FALSE (data.table is moving away from this argument steadily)
Here's a way that uses grep to convert to numeric and allow negative column indexing:
dt[, -grep("^V3$", names(dt)), with=FALSE]
You did say "V3" was to be excluded, right?
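A note on why the pattern above is anchored with ^ and $: grep does substring matching by default, so an unanchored pattern can pick up more columns than intended. A quick base R illustration on a made-up vector of names shaped like the ones in the question (V1 is used here because, with ten columns, it is the name that collides with V10):

```r
# Names shaped like the question's columns (V1 ... V10)
cols <- paste0("V", 1:10)

# Unanchored: "V1" is a substring of both "V1" and "V10"
loose <- grep("V1", cols)

# Anchored: only the exact name matches
exact <- grep("^V1$", cols)

print(cols[loose])
print(cols[exact])
```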