Remove an entire column from a data.frame in R
Solution 1:
You can set it to NULL
.
> Data$genome <- NULL
> head(Data)
chr region
1 chr1 CDS
2 chr1 exon
3 chr1 CDS
4 chr1 exon
5 chr1 CDS
6 chr1 exon
As pointed out in the comments, here are some other possibilities:
Data[2] <- NULL # Wojciech Sobala
Data[[2]] <- NULL # same as above
Data <- Data[,-2] # Ian Fellows
Data <- Data[-2] # same as above
You can remove multiple columns via:
Data[1:2] <- list(NULL) # Marek
Data[1:2] <- NULL # does not work!
Be careful with matrix-subsetting though, as you can end up with a vector:
Data <- Data[,-(2:3)] # vector
Data <- Data[,-(2:3),drop=FALSE] # still a data.frame
Solution 2:
To remove one or more columns by name, when the column names are known (as opposed to being determined at run-time), I like the subset()
syntax. E.g. for the data-frame
df <- data.frame(a=1:3, d=2:4, c=3:5, b=4:6)
to remove just the a
column you could do
Data <- subset( Data, select = -a )
and to remove the b
and d
columns you could do
Data <- subset( Data, select = -c(d, b ) )
You can remove all columns between d
and b
with:
Data <- subset( Data, select = -c( d : b )
As I said above, this syntax works only when the column names are known. It won't work when say the column names are determined programmatically (i.e. assigned to a variable). I'll reproduce this Warning from the ?subset
documentation:
Warning:
This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like '[', and in particular the non-standard evaluation of argument 'subset' can have unanticipated consequences.