Get the column number in R given the column name [duplicate]
Possible Duplicate:
Get column index from label in a data frame
I need to get the column number of a column given its name.
Suppose we have the following data frame:
df <- data.frame(a=rnorm(100),b=rnorm(100),c=rnorm(100))
I need a function that would work like the following:
getColumnNumber(df,"b")
And it would return
[1] 2
Is there a function like that?
Thanks!
which( colnames(df)=="b" )
Should do it.
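Wrapped into the function the question asks for, this might look like the sketch below. getColumnNumber is the name used in the question, not an existing R function, and stopping with an error on a missing name is an assumption about the desired behaviour.

```r
# Hypothetical helper, named after the function requested in the question.
getColumnNumber <- function(data, name) {
  idx <- which(colnames(data) == name)
  # which() returns integer(0) when nothing matches; fail loudly instead
  if (length(idx) == 0L) {
    stop("no column named '", name, "'")
  }
  idx
}

df <- data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100))
getColumnNumber(df, "b")
# [1] 2
```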
One fast and neat method is:
> match("b",names(df))
[1] 2
That avoids the vector scan that == and which do. If you have a lot of columns, and you do this a lot, then you might like the fastmatch package.
> require(fastmatch)
> fmatch("b",names(df))
[1] 2
fmatch is faster than match, but on subsequent calls it's not just faster, it's instant.
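One practical difference between the == / which approach and match is what they return for a missing name and for several names at once. A small sketch using base R only, where "z" is just an example of a name that is not in df:

```r
df <- data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100))

# A missing name: which() gives an empty vector, match() gives NA
which(colnames(df) == "z")
# integer(0)
match("z", names(df))
# [1] NA

# match() is vectorised over its first argument and keeps the requested order
match(c("c", "a"), names(df))
# [1] 3 1
```

So code that uses the result as an index should check for NA (with match) or for a zero-length result (with which) before going further.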
Another method, which generalizes better to non-exact matching tasks, is to use grep, which returns the numeric positions of the elements of a character vector that match a pattern:
grep("^b$", colnames(df) )
If you wanted to remove by position number all of the columns whose names begin with "b", you would write:
df[ , - grep("^b", colnames(df) )]
That neatly finesses the issue that you cannot use negative indexing with character vectors.
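One caveat with negative indexing on a grep result: if the pattern matches nothing, grep returns integer(0), and negating that drops every column instead of none. The sketch below shows this with a pattern "^z" (chosen only because no column in df matches it) and a grepl-based alternative that is safe in that case:

```r
df <- data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100))

# No column name starts with "z", so grep() returns integer(0) ...
idx <- grep("^z", colnames(df))
# ... and negating integer(0) still gives integer(0), which selects no columns
ncol(df[, -idx])
# [1] 0

# A logical mask from grepl() keeps every column when nothing matches
ncol(df[, !grepl("^z", colnames(df)), drop = FALSE])
# [1] 3
```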
In particular, if you need to get several column indices, the following approach applies:
> df <- data.frame(a=rnorm(100),b=rnorm(100),c=rnorm(100))
> which(names(df)%in%c("b", "c"))
[1] 2 3
If you use this for subsetting df, you don't need which():
> df_sub <- df[, names(df)%in%c("b", "c")]
> head(df_sub)
           b          c
1  0.1712754  0.3119079
2 -1.3656995  0.7111664
3 -0.2176488  0.7714348
4 -0.6599826 -0.3528118
5  0.4510227 -1.6438053
6  0.2451216  2.5305453
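Two details of the %in% approach are worth checking before relying on it; the following is a sketch using base R only. The positions come back in column order regardless of the order in which the names are listed, and selecting a single column returns a plain vector unless drop = FALSE is given:

```r
df <- data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100))

# %in% reports positions in column order, whatever order the names are given in
which(names(df) %in% c("c", "b"))
# [1] 2 3

# Selecting a single column this way returns a plain vector by default;
# drop = FALSE keeps the result a data frame
is.data.frame(df[, names(df) %in% "b"])
# [1] FALSE
is.data.frame(df[, names(df) %in% "b", drop = FALSE])
# [1] TRUE
```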