How to convert a data frame column to numeric type?
How do you convert a data frame column to a numeric type?
Since (still) nobody got check-mark, I assume that you have some practical issue in mind, mostly because you haven't specified what type of vector you want to convert to numeric
. I suggest that you should apply transform
function in order to complete your task.
Now I'm about to demonstrate certain "conversion anomaly":
# create dummy data.frame
d <- data.frame(char = letters[1:5],
fake_char = as.character(1:5),
fac = factor(1:5),
char_fac = factor(letters[1:5]),
num = 1:5, stringsAsFactors = FALSE)
Let us have a glance at data.frame
> d
char fake_char fac char_fac num
1 a 1 1 a 1
2 b 2 2 b 2
3 c 3 3 c 3
4 d 4 4 d 4
5 e 5 5 e 5
and let us run:
> sapply(d, mode)
char fake_char fac char_fac num
"character" "character" "numeric" "numeric" "numeric"
> sapply(d, class)
char fake_char fac char_fac num
"character" "character" "factor" "factor" "integer"
Now you probably ask yourself "Where's an anomaly?" Well, I've bumped into quite peculiar things in R, and this is not the most confounding thing, but it can confuse you, especially if you read this before rolling into bed.
Here goes: first two columns are character
. I've deliberately called 2nd one fake_char
. Spot the similarity of this character
variable with one that Dirk created in his reply. It's actually a numerical
vector converted to character
. 3rd and 4th column are factor
, and the last one is "purely" numeric
.
If you utilize transform
function, you can convert the fake_char
into numeric
, but not the char
variable itself.
> transform(d, char = as.numeric(char))
char fake_char fac char_fac num
1 NA 1 1 a 1
2 NA 2 2 b 2
3 NA 3 3 c 3
4 NA 4 4 d 4
5 NA 5 5 e 5
Warning message:
In eval(expr, envir, enclos) : NAs introduced by coercion
but if you do same thing on fake_char
and char_fac
, you'll be lucky, and get away with no NA's:
> transform(d, fake_char = as.numeric(fake_char),
char_fac = as.numeric(char_fac))
char fake_char fac char_fac num
1 a 1 1 1 1
2 b 2 2 2 2
3 c 3 3 3 3
4 d 4 4 4 4
5 e 5 5 5 5
If you save transformed data.frame
and check for mode
and class
, you'll get:
> D <- transform(d, fake_char = as.numeric(fake_char),
char_fac = as.numeric(char_fac))
> sapply(D, mode)
char fake_char fac char_fac num
"character" "numeric" "numeric" "numeric" "numeric"
> sapply(D, class)
char fake_char fac char_fac num
"character" "numeric" "factor" "numeric" "integer"
So, the conclusion is: Yes, you can convert character
vector into a numeric
one, but only if it's elements are "convertible" to numeric
. If there's just one character
element in vector, you'll get error when trying to convert that vector to numerical
one.
And just to prove my point:
> err <- c(1, "b", 3, 4, "e")
> mode(err)
[1] "character"
> class(err)
[1] "character"
> char <- as.numeric(err)
Warning message:
NAs introduced by coercion
> char
[1] 1 NA 3 4 NA
And now, just for fun (or practice), try to guess the output of these commands:
> fac <- as.factor(err)
> fac
???
> num <- as.numeric(fac)
> num
???
Kind regards to Patrick Burns! =)
Something that has helped me: if you have ranges of variables to convert (or just more then one), you can use sapply
.
A bit nonsensical but just for example:
data(cars)
cars[, 1:2] <- sapply(cars[, 1:2], as.factor)
Say columns 3, 6-15 and 37 of you dataframe need to be converted to numeric one could:
dat[, c(3,6:15,37)] <- sapply(dat[, c(3,6:15,37)], as.numeric)
if x
is the column name of dataframe dat
, and x
is of type factor, use:
as.numeric(as.character(dat$x))