Count values separated by a comma in a character string
I have this example data
d<-"30,3"
class(d)
I have character objects like this in one column of my working data frame, and I need to identify how many numbers each one contains.
I have tried length(d), but it returns 1.
After looking for a solution here, I have tried
eval(parse(text='d'))
as.numeric(d)
as.vector.character(d)
But it still doesn't work.
Is there a straightforward approach to solve this problem?
These two approaches are each short, work on vectors of strings, do not involve the expense of explicitly constructing the split string, and do not use any packages. Here d is a vector of strings such as d <- c("1,2,3", "5,2"):
1) count.fields
count.fields(textConnection(d), sep = ",")
2) gregexpr
lengths(gregexpr(",", d)) + 1
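One edge case is worth checking with the gregexpr approach: a string that contains no comma at all. The following is a minimal sketch, assuming such single-value strings can occur in the data (the value "7" is made up for illustration). It uses regmatches, which is a swapped-in variation and not part of the answer above.

```r
d <- c("1,2,3", "5,2", "7")   # "7" is a hypothetical entry with no comma

# gregexpr() returns -1 (a vector of length 1) when nothing matches,
# so the plain count treats "7" as if it contained one comma
plain <- lengths(gregexpr(",", d, fixed = TRUE)) + 1

# regmatches() drops the -1 placeholder, leaving zero matches for "7"
n <- lengths(regmatches(d, gregexpr(",", d, fixed = TRUE))) + 1

plain
n
```

If every string is known to hold at least two values, the shorter form is sufficient.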
You could use scan.
v1 <- scan(text=d, sep=',', what=numeric(), quiet=TRUE)
v1
#[1] 30 3
Or use stri_split from stringi. This should accept both character and factor input without an explicit conversion via as.character.
library(stringi)
v2 <- as.numeric(unlist(stri_split(d,fixed=',')))
v2
#[1] 30 3
You can get the count in base R with
length(v1)
#[1] 2
Or
nchar(gsub('[^,]', '', d))+1
#[1] 2
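The regex [^,] matches any single character that is not a comma, so gsub removes everything except the commas. Counting the remaining characters and adding 1 gives the number of values.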
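The counts above are shown for a single string. For a vector of strings the two base R approaches behave differently, which the sketch below tries to illustrate. The sapply wrapper is an assumption about how one might apply scan per element; it is not taken from the answer.

```r
d <- c("30,3,5", "30,5")

# scan() on the whole vector pools all values together (5 numbers here),
# so call it on one string at a time to get a count per element
n_scan <- sapply(d, function(s) {
  length(scan(text = s, sep = ",", what = numeric(), quiet = TRUE))
}, USE.NAMES = FALSE)

# The comma-counting expression is already vectorized over d
n_gsub <- nchar(gsub("[^,]", "", d)) + 1

n_scan  # 3 2
n_gsub  # 3 2
```

The gsub version avoids parsing the numbers at all, so it is likely the cheaper choice when only the count is needed.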
Update
If d is a column in a dataset df and you want to subset the rows where the number of values equals 2:
d<-c("30,3,5","30,5")
df <- data.frame(d,stringsAsFactors=FALSE)
df[nchar(gsub('[^,]', '',df$d))+1==2,,drop=FALSE]
# d
#2 30,5
Just to test a count that no row matches:
df[nchar(gsub('[^,]', '',df$d))+1==10,,drop=FALSE]
#[1] d
#<0 rows> (or 0-length row.names)
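If the count is needed more than once, it may be convenient to store it as its own column and filter on that. This is only a sketch: the column name n_values and the extra row "7" are hypothetical, and it uses strsplit, which does construct the split strings, unlike the gsub expression above.

```r
df <- data.frame(d = c("30,3,5", "30,5", "7"), stringsAsFactors = FALSE)

# as.character() guards against d having been read in as a factor
df$n_values <- lengths(strsplit(as.character(df$d), ",", fixed = TRUE))

# Keep only the rows that hold exactly two values
two <- df[df$n_values == 2, , drop = FALSE]
two
```

Keeping the count in a column also makes it easy to tabulate, for example with table(df$n_values).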