Count word occurrences in R

Is there a function for counting the number of times a particular keyword is contained in a dataset?

For example, if dataset <- c("corn", "cornmeal", "corn on the cob", "meal") the count would be 3.


Solution 1:

Let's assume for the moment that you want the number of elements containing "corn":

length(grep("corn", dataset))
[1] 3

Once you are more comfortable with the basics of R, you may want to look at the "tm" text-mining package.

EDIT: I realize that this time around you wanted any occurrence of "corn", but in the future you might want only the whole word "corn". Over on r-help, Bill Dunlap pointed out a more compact grep pattern for matching whole words:

grep("\\<corn\\>", dataset)
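
To make the difference between the two patterns concrete, here is a small sketch using the example dataset from the question. The variable names any_corn and word_corn are only illustrative, and sum(grepl(...)) is used as one equivalent way of counting matching elements:

```r
dataset <- c("corn", "cornmeal", "corn on the cob", "meal")

# Elements that contain "corn" anywhere, including inside "cornmeal"
any_corn <- sum(grepl("corn", dataset))

# Elements that contain "corn" as a whole word (\\< and \\> mark word edges)
word_corn <- sum(grepl("\\<corn\\>", dataset))

any_corn   # 3
word_corn  # 2
```

Note that both counts are per element: an element containing the keyword twice would still be counted once.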

Solution 2:

Another quite convenient and intuitive way to do it is to use the str_count function of the stringr package:

library(stringr)
dataset <- c("corn", "cornmeal", "corn on the cob", "meal")

# for mere occurrences of the pattern:
str_count(dataset, "corn")
# [1] 1 1 1 0

# for occurrences of the word alone:
str_count(dataset, "\\bcorn\\b")
# [1] 1 0 1 0

# summing it up
sum(str_count(dataset, "corn"))
# [1] 3
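
If you would rather not add a package dependency, a similar per-element count can be sketched in base R with gregexpr and regmatches. This is only an approximation of what str_count does, and the extra element "popcorn and corn" is a made-up example added to show an element that contains the pattern twice:

```r
dataset <- c("corn", "cornmeal", "corn on the cob", "meal")

# Hypothetical extra element, not part of the original question
x <- c(dataset, "popcorn and corn")

# gregexpr finds every match in each element; regmatches extracts them,
# and lengths counts how many were found per element (0 if none)
per_element <- lengths(regmatches(x, gregexpr("corn", x)))

per_element              # 1 1 1 0 2
sum(per_element)         # 5 occurrences in total
sum(grepl("corn", x))    # 4 elements contain the pattern
```

The last two lines show why it matters whether you count occurrences or elements: they agree on the original dataset but differ as soon as one element contains the keyword more than once.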

Solution 3:

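If you only want the elements that are exactly equal to "corn", you can also do something like the following. Note that for the example dataset this gives 1, not 3, because "cornmeal" and "corn on the cob" are not exact matches: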

length(dataset[which(dataset=="corn")])
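
A shorter way to get the same exact-match count is to sum the logical comparison directly. This is a sketch rather than a drop-in replacement: unlike which(), a plain sum() returns NA if the vector contains missing values, so na.rm = TRUE is added for that case. The names exact_count and exact_count_na are only illustrative:

```r
dataset <- c("corn", "cornmeal", "corn on the cob", "meal")

# TRUE counts as 1 and FALSE as 0, so sum() counts the exact matches
exact_count <- sum(dataset == "corn")

# With missing values present, drop the NA comparisons before summing
exact_count_na <- sum(c(dataset, NA) == "corn", na.rm = TRUE)

exact_count     # 1
exact_count_na  # 1
```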