Creating a new variable from a lookup table

Here is how to use a named vector for the lookup:

Define test data:

dat <- data.frame(
    presult = c(rep("I", 4), "SS", "ZZ"),
    aresult = c("single", "double", "triple", "home run", "strikeout", "home run"),
    stringsAsFactors=FALSE
)

Define a named numeric vector with the scores:

score <- c(single=1, double=2, triple=3, `home run`=4, strikeout=0)

Use vector indexing to match the scores against results:

dat$base <- score[dat$aresult]
dat
  presult   aresult base
1       I    single    1
2       I    double    2
3       I    triple    3
4       I  home run    4
5      SS strikeout    0
6      ZZ  home run    4
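
One thing to watch with this approach: a result that has no entry in the named vector comes back as NA rather than raising an error, and the looked-up values carry their names along. A small sketch of both points, where "walk" is a made-up value that is not in the original data:

```r
score <- c(single = 1, double = 2, triple = 3, `home run` = 4, strikeout = 0)

# "walk" is a hypothetical value with no entry in the lookup vector
results <- c("single", "walk", "home run")

# unname() drops the names that indexing carries over from the lookup vector
base <- unname(score[results])
base

# List the values that did not match anything in the lookup
unmatched <- results[is.na(base)]
unmatched
```

Checking for NA after the lookup is a cheap way to catch typos or new categories in the data.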

Additional information:

If you don't want to type out the named vector by hand, for example because the lookup has many entries, build the values and the names separately:

score <- c(1:4, 0)
names(score) <- c("single", "double", "triple", "home run", "strikeout")

(Or read the values and names from existing data. The point is to construct a numeric vector and then assign names.)
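
As a rough sketch of the "existing data" case, assuming the lookup lives in a two-column data frame (the data frame below is invented for illustration and could just as well come from read.csv()):

```r
# Hypothetical lookup table; in practice this might be read from a file
lookup <- data.frame(
  aresult = c("single", "double", "triple", "home run", "strikeout"),
  base = c(1, 2, 3, 4, 0),
  stringsAsFactors = FALSE
)

# setNames() builds the named vector in one step: values first, names second
score <- setNames(lookup$base, lookup$aresult)
score[["home run"]]
```

The resulting vector can then be indexed with the result column in the same way as the hand-written one.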


Define your lookup table:

lookup <- data.frame(
    base = c(0, 1, 2, 3, 4),
    aresult = c("strikeout", "single", "double", "triple", "home run"))

Then use join() from the plyr package:

library(plyr)
dataset <- join(dataset, lookup, by = "aresult")
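
If you would rather not depend on plyr, the same lookup can be sketched in base R with match(). This is a substitute technique, not what join() does internally, and the small dataset below is only a stand-in for the question's data:

```r
lookup <- data.frame(
  base = c(0, 1, 2, 3, 4),
  aresult = c("strikeout", "single", "double", "triple", "home run"),
  stringsAsFactors = FALSE
)

# Stand-in for the question's data
dataset <- data.frame(
  aresult = c("single", "double", "triple", "home run", "strikeout", "home run"),
  stringsAsFactors = FALSE
)

# match() returns, for each result, its row position in the lookup table
idx <- match(dataset$aresult, lookup$aresult)
dataset$base <- lookup$base[idx]
dataset
```

Like join(), this keeps the rows of dataset in their original order; base merge() would also work but sorts the result by the key column by default.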

An alternative to Dieter's answer:

dat <- data.frame(
  presult = c(rep("I", 4), "SS", "ZZ"),
  aresult = c("single", "double", "triple", "home run", "strikeout", "home run"),
  stringsAsFactors=FALSE
)

dat$base <- as.integer(factor(dat$aresult,
  levels=c("strikeout","single","double","triple","home run")))-1

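dataset$base <- as.integer(as.factor(dataset$aresult))

Note that as.factor() sorts the levels alphabetically by default, so these integer codes identify each result but do not equal the base values (double becomes 1, home run becomes 2, and so on). To get the actual scores, pass the levels explicitly as in the previous answer. as.factor() can be omitted only when the column is already a factor, for example when it was read with read.table() in R versions before 4.0.0, where strings were converted to factors by default.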
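
To see which integer each result actually receives, here is a short sketch using the five result names from this thread:

```r
aresult <- c("single", "double", "triple", "home run", "strikeout")

# Without explicit levels, the levels are sorted alphabetically
f <- factor(aresult)
levels(f)

# The integer codes are positions in levels(f), not base values
data.frame(aresult, code = as.integer(f))
```

The codes only line up with the scores when the levels are given in score order, which is why the version with an explicit levels argument subtracts 1 and gets the right answer.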