Generate a dummy variable

I am having trouble generating the following dummy variables in R:

I'm analyzing yearly time series data (time period 1948-2009). I have two questions:

  1. How do I generate a dummy variable for observation #10, i.e. for year 1957 (value = 1 at 1957 and zero otherwise)?

  2. How do I generate a dummy variable which is zero before 1957 and takes the value 1 from 1957 onwards (through 2009)?


Solution 1:

Another option, which can work better if you need many dummy variables at once, is to combine factor and model.matrix.

year.f = factor(year)
dummies = model.matrix(~year.f)

This produces an intercept column (all ones) plus one column for each year in your data set except one. The omitted year is the reference (baseline) level; with R's default treatment contrasts this is the first factor level, i.e. the earliest year.

You can change which level serves as the reference through the contrasts.arg argument of model.matrix.
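As a rough sketch of what that might look like, the snippet below switches the reference level from the first year to the last by passing the built-in contr.SAS contrast function through contrasts.arg. The year vector 1948:2009 is taken from the question; the object names dummies and dummies_last are only illustrative.

```r
# Assumed data: the yearly series from the question
year <- 1948:2009
year.f <- factor(year)

# Default (treatment contrasts): the first level, 1948, is the reference,
# so it gets no column of its own
dummies <- model.matrix(~ year.f)

# contr.SAS is treatment coding with the LAST level as the reference,
# so here 2009 is the year without its own column
dummies_last <- model.matrix(~ year.f,
                             contrasts.arg = list(year.f = "contr.SAS"))

# Inspect which years have columns in each version
head(colnames(dummies))
tail(colnames(dummies_last))
```

If you want an arbitrary year as the reference, relevel(year.f, ref = "1957") before calling model.matrix is another way to get there.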

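Also, if you want to omit the intercept, you can either drop the first column of the result or add +0 to the end of the formula. Note that the two are not equivalent: dropping the first column leaves dummies for every year except the reference year, whereas +0 yields a full set of dummies, one column for every year.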
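To make the two ways of removing the intercept concrete, here is a small sketch using the 1948-2009 series from the question. The object names (with_intercept, dropped, full_set, d1957) are made up for illustration.

```r
# Assumed data: the yearly series from the question
year <- 1948:2009
year.f <- factor(year)

# Intercept plus 61 year dummies (1948 is the reference level)
with_intercept <- model.matrix(~ year.f)

# Option A: drop the intercept column; 61 dummies remain, none for 1948
dropped <- with_intercept[, -1]

# Option B: suppress the intercept in the formula; one dummy per year
full_set <- model.matrix(~ year.f + 0)

dim(dropped)   # 62 rows, 61 columns
dim(full_set)  # 62 rows, 62 columns

# The single-year dummy asked for in question 1 is one column of full_set
d1957 <- full_set[, "year.f1957"]
which(d1957 == 1)  # position of 1957 in the series
```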

Hope this is useful.

Solution 2:

The simplest way to produce these dummy variables is something like the following:

> print(year)
[1] 1956 1957 1957 1958 1958 1959
> dummy <- as.numeric(year == 1957)
> print(dummy)
[1] 0 1 1 0 0 0
> dummy2 <- as.numeric(year >= 1957)
> print(dummy2)
[1] 0 1 1 1 1 1
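
Applied to the series in the question, the same idea might look like the sketch below. It assumes year is simply the vector 1948:2009; pulse and step are illustrative names for the two dummies.

```r
# Assumed data: one observation per year, 1948 through 2009
year <- 1948:2009

# Question 1: 1 in 1957 only, 0 otherwise
pulse <- as.numeric(year == 1957)

# Question 2: 0 before 1957, 1 from 1957 onwards
step <- as.numeric(year >= 1957)

which(pulse == 1)  # 10, i.e. observation #10
sum(step)          # 53 years from 1957 through 2009

# Collect everything in one data frame for modelling
dat <- data.frame(year, pulse, step)
```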

More generally, you can use ifelse to choose between two values depending on a condition. So if instead of a 0-1 dummy variable, for some reason you wanted to use, say, 4 and 7, you could use ifelse(year == 1957, 4, 7).
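
A quick sketch of the ifelse variant, reusing the small example vector from above; coded and period are illustrative names only.

```r
year <- c(1956, 1957, 1957, 1958, 1958, 1959)

# 4 where the year is 1957, 7 everywhere else
coded <- ifelse(year == 1957, 4, 7)
print(coded)  # 7 4 4 7 7 7

# The two values need not be numeric, e.g. labels for a before/after split
period <- ifelse(year >= 1957, "post", "pre")
print(period)
```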