What's the biggest R-gotcha you've run across?

Is there a particular R-gotcha that really surprised you one day? I think we'd all gain from sharing these.

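Here's mine: in list indexing, my.list[[1]] is not the same as my.list[1]. The first returns the element itself; the second returns a list of length one that contains it. I learned this in my early days with R.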
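A minimal sketch of the difference, using a made-up two-element list (the names a and b and the variables first.elem and first.sub are only for illustration):

```r
my.list <- list(a = 1:3, b = "text")

first.elem <- my.list[[1]]  # the element itself: an integer vector
first.sub  <- my.list[1]    # a list of length one that still wraps the element

class(first.elem)   # "integer"
class(first.sub)    # "list"

sum(first.elem)     # 6
# sum(first.sub) raises an error, because sum() cannot add up a list
```

The single bracket always keeps the container type, which is handy for subsetting several elements at once but surprising when you wanted the contents.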


Solution 1:

[Hadley pointed this out in a comment.]

When using a sequence as an index for iteration, it's better to use the seq_along() function rather than something like 1:length(x).

Here I create a vector and both approaches return the same thing:

> x <- 1:10
> 1:length(x)
 [1]  1  2  3  4  5  6  7  8  9 10
> seq_along(x)
 [1]  1  2  3  4  5  6  7  8  9 10

Now make the vector NULL:

> x <- NULL
> seq_along(x) # returns an empty integer vector; good behavior
integer(0)
> 1:length(x) # length(x) is 0, so 1:0 counts down from 1 to 0; this is bad
[1] 1 0

In a loop, this means the body runs twice with bogus indices instead of not running at all:

> for(i in 1:length(x)) print(i)
[1] 1
[1] 0
> for(i in seq_along(x)) print(i)
>
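The same trap shows up when you iterate over a count rather than over an object, for example 1:n where n might be zero. A small sketch, assuming n stands in for something like the row count of an empty data frame; seq_len() is the base R counterpart of seq_along() for this case:

```r
n <- 0L               # stand-in for e.g. nrow() of an empty data frame

safe  <- seq_len(n)   # integer(0): a loop over this runs zero times
risky <- 1:n          # 1 0: a loop over this runs twice

length(safe)
length(risky)
```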

Solution 2:

The automatic creation of factors when you load data (the default for data.frame() and read.table() before R 4.0.0). You unthinkingly treat a column in a data frame as characters, and this works well until you try to change a value to one that isn't an existing level. That generates a warning but leaves your data frame with NAs in it ...
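A short sketch of how this bites, using a made-up data frame (the column name fruit and its values are purely illustrative). stringsAsFactors = TRUE is passed explicitly so the column is a factor whatever your R version's default is:

```r
df <- data.frame(fruit = c("apple", "banana"), stringsAsFactors = TRUE)

# "cherry" is not an existing level, so this assignment only warns
# and silently stores NA in the data frame.
df$fruit[2] <- "cherry"
is.na(df$fruit[2])

# One possible way out: convert the column to character before editing it.
df$fruit <- as.character(df$fruit)
df$fruit[2] <- "cherry"
df$fruit
```

Converting to character is only one option; adding the new level to the factor first would also work if you actually want a factor.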

When something goes unexpectedly wrong in your R script, check that factors aren't to blame.