Why is using `<<-` frowned upon and how can I avoid it?

I followed the discussion over here and am curious why using <<- is frowned upon in R. What kind of confusion does it cause?

I would also like some tips on how to avoid <<-. I use it quite often, for example:

### Create a dummy 10 x 10 integer matrix.
### Each cell contains a number between 1 and 6.
df <- do.call("rbind", lapply(1:10, function(i) sample(1:6, 10, replace = TRUE)))

What I want to achieve is to shift every number from 2 upwards down by 1, i.e. all the 2s will become 1s, all the 3s will become 2s, etc. In other words, every n >= 2 becomes n-1. I achieve this with the following:

df.rescaled <- df
sapply(2:6, function(i) df.rescaled[df.rescaled == i] <<- i-1)

In this instance, how can I avoid <<-? Ideally I would want to be able to pipe the sapply results into another variable along the lines of:

df.rescaled <- sapply(...)

Solution 1:

First point

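<<- is NOT simply the operator for assigning to a global variable. It looks through the enclosing (parent) environments for an existing variable of that name and assigns to the nearest one it finds; only if none is found does it assign in the global environment. So, for example, this can cause confusion: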

f <- function() {
    a <- 2
    g <- function() {
        a <<- 3    # modifies the `a` inside f, not the global one
    }
    g()            # call g so that the assignment actually runs
}

then,

> a <- 1
> f()
> a # the global `a` is not affected
[1] 1
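To make the lookup rule concrete, here is a minimal sketch. The names outer_fn, inner_fn, x and y are made up for illustration. It shows both cases: <<- changing the nearest enclosing binding when one exists, and falling through to the global environment when none does.

```r
x <- "global x"

outer_fn <- function() {
    x <- "outer x"                   # local to outer_fn
    inner_fn <- function() {
        x <<- "changed by inner"     # an x exists in outer_fn, so that one is modified
        y <<- "created by inner"     # no y in any enclosing function, so y becomes global
    }
    inner_fn()
    x                                # return outer_fn's own x
}

result <- outer_fn()
result   # "changed by inner"
x        # still "global x"
y        # "created by inner"
```

So the same operator touched a function-local variable in one line and silently created a global in the next, which is why its effect is hard to predict just by reading the call.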

Second point

You can do that by using Reduce:

Reduce(function(a, b) {a[a==b] <- a[a==b]-1; a}, 2:6, df)

or apply:

apply(df, c(1, 2), function(i) if(i >= 2) {i-1} else {i})

But simply, this is sufficient:

ifelse(df >= 2, df-1, df)
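As a further sketch, plain logical-index assignment also does the job without <<- or any loop, and it can be checked against the approaches above. The set.seed call and the names idx, via.ifelse and via.reduce are additions for this illustration only.

```r
set.seed(42)  # only to make the example reproducible; any seed works
df <- do.call("rbind", lapply(1:10, function(i) sample(1:6, 10, replace = TRUE)))

# Logical-index assignment: every cell >= 2 drops by one, the 1s stay at 1
df.rescaled <- df
idx <- df.rescaled >= 2
df.rescaled[idx] <- df.rescaled[idx] - 1

# The same computation via two of the approaches shown above
via.ifelse <- ifelse(df >= 2, df - 1, df)
via.reduce <- Reduce(function(a, b) { a[a == b] <- a[a == b] - 1; a }, 2:6, df)

all(df.rescaled == via.ifelse)  # TRUE
all(df.rescaled == via.reduce)  # TRUE
```

All three results are assigned with an ordinary <-, which is the "df.rescaled <- ..." shape asked for in the question.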

Solution 2:

You can think of <<- as global assignment (approximately: as kohske points out, it assigns in the top-level environment unless a variable of that name already exists in a closer enclosing environment). Examples of why this is bad are here:

Examples of the perils of globals in R and Stata
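A small illustration of the kind of problem meant, not taken from the linked page: the function names add_global and add_pure and the variables total and running are hypothetical. The first function changes state the caller never passed in; the second takes the state as an argument and returns the new value.

```r
total <- 0

# Hidden side effect: nothing at the call site shows that `total` changes
add_global <- function(n) {
    total <<- total + n
}

# No side effect: state goes in as an argument, new state comes out
add_pure <- function(total, n) {
    total + n
}

add_global(5)
add_global(5)
total     # 10, changed behind the caller's back

running <- 0
running <- add_pure(running, 5)
running <- add_pure(running, 5)
running   # 10, and every change is visible where it happens
```

With the second style, any function can be tested in isolation and reordered safely, because its result depends only on its arguments.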