Global and local variables in R
Variables declared inside a function are local to that function. For instance:
foo <- function() {
bar <- 1
}
foo()
bar
gives the following error: Error: object 'bar' not found
.
If you want to make bar
a global variable, you should do:
foo <- function() {
bar <<- 1
}
foo()
bar
In this case bar
is accessible from outside the function.
However, unlike C, C++ or many other languages, brackets do not determine the scope of variables. For instance, in the following code snippet:
if (x > 10) {
y <- 0
}
else {
y <- 1
}
y
remains accessible after the if-else
statement.
As you well say, you can also create nested environments. You can have a look at these two links for understanding how to use them:
- http://stat.ethz.ch/R-manual/R-devel/library/base/html/environment.html
- http://stat.ethz.ch/R-manual/R-devel/library/base/html/get.html
Here you have a small example:
test.env <- new.env()
assign('var', 100, envir=test.env)
# or simply
test.env$var <- 100
get('var') # var cannot be found since it is not defined in this environment
get('var', envir=test.env) # now it can be found
<-
does assignment in the current environment.
When you're inside a function R creates a new environment for you. By default it includes everything from the environment in which it was created so you can use those variables as well but anything new you create will not get written to the global environment.
In most cases <<-
will assign to variables already in the global environment or create a variable in the global environment even if you're inside a function. However, it isn't quite as straightforward as that. What it does is checks the parent environment for a variable with the name of interest. If it doesn't find it in your parent environment it goes to the parent of the parent environment (at the time the function was created) and looks there. It continues upward to the global environment and if it isn't found in the global environment it will assign the variable in the global environment.
This might illustrate what is going on.
bar <- "global"
foo <- function(){
bar <- "in foo"
baz <- function(){
bar <- "in baz - before <<-"
bar <<- "in baz - after <<-"
print(bar)
}
print(bar)
baz()
print(bar)
}
> bar
[1] "global"
> foo()
[1] "in foo"
[1] "in baz - before <<-"
[1] "in baz - after <<-"
> bar
[1] "global"
The first time we print bar we haven't called foo
yet so it should still be global - this makes sense. The second time we print it's inside of foo
before calling baz
so the value "in foo" makes sense. The following is where we see what <<-
is actually doing. The next value printed is "in baz - before <<-" even though the print statement comes after the <<-
. This is because <<-
doesn't look in the current environment (unless you're in the global environment in which case <<-
acts like <-
). So inside of baz
the value of bar stays as "in baz - before <<-". Once we call baz
the copy of bar inside of foo
gets changed to "in baz" but as we can see the global bar
is unchanged. This is because the copy of bar
that is defined inside of foo
is in the parent environment when we created baz
so this is the first copy of bar
that <<-
sees and thus the copy it assigns to. So <<-
isn't just directly assigning to the global environment.
<<-
is tricky and I wouldn't recommend using it if you can avoid it. If you really want to assign to the global environment you can use the assign function and tell it explicitly that you want to assign globally.
Now I change the <<-
to an assign statement and we can see what effect that has:
bar <- "global"
foo <- function(){
bar <- "in foo"
baz <- function(){
assign("bar", "in baz", envir = .GlobalEnv)
}
print(bar)
baz()
print(bar)
}
bar
#[1] "global"
foo()
#[1] "in foo"
#[1] "in foo"
bar
#[1] "in baz"
So both times we print bar inside of foo
the value is "in foo" even after calling baz
. This is because assign
never even considered the copy of bar
inside of foo because we told it exactly where to look. However, this time the value of bar in the global environment was changed because we explicitly assigned there.
Now you also asked about creating local variables and you can do that fairly easily as well without creating a function... We just need to use the local
function.
bar <- "global"
# local will create a new environment for us to play in
local({
bar <- "local"
print(bar)
})
#[1] "local"
bar
#[1] "global"
A bit more along the same lines
attrs <- {}
attrs.a <- 1
f <- function(d) {
attrs.a <- d
}
f(20)
print(attrs.a)
will print "1"
attrs <- {}
attrs.a <- 1
f <- function(d) {
attrs.a <<- d
}
f(20)
print(attrs.a)
Will print "20"