What is the difference between require() and library()?
What is the difference between require()
and library()
?
There's not much of one in everyday work.
However, according to the documentation for both functions (accessed by putting a ?
before the function name and hitting enter), require
is used inside functions, as it outputs a warning and continues if the package is not found, whereas library
will throw an error.
Another benefit of require()
is that it returns a logical value by default. TRUE
if the packages is loaded, FALSE
if it isn't.
> test <- library("abc")
Error in library("abc") : there is no package called 'abc'
> test
Error: object 'test' not found
> test <- require("abc")
Loading required package: abc
Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, :
there is no package called 'abc'
> test
[1] FALSE
So you can use require()
in constructions like the one below. Which mainly handy if you want to distribute your code to our R installation were packages might not be installed.
if(require("lme4")){
print("lme4 is loaded correctly")
} else {
print("trying to install lme4")
install.packages("lme4")
if(require(lme4)){
print("lme4 installed and loaded")
} else {
stop("could not install lme4")
}
}
In addition to the good advice already given, I would add this:
It is probably best to avoid using require()
unless you actually will be using the value it returns e.g in some error checking loop such as given by thierry.
In most other cases it is better to use library()
, because this will give an error message at package loading time if the package is not available. require()
will just fail without an error if the package is not there. This is the best time to find out if the package needs to be installed (or perhaps doesn't even exist because it it spelled wrong). Getting error feedback early and at the relevant time will avoid possible headaches with tracking down why later code fails when it attempts to use library routines
You can use require()
if you want to install packages if and only if necessary, such as:
if (!require(package, character.only=T, quietly=T)) {
install.packages(package)
library(package, character.only=T)
}
For multiple packages you can use
for (package in c('<package1>', '<package2>')) {
if (!require(package, character.only=T, quietly=T)) {
install.packages(package)
library(package, character.only=T)
}
}
Pro tips:
-
When used inside the script, you can avoid a dialog screen by specifying the
repos
parameter ofinstall.packages()
, such asinstall.packages(package, repos="http://cran.us.r-project.org")
You can wrap
require()
andlibrary()
insuppressPackageStartupMessages()
to, well, suppress package startup messages, and also use the parametersrequire(..., quietly=T, warn.conflicts=F)
if needed to keep the installs quiet.
Always use library
. Never use require
.
tl;dr: require
breaks one of the fundamental rules of robust software systems: fail early.
In a nutshell, this is because, when using require
, your code might yield different, erroneous results, without signalling an error. This is rare but not hypothetical! Consider this code, which yields different results depending on whether {dplyr} can be loaded:
require(dplyr)
x = data.frame(y = seq(100))
y = 1
filter(x, y == 1)
This can lead to subtly wrong results. Using library
instead of require
throws an error here, signalling clearly that something is wrong. This is good.
It also makes debugging all other failures more difficult: If you require
a package at the start of your script and use its exports in line 500, you’ll get an error message “object ‘foo’ not found” in line 500, rather than an error “there is no package called ‘bla’”.
The only acceptable use case of require
is when its return value is immediately checked, as some of the other answers show. This is a fairly common pattern but even in these cases it is better (and recommended, see below) to instead separate the existence check and the loading of the package. That is: use requireNamespace
instead of require
in these cases.
More technically, require
actually calls library
internally (if the package wasn’t already attached — require
thus performs a redundant check, because library
also checks whether the package was already loaded). Here’s a simplified implementation of require
to illustrate what it does:
require = function (package) {
already_attached = paste('package:', package) %in% search()
if (already_attached) return(TRUE)
maybe_error = try(library(package, character.only = TRUE))
success = ! inherits(maybe_error, 'try-error')
if (! success) cat("Failed")
success
}
Experienced R developers agree:
Yihui Xie, author of {knitr}, {bookdown} and many other packages says:
Ladies and gentlemen, I've said this before: require() is the wrong way to load an R package; use library() instead
Hadley Wickham, author of more popular R packages than anybody else, says
Use
library(x)
in data analysis scripts. […] You never need to userequire()
(requireNamespace()
is almost always better)