What's the difference between lapply and do.call?
There is a function called Map that may be similar to map in other languages:
lapply returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.
do.call constructs and executes a function call from a name or a function and a list of arguments to be passed to it.
Map applies a function to the corresponding elements of given vectors... Map is a simple wrapper to mapply which does not attempt to simplify the result, similar to Common Lisp's mapcar (with arguments being recycled, however). Future versions may allow some control of the result type.
- Map is a wrapper around mapply
- lapply is a special case of mapply
- Therefore Map and lapply will be similar in many cases.
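To make the relationship concrete, here is a minimal sketch (the vectors xs and ys and the anonymous functions are made up for illustration). With a single input, Map and lapply should return the same list; the difference only shows once Map is given several vectors to walk through in parallel:

```r
# Hypothetical inputs, chosen only for illustration.
xs <- 1:3
ys <- 4:6

# One input: Map and lapply visit the same elements in the same order.
doubled_lapply <- lapply(xs, function(x) x * 2)
doubled_map <- Map(function(x) x * 2, xs)

# Several inputs: Map pairs up corresponding elements,
# calling the function as f(xs[1], ys[1]), f(xs[2], ys[2]), ...
sums <- Map(function(x, y) x + y, xs, ys)
```

Note that the argument order differs: lapply takes the data first and the function second, while Map takes the function first.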
For example, here is lapply:
lapply(iris, class)
$Sepal.Length
[1] "numeric"
$Sepal.Width
[1] "numeric"
$Petal.Length
[1] "numeric"
$Petal.Width
[1] "numeric"
$Species
[1] "factor"
And the same using Map:
Map(class, iris)
$Sepal.Length
[1] "numeric"
$Sepal.Width
[1] "numeric"
$Petal.Length
[1] "numeric"
$Petal.Width
[1] "numeric"
$Species
[1] "factor"
do.call takes a function and a list of arguments, and splats the list elements into a single call to that function. It is widely used, for example, to assemble lists into simpler structures (often with rbind or cbind).
For example:
x <- lapply(iris, class)
do.call(c, x)
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
"numeric" "numeric" "numeric" "numeric" "factor"
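The rbind use case mentioned above can be sketched as follows. The list pieces and its columns are hypothetical, standing in for results collected one at a time (for instance from an earlier lapply):

```r
# Hypothetical list of one-row data frames with identical columns.
pieces <- list(
  data.frame(id = 1, value = "a"),
  data.frame(id = 2, value = "b"),
  data.frame(id = 3, value = "c")
)

# Equivalent to writing rbind(pieces[[1]], pieces[[2]], pieces[[3]]) by hand,
# but works for a list of any length.
combined <- do.call(rbind, pieces)
```

The result is a single data frame with one row per list element.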
lapply applies a function over a list; do.call calls a function with a list of arguments. That looks like quite a difference to me...
To give an example with a list:
X <- list(1:3,4:6,7:9)
With lapply you get the mean of every element in the list like this:
> lapply(X,mean)
[[1]]
[1] 2
[[2]]
[1] 5
[[3]]
[1] 8
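do.call gives an error here: do.call(mean, X) amounts to mean(1:3, 4:6, 7:9), so the second vector is matched to the argument "trim", which mean expects to be a single number.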
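A small sketch of that failure, wrapped in tryCatch so it can be run without stopping the session. The exact error message may vary between R versions, so it is not reproduced here; the names attempt and overall are illustrative only:

```r
X <- list(1:3, 4:6, 7:9)

# do.call(mean, X) builds the call mean(1:3, 4:6, 7:9).
# The extra vectors are matched to mean's other arguments, so it fails.
attempt <- tryCatch(do.call(mean, X), error = function(e) e)
failed <- inherits(attempt, "error")

# If the goal is one overall mean, pass a single argument instead.
overall <- do.call(mean, list(unlist(X)))
```

Here overall is the mean of all nine values, which is a different result from the per-element means that lapply produces.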
On the other hand, rbind binds all arguments rowwise. So to bind X rowwise, you do:
> do.call(rbind,X)
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
If you used lapply, R would apply rbind to every element of the list separately, giving you this nonsense:
> lapply(X,rbind)
[[1]]
[,1] [,2] [,3]
[1,] 1 2 3
[[2]]
[,1] [,2] [,3]
[1,] 4 5 6
[[3]]
[,1] [,2] [,3]
[1,] 7 8 9
To have something like Map, you need ?mapply, which is something different altogether. To get, e.g., the mean of every element in X, but with a different trimming for each, you could use:
> mapply(mean,X,trim=c(0,0.5,0.1))
[1] 2 5 8
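For comparison, a short sketch of the same computation with Map, which (per its documentation) is mapply without the simplification step. The variable names are illustrative:

```r
X <- list(1:3, 4:6, 7:9)
trims <- c(0, 0.5, 0.1)

# mapply simplifies the results to a vector where it can.
simplified <- mapply(mean, X, trim = trims)

# Map keeps the results as a list, one element per input.
as_list <- Map(mean, X, trim = trims)
```

Both walk through X and trims in parallel, calling mean(X[[i]], trim = trims[i]) for each i; only the shape of the result differs.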
lapply is similar to map, do.call is not. lapply applies a function to all elements of a list; do.call calls a function where all the function arguments are in a list. So for an n-element list, lapply has n function calls, and do.call has just one function call. So do.call is quite different from lapply. Hope this clarifies your issue.
A code example:
do.call(sum, list(c(1, 2, 4, 1, 2), na.rm = TRUE))
and:
lapply(c(1, 2, 4, 1, 2), function(x) x + 1)
In the simplest terms:
lapply() applies a given function to each element in a list, so there will be several function calls.
do.call() applies a given function to the list as a whole, so there is only one function call.
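One way to see the "several calls versus one call" point is to count invocations. This is only an illustrative sketch: counting_sum and the counter calls are hypothetical helpers, not part of base R:

```r
# Hypothetical wrapper around sum() that records how often it is invoked.
calls <- 0
counting_sum <- function(...) {
  calls <<- calls + 1
  sum(...)
}

X <- list(1:3, 4:6, 7:9)

calls <- 0
per_element <- lapply(X, counting_sum)  # counting_sum(1:3), counting_sum(4:6), ...
lapply_calls <- calls

calls <- 0
total <- do.call(counting_sum, X)       # counting_sum(1:3, 4:6, 7:9)
do_call_calls <- calls
```

After running this, lapply_calls should be 3 and do_call_calls should be 1, matching the description above.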
The best way to learn is to play around with the function examples in the R documentation.