The difference between doMC and doParallel in R
The doParallel package is a merger of doSNOW and doMC, much as parallel is a merger of snow and multicore. Although doParallel has all the features of doMC, Rich Calaway of Revolution Analytics told me that they wanted to keep doMC around because it was more efficient in certain circumstances, even though doMC now uses parallel just like doParallel. I haven't personally run any benchmarks to determine if and when there is a significant difference.
I tend to use doMC on a Linux or Mac OS X computer, doParallel on a Windows computer, and doMPI on a Linux cluster, but doParallel does work on all of those platforms.
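To make that habit concrete, here is a minimal sketch of picking a backend by platform. The helper name register_backend is hypothetical (it is not part of any package), and the sketch assumes doParallel is installed everywhere and doMC only where it is available; it falls back to doParallel when doMC is missing.

```r
library(foreach)

# Hypothetical helper, not part of doMC or doParallel: pick a backend by platform.
register_backend <- function(cores = 2) {
  use_domc <- .Platform$OS.type != "windows" &&
    requireNamespace("doMC", quietly = TRUE)
  if (use_domc) {
    # Linux / Mac OS X with doMC installed
    doMC::registerDoMC(cores = cores)
  } else {
    # Windows, or doMC not installed
    doParallel::registerDoParallel(cores = cores)
  }
  invisible(getDoParName())
}

register_backend(2)

# The foreach loop itself is identical whichever backend was registered.
squares <- foreach(i = 1:4, .combine = c) %dopar% i^2
```

Since the loop code does not change with the backend, swapping doMC for doParallel (or the reverse) only touches the registration call.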
As for the different registration methods, if you execute:
registerDoParallel(cores=3)
on a Windows machine, it will create a cluster object implicitly for later use with clusterApplyLB, whereas on Linux and Mac OS X, no cluster object is created or used. The number of cores is simply remembered and used as the value of the mc.cores argument later when calling mclapply.
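As a rough illustration of the implicit form, the sketch below registers three cores and then asks foreach what it ended up with. The variable names are only for illustration, and the backend name reported by getDoParName() is expected to differ between Windows and Linux or Mac OS X, so no particular value is assumed here.

```r
library(doParallel)  # also attaches foreach and parallel

# Implicit registration: no cluster object is created by the caller.
registerDoParallel(cores = 3)

# Inspect the registration; the backend name depends on the platform.
backend <- getDoParName()
workers <- getDoParWorkers()

# Run a small loop through whichever mechanism was registered.
roots <- foreach(i = 1:6, .combine = c) %dopar% sqrt(i)
```

Printing backend on each of your machines is a quick way to see which of the two code paths described above is actually in use.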
If you execute:
cl <- makeCluster(3)
registerDoParallel(cl)
then the registered cluster object will be used with clusterApplyLB regardless of the platform. You are correct that in this case, it is your responsibility to shut down the cluster object since you created it, whereas the implicit cluster object is shut down automatically.
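One way to handle that responsibility is to wrap the work in tryCatch so the cluster is stopped even if the loop fails. This is only a sketch of one cleanup pattern, not the only option; inside a function, on.exit(stopCluster(cl)) would serve the same purpose.

```r
library(doParallel)  # also attaches foreach and parallel

# Explicit registration: we create the cluster, so we own its lifetime.
cl <- makeCluster(3)
registerDoParallel(cl)

result <- tryCatch(
  # Parentheses matter: %dopar% binds more tightly than *.
  foreach(i = 1:6, .combine = c) %dopar% (i * 2),
  finally = stopCluster(cl)  # runs whether the loop succeeds or errors
)

# The registered backend now points at a stopped cluster, so fall back
# to sequential execution until another backend is registered.
registerDoSEQ()
```

Calling registerDoSEQ() afterwards is optional, but it avoids a later %dopar% trying to use workers that no longer exist.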