Parallel R



R, like Python and Perl, is single-threaded by default, so out of the box it cannot make use of multiple CPU cores on a given machine. However, several R packages enable parallel processing for programming patterns that naturally lend themselves to parallelization. We'll discuss these below.

Frameworks Needed

R can make use of several frameworks to enable parallelization:

  • parallel is perhaps the most developed and most mature, and is based on work from both the multicore and snow packages (the former of which has been removed from CRAN, as it has been folded completely into parallel; the latter of which can be used for parallelization on one machine or across multiple machines). This package can be used through several different approaches, each with OS sticking points:
    • use of the snow base, through the cluster*() and par*apply() families of functions: clusterApply(), parApply(), and many others.
    • use of the multicore base, which enables parallelization through mc-prefixed versions of the apply() functions: mclapply(), mcmapply(), mcMap(). This, however, does not work on Windows machines due to how worker processes are generated (see Analogues of apply functions and Portability in the parallel package vignette for more information).

  • foreach enables parallelization through the foreach() extension of a for loop, a registered backend package such as doMC or doParallel, and the %dopar% operator. It can use either the multicore framework, with the same OS limitation mentioned above, or the snow framework.

Code Examples

The following code examples were adapted from the Texas Advanced Computing Center (TACC) seminar R for High Performance Computing given through XSEDE.

Below are a number of very simple examples to highlight how the frameworks can be included in your code. Nota bene! The number of workers is determined dynamically by asking LSF (the scheduler) how many cores you have reserved. DO NOT use detectCores() or anything similar, as this will oversubscribe the node, clobbering your code as well as any other code running on the same compute node.
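A minimal sketch of reading that reservation in R (the unset fallback of '1' is our assumption, handy when testing interactively, where LSF has not set the variable):

```r
# Number of cores LSF reserved for this job; fall back to 1 if the
# variable is unset (e.g. when testing interactively off the grid)
n_workers <- as.integer(Sys.getenv('LSB_MAX_NUM_PROCESSORS', unset = '1'))
```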

All of the following examples will use this example function:

myProc <- function(size=10000000) {
  # Load a large vector
  vec <- rnorm(size)

  # Now sum the vec values and return the result
  sum(vec)
}

Parallel + mclapply

The parallel package provides several methods for parallelizing your work. Perhaps the easiest to get started with is mclapply(). This function uses 'forking' to spin up additional R processes and is thus only available on Linux-like resources. mclapply() can only be used for single-node parallelism.

# library(parallel): mclapply

result <- mclapply(1:10, function(i) myProc(), mc.cores=as.integer(Sys.getenv('LSB_MAX_NUM_PROCESSORS')))

Parallel + snow

snow uses either MPI or socket-based connections to achieve single-node parallelism, and MPI to achieve parallelism across multiple compute nodes. MPI/remote-node usage is not supported on the HBS compute grid. There is a bit more work involved in setting up the snow 'cluster'.

Most of the snow functionality has been folded into the parallel package. However, more fine-grained control can be achieved by using methods from the snow package directly.
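For illustration, here is a sketch of driving snow directly rather than through parallel. This assumes the snow package is installed on the grid; the unset fallback of '1' is our assumption for interactive testing, and myProc() is the example function defined above, repeated here so the snippet is self-contained:

```r
library(snow)

# myProc() as defined earlier in this document
myProc <- function(size=10000000) {
  vec <- rnorm(size)
  sum(vec)
}

# One worker per reserved core; fall back to 1 if LSF has not set the variable
n <- as.integer(Sys.getenv('LSB_MAX_NUM_PROCESSORS', unset = '1'))

# snow's own socket-cluster constructor: name the host once per worker
cluster <- makeSOCKcluster(rep('localhost', n))

# as with parallel, functions must be exported to the workers explicitly
clusterExport(cluster, list('myProc'))

result <- clusterApply(cluster, 1:10, function(i) myProc())

stopCluster(cluster)
```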

# library(parallel): snow single-node parallel cluster

# wraps the makeSOCKcluster() function and launches the specified number 
#   of R processes on the local machine
cluster <- makeCluster(as.integer(Sys.getenv('LSB_MAX_NUM_PROCESSORS')))
# one must explicitly make vars/functions available in the sub-processes.
clusterExport(cluster, list('myProc'))

# now run the tasks across the cluster
result <- clusterApply(cluster, 1:10, function(i) myProc())

# shut down the worker processes when finished
stopCluster(cluster)


Parallel + foreach

foreach is a package that makes parallelization easier: it uses the other parallel frameworks under the hood but hides some of their complexity. Again, since this example uses the multicore framework, it will not work correctly on Windows OSes.

# library(foreach); library(doMC): foreach + multicore

registerDoMC(cores=as.integer(Sys.getenv('LSB_MAX_NUM_PROCESSORS')))

result <- foreach(i=1:10, .combine=c) %dopar% {
  myProc()
}

Parallel + foreach + snow

Notice that foreach handles the clusterExport() for us:

# library(foreach); library(doParallel): foreach + snow

cluster <- makeCluster(as.integer(Sys.getenv('LSB_MAX_NUM_PROCESSORS')))
registerDoParallel(cluster)

result <- foreach(i=1:10, .combine=c) %dopar% {
  myProc()
}

stopCluster(cluster)

Scheduler Submission (Job) Script

If submitted via the terminal, the following batch submission command will submit your R code to the compute grid and will allocate 4 CPU cores for the work (as well as 5 GB of RAM, with a run-time limit of 12 hrs). If your code is written as above, using LSB_MAX_NUM_PROCESSORS, it will detect that 4 cores have been allocated. (Please note that the long command may wrap on the page, but should be submitted as one line.)

bsub -n 4 -q long -W 12:00 -R "rusage[mem=5000]" -M 5000 R CMD BATCH my_parallel_code.R

Updated on 1/15/19