Parallel MATLAB


The following has been adapted from FAS RC’s Parallel MATLAB page. As the Odyssey cluster uses a different workload manager, the code has been adapted to LSF, the workload manager on the HBS compute grid.

This page is intended to help you with running parallel MATLAB codes on the HBS compute grid. Parallel processing with MATLAB is performed with the help of two products, Parallel Computing Toolbox (PCT) and Distributed Computing Server (DCS). HBS is licensed only for use of the PCT.

Supported Versions: On the HBS compute grid, the following versions of MATLAB are available with the Parallel Computing Toolbox (PCT):

MATLAB Version          Executable name
MATLAB 2018a 64-bit     matlab

For more information about running older versions of MATLAB, please visit the Grid website on running older versions of software.

Maximum Workers: PCT uses workers, which are MATLAB computational engines, to execute parallelized applications and their parts on CPU cores. Each compute node on the Grid has 32 physical cores; therefore (in theory) users should request no more than 32 cores when using MATLAB with PCT. However, due to current user resource limits, you should request no more than 12 (interactive) or 16 (batch) cores. If you request more than this, your job will not run; it will remain in the PEND state.

Code Example

The following simple code illustrates the use of PCT to calculate pi via a parallel Monte Carlo method. This example also illustrates the use of parfor (parallel for) loops. In this scheme, suitable for loops can simply be replaced by parfor loops without other changes to the code:


hLog = fopen( [mfilename, '.log'] , 'w' ); % Create log file

% Launch parallel pool with as many workers as requested
hPar = parpool( 'local' , str2double( getenv('LSB_MAX_NUM_PROCESSORS') ) );

% Report number of workers
fprintf( hLog , 'Number of workers = %d\n' , hPar.NumWorkers )

% Main code
R = 1; darts = 1e7; count = 0; % Prepare settings
tic; % Start timer

parfor i = 1:darts
    x = R * rand(1);
    y = R * rand(1);
    if x^2 + y^2 <= R^2
        count = count + 1; % Dart landed inside the quarter circle
    end
end

myPI = 4 * count / darts;
T = toc; % Stop timer

% Log results
fprintf( hLog , 'The computed value of pi is %2.7f\n' , myPI );
fprintf( hLog , 'Executed in %8.2f seconds\n' , T );

% Shut down pool, close log file, and exit
delete( gcp );
fclose( hLog );
exit;


Code with Job Submission Script

To run the above code (named code.m) using 5 CPU cores with the Grid's default wrapper scripts, in the terminal use the following command:

matlab -n5 code.m

Using custom job submission commands, the following will similarly submit the job with 5 cores for 10 minutes with 100 MB of memory:

bsub -q short -N -n 5 -W 10 -R "rusage[mem=100]" -M 100 -hl matlab -r "run('code.m');"

This creates a log file called code.log because of the first line in our MATLAB code, hLog = fopen( [mfilename, '.log'] , 'w' );

If you do not use MATLAB's mfilename function, then you may also enter the following command to have output sent to an unnamed output file:

bsub -q short -N -n 5 -W 10 -R "rusage[mem=100]" -M 100 -hl matlab \< code.m

The < is escaped here so that it becomes part of the MATLAB command, not the bsub command.

If you wish to use a submission script to run this code and include LSF job option parameters, create a text file named containing the following:

#BSUB -q short
#BSUB -W 10
#BSUB -R "rusage[mem=100]"
#BSUB -M 100
matlab -r "run('code.m');"

Once your script is ready, you may run it with 5 cores by entering:

bsub -n 5 < ./

The < character is used here so that the #BSUB directives in the script file are parsed by LSF.

Please see Submitting Batch Jobs for more information.

Explanation of Parallel Code

Starting and stopping the parallel pool

The parpool function is used to initiate the parallel pool. To dynamically set the number of workers to the number of CPU cores you requested, we ask MATLAB to query the LSF environment variable LSB_MAX_NUM_PROCESSORS:

hPar = parpool( 'local', str2double( getenv( 'LSB_MAX_NUM_PROCESSORS' ) ) );
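
When the same script is run outside an LSF job (for example on a desktop), LSB_MAX_NUM_PROCESSORS is not set and the worker count cannot be read. The following is a minimal sketch of one way to guard against that. The fallback to a single worker and the variable names nRequested and nWorkers are assumptions made for illustration, not Grid policy:

```matlab
% Read the core count that LSF assigned to this job.
nRequested = str2double( getenv( 'LSB_MAX_NUM_PROCESSORS' ) );

% getenv returns an empty string when the variable is not set, and
% str2double converts that to NaN. Fall back to one worker in that case.
if isnan( nRequested ) || nRequested < 1
    nWorkers = 1;                 % assumed fallback, for illustration only
else
    nWorkers = floor( nRequested );
end

fprintf( 'Requesting %d worker(s)\n' , nWorkers );
hPar = parpool( 'local' , nWorkers );
```

With this guard, the pool size still follows the bsub -n value on the Grid, while the script remains usable for quick tests elsewhere.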

Once the parallelized portion of your code has been run, you should explicitly close the parallel pool and release the workers as follows:

delete(gcp); % Shutdown parallel pool
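
If the code stops with an error before reaching that line, the pool is left open. One possible way to make the shutdown more robust is sketched below using MATLAB's onCleanup. The worker count of 2 and the names cleanupObj and results are chosen only for illustration. Passing 'nocreate' to gcp avoids starting a new pool when none is running:

```matlab
hPar = parpool( 'local' , 2 );   % 2 workers, chosen only for illustration

% Delete the pool when cleanupObj is cleared or goes out of scope,
% including when an error interrupts the function that owns it.
cleanupObj = onCleanup( @() delete( gcp( 'nocreate' ) ) );

results = zeros( 1 , 4 );
parfor i = 1:4
    results(i) = i^2;            % independent work for each iteration
end

clear cleanupObj;                % triggers the cleanup and shuts the pool down
```

This pattern is most useful inside a function, where cleanupObj is destroyed automatically when the function returns or fails.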

Parallelized portion of the code

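The portion of the code that takes advantage of multiple CPUs is the parfor loop. A parfor loop behaves similarly to a for loop, though various iterations of the loop are passed to different workers. It is therefore important that iterations do not rely on the output of any other iteration in the same loop. The counter count is the one permitted exception here: because it is only ever updated as count = count + 1, MATLAB treats it as a reduction variable and combines the workers' partial counts when the loop finishes.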

parfor i = 1:darts
  x = R * rand(1);
  y = R * rand(1);
  if x^2 + y^2 <= R^2
    count = count + 1;
  end
end
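
To make the independence requirement more concrete, the sketch below contrasts three loop patterns: independent iterations, a reduction, and a loop whose iterations depend on one another. The variable names (squares, total, running) and the problem size are illustrative assumptions and are not part of the pi example:

```matlab
n = 10;

% Independent iterations: squares(i) depends only on i, so the loop
% can be a parfor. squares is a "sliced" output variable.
squares = zeros( 1 , n );
parfor i = 1:n
    squares(i) = i^2;
end

% Reduction: total is only ever updated as total = total + <expr>,
% so MATLAB can combine the workers' partial sums in any order.
total = 0;
parfor i = 1:n
    total = total + i^2;
end

% Dependent iterations: each element needs the previous one, so this
% loop must remain an ordinary for loop.
running = zeros( 1 , n );
running(1) = 1;
for i = 2:n
    running(i) = running(i-1) + i^2;
end

fprintf( 'total = %d, running(end) = %d\n' , total , running(end) );
```

The first two loops are safe to parallelize; converting the third to parfor would be rejected by MATLAB because running(i-1) is produced by a different iteration.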


Updated on 1/15/19