Note: There are some changes to this document for Grid 2.5. Please see below and in-context. Please contact RCS with any questions.
This page is intended to help you run parallel Python code on the HBS compute grid or on your local multicore machine. The package highlighted here is Python's 'multiprocessing' package. This page will NOT cover distributed computing, which spreads the workload over multiple machines.
Maximum Workers: Each compute node on the Grid has 32 physical cores; therefore (in theory) users should request no more than 32 cores. However, due to current user resource limits, you should request no more than 12 cores. If you request more than 12 cores, your job will not run as it will sit in a
(Grid 2.5: For short queue jobs, you may request the use of up to 16 cores, while the limit remains at 12 cores for long queue jobs.)
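The limits above can be encoded in a small helper so that a script never asks for more workers than its queue allows. This is only a sketch: the names CORE_LIMITS and clamp_cores are hypothetical, and the numbers are simply the 16-core (short) and 12-core (long) limits quoted above.

```python
# Hypothetical helper: per-queue core limits as stated in the text above.
CORE_LIMITS = {"short": 16, "long": 12}


def clamp_cores(requested, queue="long"):
    """Return a core count that respects the limit of the given queue."""
    limit = CORE_LIMITS[queue]
    # Never go below 1 worker or above the queue's limit.
    return max(1, min(requested, limit))


if __name__ == "__main__":
    print(clamp_cores(32, "short"))
    print(clamp_cores(32, "long"))
```

A request for 32 cores is reduced to 16 on the short queue and to 12 on the long queue; smaller requests pass through unchanged.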
This sample code will provide a basic introduction to parallel processing. You will be shown how to set up your parallel pool with the appropriate number of workers, how to define which function is to be run in parallel, and how to gather the results.
For this example, we will calculate the square of a list of numbers in parallel.
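import multiprocessing
# f (the function to run in parallel) and numList (its inputs) must be defined above this point
if __name__ == "__main__":
    procs = [multiprocessing.Process(target=f, args=(x,)) for x in numList]
    for p in procs:
        p.start()  # launch one process per input value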
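For reference, here is a self-contained version of the Process-based approach. It is a minimal sketch rather than the original script: the function names square_worker and squares_with_processes are made up for illustration, and it assumes the results are gathered through a multiprocessing.Queue, since a Process does not return its target's value on its own.

```python
import multiprocessing


def square_worker(x, out_queue):
    """Compute x squared and send (input, result) back to the parent."""
    out_queue.put((x, x * x))


def squares_with_processes(numbers):
    """Start one process per number and return the squares in input order."""
    out_queue = multiprocessing.Queue()
    procs = [
        multiprocessing.Process(target=square_worker, args=(x, out_queue))
        for x in numbers
    ]
    for p in procs:
        p.start()
    # Read one result per process before joining, so that no child
    # is left waiting to flush its data into the queue.
    results = dict(out_queue.get() for _ in procs)
    for p in procs:
        p.join()
    return [results[x] for x in numbers]


if __name__ == "__main__":
    num_list = [1, 2, 3, 4, 5]
    print(squares_with_processes(num_list))
```

Starting one process per input is only practical for short lists; for longer lists a fixed-size pool of workers is usually the better fit.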
if __name__ == '__main__':
    num_workers = multiprocessing.cpu_count() - 1  # leave one core free
    p = multiprocessing.Pool(num_workers)  # create the pool of workers
    result = p.map(f, numList)  # apply f to every element in parallel
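A complete, runnable version of the Pool-based approach might look like the sketch below. Two things in it are assumptions and not part of the original example: the helper name worker_count is hypothetical, and the code assumes that LSF exports the number of allocated cores in the LSB_DJOB_NUMPROC environment variable. On a shared compute node, cpu_count() reports every core on the machine, not just the cores granted to your job, so sizing the pool from the allocation is the safer choice when that variable is available.

```python
import multiprocessing
import os


def square(x):
    """The function to run in parallel."""
    return x * x


def worker_count():
    """Pick a pool size: the LSF allocation if present, else cpu_count() - 1."""
    # Assumption: LSF sets LSB_DJOB_NUMPROC to the number of allocated cores.
    allocated = os.environ.get("LSB_DJOB_NUMPROC", "")
    if allocated.isdigit() and int(allocated) > 0:
        return int(allocated)
    # Fallback for a local machine: leave one core free, but keep at least 1.
    return max(1, multiprocessing.cpu_count() - 1)


if __name__ == "__main__":
    num_list = [1, 2, 3, 4, 5]
    # The with-block shuts the pool down once map() has returned.
    with multiprocessing.Pool(worker_count()) as pool:
        result = pool.map(square, num_list)
    print(result)
```

Because map() preserves input order, result lines up element for element with num_list.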
Code with Job Submission Script
To run the above code (named test.py) using 5 CPU cores with the Grid's default wrapper scripts, in the terminal use the following command:
python -n 5 test.py
Grid 2.5: Note that since the normal queue has been split, in the above two examples you will need to use "short" or "long" instead of "normal." Therefore for Grid 2.5 those two examples should look like the following:
bsub -q long -N -n 5 -W 10 -R "rusage[mem=100]" -M 100 python test.py
bsub -q long -N -n 5 -W 10 -R "rusage[mem=100]" -M 100 python \< test.py
If you wish to use a submission script to run this code and include LSF job option parameters, create a text file named code.sh containing the following:
#BSUB -q normal
#BSUB -W 10
#BSUB -R "rusage[mem=100]"
#BSUB -M 100
python test.py
(Grid 2.5: Note that since the normal queue has been split, in the above example you will need to use "short" or "long" instead of "normal", depending on the number of cores you will be requesting.)
Once your script is ready, you may run it with 5 cores by entering:
bsub -n 5 < ./code.sh
The < character is used here so that the #BSUB directives in the script file are parsed by LSF.
Please see Submitting Batch Jobs for more information.
Updated on 8/9/18