# Parallel Python

Sections:

### Introduction

This page is intended to help you with running parallel python codes on the HBS compute grid or on your local multicore machine. The version of python on the Grid uses MKL to automatically parallelize some computations. Python started on the Grid via a wrapper script will use the number of CPUs you specify when starting your application. For example, starting Oython via python3 -n 5 from the command line will start Python with MKL correctly configured to use 5 cores. You can configure the number of cores MKL uses with the mkl-service package.

In addition to the implicit parallelization provided by MKL, you can explicitly parallelize your own analysis code using the 'multiprocessing' package. Instructions and examples are provided below. Note that this guild does NOT cover distributed computing, which distributes the workload over multiple machines.

Maximum Workers: Each compute node on the Grid has 32 physical cores; therefore (in theory) users should request no more than 32 cores. For short queue jobs, you may request the use of up to 16 cores, while the limit remains at 12 cores for long queue jobs. Nota Bene! The number of workers are dynamically determined by asking LSF (the scheduler) how many cores you have reserved via the LSB_MAX_NUM_PROCESSORS environment variable. DO NOT use multiprocessing.cpu_count() or similar; instead retrieve the values of this environment variable, e.g. os.getenv(LSB_MAX_NUM_PROCESSORS).

Example: Parallel Processing Basics

This sample code will provide a basic introduction to parallel processing. You will be shown how to set up your parallel pool with the appropriate number of workers, how to define which function is to be run in parallel, and how to gather the results.

For this example, we will calculate the square of a list of numbers in parallel.

____________________________________________________________import sysimport osimport multiprocessingimport time

def f(x):    pid=os.getpid()    print("{}:{}".format(pid,x*x))    return x*x

if __name__ == "__main__":        numList=range(1,100)              procs = [multiprocessing.Process(target=f, args=(x,)) for x in numList]

        for p in procs:               p.start()               p.join()

Outputs:

3728:110236:411508:98348:163012:2513244:362528:498440:645184:8111168:1006848:12113292:144........

### Example: Parallel Processing with Pools

if __name__ == '__main__':

      num_workers = os.getenv("LSB_MAX_NUM_PROCESSORS")
      numList=range(1,100)

    p = multiprocessing.Pool(num_workers)    result = p.map(f,numList)    p.close()    p.join()

Outputs:

14452:114452:414452:914452:1614452:2514452:36......14452:230414452:24012940:28092940:291614452:25002940:30252940:313614452:26016452:32492940:37212940:3844

____________________________________________________________

## Code with Job Submission Script

To run the above code (named test.py) using 5 CPU cores with the Grid's default wrapper scripts, in the terminal use the following command:

python -n 5 test.py

Submit the two examples code files above as follows:

bsub -q long -N -n 5 -W 10 -R ”rusage[mem=1024]” -M 1024 python python.py

bsub -q long -N -n 5 -W 10 -R ”rusage[mem=1024]” -M 1024 python \< test.py

If you wish to use a submission script to run this code and include LSF job option parameters, create a text file named code.sh containing the following:

____________________________________________________________ #!/bin/bash##BSUB -q long#BSUB -N#BSUB -W 10#BSUB -R" rusage[mem=1024]"#BSUB -W 1024

python test.py____________________________________________________________

Note that since the normal queue has been split, in the above example you will need to use "short" or "long."

Once your script is ready, you may run it with 5 cores by entering:

bsub -n 5 < ./code.sh

The < character is used here so that the #BSUB directives in the script file are parsed by LSF.

Updated on 8/9/18