Submitting Batch Jobs

The main way to run jobs on the Grid is by submitting a script with the bsub command. The following examples highlight three different ways to submit your jobs:

Example 1: The script runscript.sh is run on the first available compute node that fits the resources requested on the command line:

bsub -q short -W 6:00 -R "rusage[mem=4000]" -M 4000 -hl runscript.sh

In this example, runscript.sh can contain a list of commands. If you only need to run a single command, you can put that command at the end of the bsub line instead:

bsub -q short -W 6:00 -R "rusage[mem=4000]" -M 4000 -hl cp myfile /to/some/path

Example 2: The commands specified in the runscript.sh file will be run on the first available compute node that fits the job resource requirements listed in the #BSUB directives in the script.

bsub < runscript.sh

Example 3: The commands specified in the runscript.sh file will be run on the first available compute node that fits the resource requirements listed on the command line, which supersede any #BSUB directives inside the script file.

bsub -q short -W 6:00 -R "rusage[mem=4000]" -M 4000 -hl < runscript.sh
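The three examples differ mainly in where the resource options come from. As a minimal sketch, the helper below prints (but does not run) a command line in the style of Example 3, so you can check the options before submitting. The function name show_bsub and its defaults are hypothetical, chosen to match the values used above.

```shell
#!/bin/bash
# Hypothetical helper: print, but do not run, a bsub command line.
# Usage: show_bsub SCRIPT [QUEUE] [WALLTIME] [MEM_MB]
# Defaults mirror the values used in the examples above.
show_bsub() {
    local script=$1 queue=${2:-short} walltime=${3:-6:00} mem=${4:-4000}
    printf 'bsub -q %s -W %s -R "rusage[mem=%s]" -M %s -hl < %s\n' \
        "$queue" "$walltime" "$mem" "$mem" "$script"
}

show_bsub runscript.sh
show_bsub runscript.sh long 24:00 8000
```

Using one value for both the memory reservation and the memory limit keeps the two options in sync, as in the examples above.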

 

A typical submission script, in this case using the hostname command to get the computer name, will look like:

#!/bin/bash
#
#BSUB -n 1                    # Number of cores
#BSUB -W 05                   # Runtime in [[D:]HH:]MM
#BSUB -q short                # Queue to submit to
#BSUB -R "rusage[mem=4000]"   # Memory reservation for the job
#BSUB -M 4000 -hl             # Upper memory limit
#BSUB -o hostname_%J.out      # File to which STDOUT will be written
#BSUB -e hostname_%J.err      # File to which STDERR will be written
#BSUB -B -N                   # Send email when job begins & ends/fails
#BSUB -u myemail@what.com     # Email address (guest users must set this option)

hostname

 

In general, a script is composed of three parts.

  • The #!/bin/bash line allows the script to be run as a bash shell script
  • The #BSUB lines are technically (bash shell) comments, but they set various parameters for the LSF scheduler
  • One or more commands to be executed

The #BSUB lines shown above set key parameters. Note: It is important to keep all #BSUB lines together and at the top of the script; do not add any bash code or variable settings until after the #BSUB lines. The LSF system copies many environment variables from your current session, including PATH, to the compute host where the script is run, and the job starts in your current working directory. As a result, you can specify files relative to your current location (e.g. ./project/myfiles/myfile.txt).
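The rule about keeping #BSUB lines at the top can be checked before submitting. The following is a minimal sketch, not an official tool: the function name check_bsub_order is hypothetical, and it only tests the convention described here (no #BSUB line after the first command).

```shell
#!/bin/bash
# Hypothetical helper: report any #BSUB line that appears after the
# first command in a script. Returns 0 if the order is fine, 1 if not.
check_bsub_order() {
    awk '
        /^#BSUB/        { if (seen_code) { printf "line %d: #BSUB after a command\n", NR; bad = 1 }; next }
        /^[ \t]*(#|$)/  { next }            # shebang, other comments, blank lines
                        { seen_code = 1 }   # anything else counts as a command
        END             { exit bad + 0 }
    ' "$1"
}

# Example use (left as a comment so nothing is submitted here):
#   check_bsub_order runscript.sh && bsub < runscript.sh
```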

#BSUB -n 1

This line sets the number of cores that you're requesting. Make sure that your tool can use multiple cores before requesting more than one. If this parameter is omitted, LSF assumes -n 1.

#BSUB -W 05

This line specifies the running time for the job in minutes. You can also use the convenient format [[D:]HH:]MM. If your job runs longer than the value you specify here, it will be cancelled. To ensure fair usage of the Grid, there is a limit on run time: 3 days for short batch jobs (bsub -q short) and 7 days for long batch jobs (bsub -q long). It is therefore in your best interest to specify the time as a routine habit. There is no penalty for over-requesting time as long as it is within the limits specified above.
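To see how a value in this format maps to minutes, here is a minimal sketch that assumes the [[D:]HH:]MM layout described above. The function name to_minutes is hypothetical and is not part of LSF.

```shell
#!/bin/bash
# Hypothetical helper: convert a run time written as [[D:]HH:]MM
# into a total number of minutes.
to_minutes() {
    local IFS=:
    local -a parts
    read -r -a parts <<< "$1"
    # The 10# prefix forces base 10 so values like "05" or "08" are not read as octal.
    case ${#parts[@]} in
        1) echo $(( 10#${parts[0]} )) ;;
        2) echo $(( 10#${parts[0]} * 60 + 10#${parts[1]} )) ;;
        3) echo $(( 10#${parts[0]} * 1440 + 10#${parts[1]} * 60 + 10#${parts[2]} )) ;;
        *) echo "unrecognized run time: $1" >&2; return 1 ;;
    esac
}

to_minutes 05        # 5 minutes, as in the sample script
to_minutes 6:00      # 6 hours
to_minutes 3:00:00   # 3 days
```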

#BSUB -q short or long

This line specifies the LSF queue under which the script or command will be run. The short and long queues are good for routine, non-SAS jobs that can take advantage of all parts of the Grid. Please use -q short for jobs that finish within 3 days and -q long for jobs that run more than 3 days but finish within 7 days.
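The queue choice follows directly from the expected run time. As a rough sketch, the hypothetical function pick_queue below takes a run time in minutes and prints the queue name according to the 3-day and 7-day limits stated above.

```shell
#!/bin/bash
# Hypothetical helper: choose a queue from the expected run time in minutes.
# Limits are taken from the text: short = up to 3 days, long = up to 7 days.
pick_queue() {
    local minutes=$1
    if (( minutes <= 3 * 24 * 60 )); then
        echo short
    elif (( minutes <= 7 * 24 * 60 )); then
        echo long
    else
        echo "run time exceeds the 7-day limit" >&2
        return 1
    fi
}

pick_queue 360     # a 6-hour job
pick_queue 7200    # a 5-day job
```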

#BSUB -R "rusage[mem=4000]"
#BSUB -M 4000 -hl

These lines set the memory reservation (-R) and the upper memory limit (-M) for the job, both in MB. The HBS LSF cluster does not require that you specify the amount of memory that you will be using for your job. However, if these parameters are omitted, the smallest amount is allocated, usually 100 MB, and chances are good that your job will be killed because it will likely exceed this amount. Moreover, accurate specifications allow jobs to be run with maximum efficiency on the system.

#BSUB -o hostname_%J.out

This line specifies the file to which standard output will be appended. If a relative file name is used, it will be relative to your current working directory. The %J in the filename will be substituted by the jobID at runtime. If this parameter is omitted, any output will be directed to the email that is sent out when the job finishes.

#BSUB -e hostname_%J.err

This line specifies the file to which standard error will be appended. LSF submission and processing errors will also appear in this file. The %J in the filename will be substituted by the jobID at runtime. If this parameter is omitted, any error output will be directed to the same location as the -o parameter, which will be either a file or an email.
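To predict the file names a job will produce, the %J substitution can be imitated locally. This is an illustrative sketch only: the function name expand_job_pattern is hypothetical, and the job ID 123456 is made up.

```shell
#!/bin/bash
# Hypothetical helper: show what a -o / -e file name pattern becomes
# once %J is replaced by a job ID.
expand_job_pattern() {
    local pattern=$1 job_id=$2
    printf '%s\n' "$pattern" | sed "s/%J/${job_id}/g"
}

expand_job_pattern hostname_%J.out 123456
expand_job_pattern hostname_%J.err 123456
```

Because each job gets its own ID, patterns like these keep the output of separate runs in separate files.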

#BSUB -B -N

Because jobs are processed in the "background" and can take some time to run, it is useful to send an email message when the job has started or finished. Note that guest users also need to use the -u option to specify an email address.

 

Last Updated 1/15/2019