LSF Queues

Note: There are some changes to this document for Grid 2.5. Please see below and in context. Please contact RCS with any questions.

When submitting jobs to the scheduler, you must select the set of compute nodes appropriate for the type of work to be done. Queues group sets of computers with related or specific characteristics, along with additional attributes such as time or RAM limits. The HBS Grid has the following queue options:

interactive

This queue is dedicated to interactive (foreground / live) work and to testing code interactively before submitting in batch or scaling. Small numbers (1 to 5) of serial and parallel jobs with small resource requirements (RAM/cores) are permitted on this queue, and are subject to any system-wide resource limits.

normal

This queue is a general-purpose queue for all background, batch work and scaled jobs. Jobs submitted here are limited only by the available resources on the system and any system-wide resource limits.

sas_interactive

This queue is dedicated to interactive and code-testing work for SAS, before submitting in batch or scaling. Small numbers (1 to 5) of serial and parallel jobs with small resource requirements (RAM/cores) are permitted on this queue, and are subject to any system-wide resource limits.

sas_normal

This queue is a general-purpose queue for all SAS background, batch work and scaled jobs. Jobs submitted here are limited only by the available resources on the dedicated SAS nodes and any system-wide resource limits.

 

Grid 2.5 changes

Since the per-user resource limits have been lifted, unlimited run times for interactive sessions and batch jobs are no longer permitted. This table gives a summary of the changes: 

Queue            Type         Max Run Length  Max Cores/Job
long_int         interactive  3 days          4
short_int        interactive  1 day           12
sas_interactive  interactive  no limit        4
long             batch        7 days          12
short            batch        3 days          16
sas_normal       batch        no limit        4
unlimited        batch        no limit        4 (for now)
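
The limits in the table above can also be written down as data, which is handy when a script needs to check a request before submitting it. The following is a minimal Python sketch, not an official RCS tool: the names QueueLimit, QUEUE_LIMITS, and fits are hypothetical, and the values are simply copied from the table as of this update.

```python
from typing import NamedTuple, Optional


class QueueLimit(NamedTuple):
    kind: str                 # "interactive" or "batch"
    max_days: Optional[int]   # None means no run length limit
    max_cores: int            # maximum cores per job


# Values copied from the Grid 2.5 summary table (assumed current).
QUEUE_LIMITS = {
    "long_int": QueueLimit("interactive", 3, 4),
    "short_int": QueueLimit("interactive", 1, 12),
    "sas_interactive": QueueLimit("interactive", None, 4),
    "long": QueueLimit("batch", 7, 12),
    "short": QueueLimit("batch", 3, 16),
    "sas_normal": QueueLimit("batch", None, 4),
    "unlimited": QueueLimit("batch", None, 4),
}


def fits(queue: str, cores: int, days: float) -> bool:
    """Return True if a job with this core count and run length fits the queue."""
    limit = QUEUE_LIMITS[queue]
    if cores < 1 or cores > limit.max_cores:
        return False
    return limit.max_days is None or days <= limit.max_days
```

For example, fits("long", 12, 7) is True, while fits("short_int", 12, 2) is False because short_int jobs may run for at most 1 day.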

Details on the queues are as follows:

Interactive queues: Interactive queues are divided into long and short run lengths, based on the number of cores requested per job. Additionally, to ensure that everyone can get at least one interactive session, each user is limited to a maximum of 24 cores across a maximum of 3 sessions, and no single job may use more than 12 cores (the "interactive sessions limit").
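
The interactive sessions limit combines three numbers, so a small worked check may help. The sketch below is only an illustration in Python of the rule as stated above; the function name can_start_session is hypothetical, it assumes the limit is counted per user, and it says nothing about how the scheduler actually enforces the rule.

```python
MAX_SESSIONS = 3         # at most 3 interactive sessions at once
MAX_TOTAL_CORES = 24     # at most 24 cores across all sessions
MAX_CORES_PER_JOB = 12   # no single interactive job above 12 cores


def can_start_session(running_cores, requested_cores):
    """Check a new interactive request against the interactive sessions limit.

    running_cores: list with the core count of each session already running.
    requested_cores: cores requested for the new session.
    """
    if not 1 <= requested_cores <= MAX_CORES_PER_JOB:
        return False
    if len(running_cores) + 1 > MAX_SESSIONS:
        return False
    return sum(running_cores) + requested_cores <= MAX_TOTAL_CORES
```

With one 12-core session running, a second 12-core session still fits (24 cores, 2 sessions), but a third session of any size would not, since the total would exceed 24 cores.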

long_int

This queue is dedicated to interactive (foreground / live) work, to testing code interactively before submitting in batch or scaling, or to exploratory work. Serial and parallel jobs using 1 to 4 cores are permitted in this queue, can run a maximum of 3 days, and are subject to the interactive sessions limit.

short_int

This queue is also dedicated to interactive (foreground / live) work, to testing code interactively before submitting in batch or scaling, or to exploratory work. It is intended for parallel jobs using 5 to 12 cores, although any core count up to 12 is permitted. Jobs can run a maximum of 1 day and are subject to the interactive sessions limit.
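
As an illustration of choosing between long_int and short_int, the Python sketch below picks a queue from the requested cores and hours and builds a bsub command line. It is a sketch under assumptions, not the documented HBS Grid procedure: the helper name interactive_command is hypothetical, the flags used are the standard LSF ones (-q queue, -n cores, -W run limit as hours:minutes, -Is interactive shell), and the Grid may provide its own wrapper commands that should be preferred.

```python
def interactive_command(cores: int, hours: int) -> list:
    """Pick long_int or short_int and return a bsub argument list."""
    if cores < 1 or hours < 1:
        raise ValueError("cores and hours must be at least 1")
    if cores <= 4 and hours <= 72:        # long_int: up to 4 cores, 3 days
        queue = "long_int"
    elif cores <= 12 and hours <= 24:     # short_int: up to 12 cores, 1 day
        queue = "short_int"
    else:
        raise ValueError("request does not fit long_int or short_int")
    # Assumed invocation: standard LSF flags, with /bin/bash as the shell.
    return ["bsub", "-q", queue, "-n", str(cores),
            "-W", f"{hours}:00", "-Is", "/bin/bash"]


print(" ".join(interactive_command(8, 12)))
# prints: bsub -q short_int -n 8 -W 12:00 -Is /bin/bash
```

Requests of 4 cores or fewer are sent to long_int here even when they would also fit short_int, which follows the division by core count described above.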

sas_interactive

This queue is dedicated to interactive and code-testing work for SAS, before submitting in batch or scaling. Serial and parallel jobs using 1 to 4 cores with small resource requirements (RAM/cores) are permitted on this queue, and can run for an unlimited length of time.

Batch queues: Batch queues, like the interactive queues, are divided into long and short run lengths, based on the number of cores requested per job. Jobs are limited only by the available resources on the batch compute nodes, and the scheduler may limit dispatching jobs to run based on your Fairshare score (a priority score that might limit your work the more you compute, in order to allow others to run theirs as well). Jobs cannot exceed a maximum of 16 cores.

long

This queue is a general-purpose queue for all background, batch work and scaled jobs. Serial and parallel jobs using 1 to 12 cores are permitted in this queue, can run a maximum of 7 days, and are limited only by the available resources on the system and your Fairshare priority score.

short

This queue is a general-purpose queue for all background, batch work and scaled jobs. It is intended for serial and parallel jobs using 13 to 16 cores, although any core count up to 16 is permitted. Jobs can run a maximum of 3 days and are limited only by the available resources on the system and your Fairshare priority score.

sas_normal

This queue is a general purpose queue for all SAS background, batch work and scaled jobs. Jobs submitted here can use 1 to 4 cores, are only limited by the available resources on the dedicated SAS nodes, and can run for an unlimited length of time.

unlimited

This queue is for single or parallel work up to 4 cores per job with no run time limit (Note: the max cores/job may increase later during the beta-testing period). Since there are very few compute nodes for this type of work, your job will be scheduled only when room is available and the previous jobs have finished. We highly recommend that you do not use this queue if possible.
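
The choice among the general batch queues can be summarized as an ordered check, with unlimited as the last resort. The Python sketch below is one possible reading of the limits described above, not an official selection rule; the function name select_batch_queue is hypothetical, and SAS work (sas_normal) is deliberately left out.

```python
def select_batch_queue(cores: int, days: float) -> str:
    """Return the batch queue that fits a job, preferring long, then short."""
    if cores < 1 or cores > 16:
        raise ValueError("batch jobs must use 1 to 16 cores")
    if days <= 0:
        raise ValueError("run length must be positive")
    if cores <= 12 and days <= 7:    # long: up to 12 cores, 7 days
        return "long"
    if days <= 3:                    # short: up to 16 cores, 3 days
        return "short"
    if cores <= 4:                   # unlimited: up to 4 cores, discouraged
        return "unlimited"
    raise ValueError("no batch queue fits this combination of cores and days")
```

A 16-core, 3-day job maps to short, and a 4-core, 30-day job falls through to unlimited. A job such as 8 cores for 10 days fits no queue, so it would need to be split or shortened.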

Updated 9/10/18