After choosing resources, picking the software application and environment is the next important step in doing work on the HBSGrid cluster. This is true both for real-time, interactive work in applications and for batch (background) jobs.
Because of the diversity of projects currently in flight on the HBSGrid cluster, and because the cluster is not a single computer on which you install software directly, a variety of applications and libraries are supported on the cluster. Technically, it is impossible to include everything at once in every user’s environment. For this reason, we now offer software modules (Lmod, https://www.tacc.utexas.edu/research-development/tacc-projects/lmod) to selectively expose, or make available, an application and all its supporting binaries, while also ensuring that incompatible programs are not in the mix.
The primer below will help you use software modules when submitting jobs; full documentation is on our Software Modules page.
For the time being, using software modules is opt-in. To opt in, run the command touch ~/.lmod-yes in the terminal while connected to the HBSGrid, then log out and log in again. You should then be ready to use software modules.
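The opt-in steps can be sketched as a short shell snippet. This is an illustration only: it assumes that, once the opt-in takes effect, the shell exposes a command or function named module, and the status messages are invented for the example.

```shell
# Step 1: create the opt-in marker file named in the text above.
touch "$HOME/.lmod-yes"

# Step 2: run this part after logging out and back in.
# 'type' reports whether the shell knows a command or function called module
# (assumption: that is how the module command is provided after opting in).
if type module >/dev/null 2>&1; then
    lmod_status="module command found"
else
    lmod_status="module command not found; log out, log in again, and retry"
fi
echo "$lmod_status"
```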
NoMachine GUI Application menus
If you are currently using the Application menus in NoMachine, software modules do not apply to you at this time. At some future point, we will incorporate software modules into these menus, giving you the flexibility to choose an application and which version to run.
Wrapper Scripts and Custom LSF Job Submissions in the terminal
If you work or submit jobs primarily via command line, especially for batch jobs, software modules will now play a role in how you access software via your job submissions.
There are two ways to use software modules for your job submissions:
- Load software modules before submitting jobs, as the software environment is inherited by the job
- Load software modules in your job scripts
#1 is great when using interactive shells for development, one-offs, and exploratory work. It is also the only method that works with command-line wrapper scripts, as wrappers handle job submissions for you with pre-set defaults and commands. Note: module integration with wrapper scripts will not be fully functional until September at the earliest. It may work for some wrappers (e.g. MATLAB), but not others (e.g. R, Python). Test carefully before using.
#2 is preferred for batch jobs and when using submission scripts for your work: including the module load command inside the job script documents the software title and version, and is a good research data management practice. But it can be rather cumbersome to always write job submission scripts for single-command jobs (e.g. bsub... Rstudio or bsub... python3 myscript.py).
Note: we highly discourage adding module load commands to your ~/.bashrc login scripts. This not only skews our module usage metrics for software management, but also might introduce problems in your software environment that may be hard to troubleshoot, especially as software environments evolve over time (and in rare cases may even complicate your ability to log in).
No matter the approach, some basics will help demystify the process:
The module avail command shows you what applications and versions are available (D indicates default versions):
[jharvard@rhrcscli2:~]$ module avail

----------------------- /usr/local/app/lmod/modulefiles ------------------------
   AMPL/20200501         mathematica/11.3        spyder/3.6.5 (D)
   R/3.5.1               mathematica/12.1 (D)    stata-mp4/15
   Rscript/3.5.1         matlab/R2017b           stata-mp4/16 (D)
   Rstudio/3.5.1         matlab/R2018a           stata-mp8/15
   SAS/9.1               matlab/R2019a           stata-mp8/16 (D)
   anaconda/2_5.1        matlab/R2020a (D)       stata-se/15
   anaconda/3_5.1 (D)    openoffice/4.1.6        stata-se/16 (D)
   conda-R/5.1           python/2.7.14           stattransfer/14
   gitkraken/v1.8.4      python/3.6.5 (D)        stattransfer/15 (D)
   gurobi/7.5.2          spyder/2.7.14

  Where:
   D:  Default Module

Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
The module load command enables a particular application in the environment by adding the application to your PATH variable, changing other environment variables, and/or pulling in dependencies. Module names are written exactly as they appear in the module avail listing. For example, to enable the R2019a version of MATLAB:
module load matlab/R2019a
or to use the default version (which here is R2020a):
module load matlab
Note: We highly recommend that you use the full title/version notation, as defaults will change over time.
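One way to make the full title/version habit hard to forget is a small shell wrapper that refuses a bare title. The sketch below is an illustration only: load_pinned is a hypothetical helper name, not part of Lmod or the HBSGrid setup, and it only checks that its argument looks like title/version before calling module load.

```shell
# load_pinned: hypothetical helper (not part of Lmod) that only loads
# modules written in full title/version notation.
load_pinned() {
    case "$1" in
        ?*/?*)
            # Looks like title/version, so hand it to the real module command.
            module load "$1"
            ;;
        *)
            echo "load_pinned: '$1' has no version; use title/version, e.g. matlab/R2020a" >&2
            return 1
            ;;
    esac
}
```

With this defined in a shell session or at the top of a job script, load_pinned matlab/R2020a behaves like module load matlab/R2020a, while load_pinned matlab stops with an error instead of picking up whatever the current default happens to be.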
Once a module is loaded in your session or inside your script, it is available just as though it had always been there:
[jharvard@rhrcscli2:~]$ module load matlab
[jharvard@rhrcscli2:~]$ which matlab
/usr/local/app/matlab/matlab_2020a/bin/matlab
Full details on module load and other module commands are on our Software Modules page.
#1 Loading software modules before submitting jobs
As in the examples above, if you use the module commands (module load, etc.) in your terminal shell, this changes the environment for that shell and for any jobs submitted from it. For example, for an interactive job:
module load matlab/R2020a
bsub -q short_int -Is ... matlab -nodisplay -nojvm -nosplash
or a background job:
module load matlab/R2020a
bsub -q short ... matlab -nodisplay -nojvm -nosplash -r my_matlab_script
In both cases, your job will inherit the environment (in particular, the execution PATH), and your preferred version of MATLAB will run.
This can be good while developing scripts or code, for running interactive application sessions (GUI or terminal) from the terminal, or for general putzing around. If you close that terminal window/session, these settings are lost, and you will have to perform the module load again to set up the environment.
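Because the job inherits whatever the submitting shell has loaded, it can help to confirm that the module is actually loaded in the current session before calling bsub, especially in a fresh terminal. A minimal sketch, assuming Lmod's usual behavior of recording loaded modules in the colon-separated LOADEDMODULES environment variable; is_loaded is a hypothetical helper name, not an Lmod command.

```shell
# is_loaded: hypothetical helper that succeeds if the given title/version
# appears in LOADEDMODULES (assumed to be a colon-separated list, as Lmod
# normally maintains it).
is_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example check before submitting (commented out so the sketch has no side effects):
# is_loaded matlab/R2020a || module load matlab/R2020a
# bsub -q short ... matlab -nodisplay -nojvm -nosplash -r my_matlab_script
```

The check matches the exact title/version string, so a session with matlab/R2019a loaded does not pass as matlab/R2020a.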
#2 Loading software modules in your job scripts
Loading modules in a terminal/shell sets up the environment solely for the life of that session: until the shell exits, you close the window, or the like. A better method for loading software is to use submission scripts to run jobs and to include the module load command(s) in the script. This both guarantees that the correct program and version are loaded when using the script now and in the future, and also serves as documentation for your work.
As a brief example, we create the file my_submit_job.sh to run a job that executes a MATLAB script:
------ my_submit_job.sh -------
#!/bin/bash
module load matlab/R2020a
matlab -nosplash -nodesktop -nojvm -r my_matlab_script
---------------------------
Now we submit this script to the scheduler:
bsub -q short bash my_submit_job.sh
Once the job runs (with the automatic defaults of 1 core and 5 GB RAM), the instructions in my_submit_job.sh are run, and the module load will happen as the first command during the job.
For this last example, more experienced users might note:
- bash my_submit_job.sh can be changed to ./my_submit_job.sh if one makes the script executable with chmod ug+x my_submit_job.sh.
- bsub command line options (e.g. RAM options, -J job name to hide script contents, -We estimated run time for backfill scheduling) can be included at the top of the submission script as #BSUB directives, and these will be parsed when submitting the script with the bsub < scriptname format.
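The two notes above can be combined into one sketch. The snippet below writes a version of my_submit_job.sh with #BSUB directives at the top, then checks it for shell syntax errors. The job name, the 30-minute run-time estimate, and the output file name are illustrative assumptions, and -n and -o are standard bsub options that are not discussed in this primer. RAM options are left out because their units depend on how the cluster is configured.

```shell
# Write a submission script with #BSUB directives at the top.
# Job name, run-time estimate, and output file name are illustrative.
cat > my_submit_job.sh <<'EOF'
#!/bin/bash
#BSUB -q short
#BSUB -J matlab_example
#BSUB -We 30
#BSUB -n 1
#BSUB -o matlab_example.%J.out

module load matlab/R2020a
matlab -nosplash -nodesktop -nojvm -r my_matlab_script
EOF

# Check the script for shell syntax errors without running it.
bash -n my_submit_job.sh && echo "syntax OK"

# Submit so that LSF parses the #BSUB lines (run this on the cluster):
# bsub < my_submit_job.sh
```

Keeping the options inside the script means the queue, job name, and software version travel together with the commands, which fits the documentation argument made for approach #2.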
More details can be found on the Submitting Jobs page, including basic instructions and also more complete examples on writing job submission scripts.