Software Modules

Introduction

This document provides an overview of the Lmod software modules system, which allows user-controlled flexible usage of various software titles and versions, setting default software versions, and ease-of-use for user-installed software. The software modules are an improvement to the HBSGrid and will be fully integrated into the command-line and GUI user experience by the end of summer 2020.

About Software Modules

On the HBSGrid cluster, we have a variety of software applications available, including different versions of the same title and applications that are incompatible with one another. We use software modules so that users can access different titles, avoid conflicts, and set default versions of each software application.

After logging in to the cluster, you have a very basic software environment that provides little or no direct access to research applications. One HBSGrid feature is the set of wrapper scripts: scripts found on the login nodes that provide a direct and seamless way to run an application on a compute node through a simple command or menu click. But these are hardcoded to specific titles and versions and to fixed CPU core and RAM settings, and provide little to no flexibility. With software modules, loading one or more modules for your applications makes those programs available to you. The modules update shell environment variables so that the system can find and use the applications of your choice. Modules replace the need for separate setup scripts, system-wide symlinks, and explicit export PATH= statements, all of which can cause problems for users.
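To make the "modules update shell environment variables" idea concrete, here is a minimal sketch in plain Bash of the kind of change a module makes when it is loaded and unloaded. It does not use Lmod at all. The functions demo_load and demo_unload, the DEMO_PATH variable, and the /opt/example-app/1.0/bin directory are hypothetical names chosen for illustration, and a stand-in variable is used so that your real PATH is left alone.

```shell
# Illustrative only: mimic the PATH edits a module performs, without Lmod.
# APP_BIN is a hypothetical install location, not a real HBSGrid path.
APP_BIN="/opt/example-app/1.0/bin"

# Stand-in for PATH so the demonstration cannot disturb the real one.
DEMO_PATH="/usr/bin:/bin"

demo_load() {
    # Prepend the application's bin directory so it is searched first.
    DEMO_PATH="${APP_BIN}:${DEMO_PATH}"
}

demo_unload() {
    # Rebuild DEMO_PATH, skipping every entry equal to APP_BIN.
    local entry new_path=""
    local IFS=':'
    for entry in $DEMO_PATH; do
        [ "$entry" = "$APP_BIN" ] && continue
        new_path="${new_path:+${new_path}:}${entry}"
    done
    DEMO_PATH="$new_path"
}

demo_load
PATH_AFTER_LOAD="$DEMO_PATH"
demo_unload
PATH_AFTER_UNLOAD="$DEMO_PATH"

echo "after load:   $PATH_AFTER_LOAD"
echo "after unload: $PATH_AFTER_UNLOAD"
```

A real module typically adjusts several variables, not only PATH, and reverses all of them cleanly on unload, which is what makes it safer than hand-written export statements.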

Our implementation of environment modules uses Lmod and the Bash shell. This page provides more details: an explanation of new features, the systematic naming and versioning convention, and condensed Lmod details with HBS-specific information.

How to Use Modules

Note: At this time, the software module system is opt-in and is only accessible from the command-line. It will be phased in completely with the wrapper/default submission scripts by the end of the summer. To start using the module system, issue the command touch ~/.lmod-yes in a terminal when connected to the HBSGrid, and re-login. This one-time action is all that is needed!
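If you are unsure whether the opt-in has taken effect, a small check such as the following may help. It is only a sketch: it assumes the marker file is ~/.lmod-yes as described above and that Lmod exposes module as a command or shell function in your session. The variable names are made up for this example.

```shell
# Check the two things the opt-in depends on: the marker file and the module command.
MARKER="${HOME}/.lmod-yes"

if [ -e "$MARKER" ]; then
    MARKER_STATUS="present"
else
    MARKER_STATUS="missing"        # create it with: touch ~/.lmod-yes
fi

if command -v module >/dev/null 2>&1; then
    MODULE_STATUS="available"
else
    MODULE_STATUS="not defined"    # log out and back in after creating the marker
fi

echo "opt-in marker: $MARKER_STATUS; module command: $MODULE_STATUS"
```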

You can load one or more modules using the module load [module(s) name] command. Some examples include:

$ module load matlab                                      # load default version
$ module load matlab/R2020a                               # load specific version
$ module load matlab/R2020a AMPL/20200501 gurobi/7.5.2    # multiple titles at once

Note: the default version for a title may not be the latest version of that title.

To see what modules you have loaded, use module list:

$ module list

Currently Loaded Modules:
  1) matlab/R2020a   2) AMPL/20200501   3) gurobi/7.5.2

To unload a module, use module unload [module name]:

$ module unload gurobi

To unload all modules currently loaded, use module purge:

$ module purge

If you are just beginning with modules, you can turn on 'novice' mode to get more information when using the commands:

$ module --novice

Note: After using modules for some time, you might be tempted to include module load directives in your ~/.bash_profile or ~/.bashrc login scripts. DON'T! Not only will this skew our metrics for which modules are being used, but it may also unintentionally introduce problems or incompatibilities as modules and software evolve over time. Instead, include module loads in your (Bash) cluster submission scripts or in working-environment setup scripts. This is good documentation and a best practice for research data management, especially if you share scripts with others or submit research for publication.
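As a rough illustration of keeping module loads in the script itself, the sketch below shows one possible shape for the top of a Bash job script. The module names are taken from the examples earlier on this page. The script name my_analysis.m is hypothetical, and starting with module purge is a common habit rather than an HBSGrid requirement. Scheduler directives are deliberately left out.

```shell
#!/bin/bash
# Sketch of a job script that documents its own software environment.
# Module names come from the examples above; my_analysis.m is hypothetical.

REQUIRED_MODULES="matlab/R2020a gurobi/7.5.2"

if command -v module >/dev/null 2>&1; then
    module purge                     # start from a clean environment
    module load $REQUIRED_MODULES    # intentionally unquoted: one argument per module
    module list                      # record what was loaded in the job log
    MODULES_STATUS="loaded"
else
    echo "module command not found; is Lmod enabled for this account?" >&2
    MODULES_STATUS="unavailable"
fi

# matlab -nodisplay -r "run('my_analysis.m'); exit"   # hypothetical workload
```

Anyone who receives this script can see exactly which titles and versions the work depends on, without needing a copy of your login files.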

Finding Modules

To search for specific software titles or modules use the command module spider [searchterm]:

$ module spider matlab

----------------------------------------------------------------------------
  matlab:
----------------------------------------------------------------------------
     Versions:
        matlab/R2017b
        matlab/R2018a
        matlab/R2019a
        matlab/R2020a

----------------------------------------------------------------------------
  For detailed information about a specific "matlab" module (including how to load the modules) use the module's full name.
  For example:

     $ module spider matlab/R2020a
----------------------------------------------------------------------------

If you find there are modules you need that are not available, you may either install the software for yourself, or let us know if there is software that you believe will benefit many users by submitting a software install request. Please give as many details as possible to facilitate the request.

Loading a Module May Reveal More Modules

The module system is based on a module hierarchy, in order to keep sets of modules that are (binary) incompatible separate from one another. For example, loading a compiler module (e.g. module load intel) will make available all the applications compiled with that version of the Intel compiler suite. The same is done for each compiler choice, each MPI implementation choice, and any suite of tools whose builds are incompatible with software compiled with other versions of the same compilers or with different compilers.

Bear in mind that the module avail command does not show all the different modules you could possibly use, but only the ones that you could load in the current environment, given the modules already loaded. Use the module spider command to find specific modules.

Note: At this time, RCS has few, if any, titles that fall into this category. But this may change as software titles evolve on the cluster.

Module Naming and Versioning

You can load specific versions by supplying the full module name:

$ module load matlab/R2020a

or let the module system automatically load the default version:

$ module load matlab

The default is determined either by alphanumeric sorting or, in some cases, by the version we have chosen as the default based on stability or commonality.

It is strongly recommended to load modules by specifying both the name and a specific version. This protects you if the default version changes, and specifying both the title and version number is proper documentation (and good research data management practice) for your research.
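One way to hold yourself to this recommendation is a small check in your own scripts that flags any module named without a version. The helper below is a sketch, not part of Lmod: is_pinned is a hypothetical function that only inspects the text of the name (title/version) and does not ask Lmod whether the module exists.

```shell
# Succeed only when a module spec has the form title/version.
is_pinned() {
    case "$1" in
        */*) [ -n "${1%%/*}" ] && [ -n "${1#*/}" ] ;;   # both sides of the slash non-empty
        *)   return 1 ;;
    esac
}

for spec in matlab matlab/R2020a gurobi/7.5.2; do
    if is_pinned "$spec"; then
        echo "ok: $spec"
    else
        echo "warning: $spec has no version and will follow the default"
    fi
done
```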

Version Changes When Loading Dependencies

In some cases, a given module may require other modules to be loaded. If you have any version of the required module loaded, the dependent module will usually consider the dependency satisfied, rather than force a specific different (often older) version of the required module to be loaded. This has the potential to cause issues if a specific version is required. Be sure to look at the module’s requirements via module display [modulename] before assuming this will work. Or, at the very least, test your code and environment setup.

Writing Your Own Modules

If you are installing software in your home folder or project folder, you can also write your own modules to make these titles available to yourself or the project group. Please see the Lmod instructions for writing your own modules (and RCS will let you know that you rock!).
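As a starting point, the sketch below writes a minimal Lua modulefile for a personally installed tool. Everything named here is hypothetical: the tool mytool, its version 1.0, and the install prefix. A scratch directory created with mktemp stands in for the modulefile location so the example is safe to run; in practice you would more likely choose a fixed folder in your home or project space. The modulefile functions used (help, whatis, prepend_path) and the module use command are standard Lmod features, but please consult the Lmod instructions for the full details.

```shell
# Scratch location for the demonstration; a fixed folder is more typical in practice.
MODULE_ROOT="$(mktemp -d)/modulefiles"
# Hypothetical install prefix of the personally installed tool.
TOOL_PREFIX="$HOME/sw/mytool/1.0"

mkdir -p "$MODULE_ROOT/mytool"

# Layout: one directory per title, one <version>.lua file per version.
cat > "$MODULE_ROOT/mytool/1.0.lua" <<EOF
help([[mytool 1.0, installed in a personal directory]])
whatis("Name: mytool")
whatis("Version: 1.0")
prepend_path("PATH", "$TOOL_PREFIX/bin")
EOF

echo "wrote $MODULE_ROOT/mytool/1.0.lua"

# Then, in a session with Lmod enabled:
#   module use "$MODULE_ROOT"
#   module load mytool/1.0
```

For a project group, placing the modulefile directory in shared project space and having each member run module use on that path is one way to make the same titles available to everyone.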

Further Reading

The Lmod group has excellent documentation on their ReadTheDocs Lmod site for everything you wish to know about Lmod. This includes:

  • an FAQ list
  • tips for user control of defaults and aliases
  • dependencies
  • and loading/saving default module sets.

And don't forget the handy Lmod Cheat Sheet.

We'd like to thank Harvard's FASRC for their module documentation, which we've adapted for our use.

Updated 7/7/20