PLEASE NOTE that installing conda or miniconda in your home directory is no longer necessary on Grid 2.5. Please contact RCS if you have any questions about the versions of R and Python that are installed on Grid 2.5. In addition, if you have been using a conda or miniconda installation in your home directory, we recommend that you transition to the central installations of R, Python2, and Python3.
Compute Grid 2.5 uses Anaconda (the full distribution) and Miniconda (a slimmed-down version) to provide feature-rich environments for Python and R, respectively. This page briefly discusses some need-to-knows for each environment.
For those who want more advanced capabilities, see Conda's Overview guide for general information and its Managing... pages for the more advanced features. Please note that our central installations are read-only, so any changes must be made locally, in your home or project folders.
Compute Grid 2.5 offers both R and RStudio via the Miniconda distribution. This provides CPU-optimized versions of R and its supporting libraries from CRAN.
Installing custom R packages
No special instructions are needed for this. Using the install.packages() command from within R or RStudio will download and install the specified packages in your home folder by default.
Compute Grid 2.5 offers both Python2 (v2.7.x) and Python3 (v3.6.x) via the Anaconda distribution. This provides CPU-optimized versions of Python and its supporting libraries, including numpy, scipy, matplotlib, pandas, and scikit-learn, among others.
Python2 is still our default, so the commands python and python2 both run Python 2.7.x. For Python3, please use the command python3.
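To confirm which interpreter a given command actually starts, a short check like the following can help. This is a minimal sketch that is not part of the Grid documentation; the function name interpreter_summary is hypothetical, and the code is written to run unchanged under both python and python3.

```python
import sys


def interpreter_summary():
    """Return (major, minor, executable path) for the running interpreter."""
    return sys.version_info[0], sys.version_info[1], sys.executable


major, minor, executable = interpreter_summary()
# Single-argument print with %-formatting works on Python 2.7 and Python 3.
print("Python %d.%d at %s" % (major, minor, executable))
```

Saving this to a file and running it once with python and once with python3 shows which version each command maps to.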
Installing custom Python modules
If you require a Python module that is not included in the central Anaconda installation, you can install it yourself, and the module will be placed in a directory in your home folder. Because of the wrapper scripts that are installed on the login nodes, this must be done on the back-end compute nodes.
1. Set up an alias so it's easy to submit interactive jobs to back-end nodes:
alias my_run="bsub -app python-5g -q short_int -Is"
2. Install your Python module by prefixing the install command with my_run:
my_run python -m pip install --user SomeModule
Or, if you are using Python3, use the python3 command instead (the module passed to -m is still named pip):
my_run python3 -m pip install --user SomeModule
The modules will be placed in the directory $HOME/.local for use by your scripts and programs. If you are upgrading modules that are already installed centrally, add the --upgrade option to the install command.
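To see where per-user installs end up and which copy of a module Python actually imports, a check along these lines may be useful. This is a sketch under assumptions: module_location is a hypothetical helper, and json is only a stand-in for whatever module you installed with --user.

```python
import importlib
import site

# Base directory for per-user installs (normally $HOME/.local on Linux).
user_base = site.getuserbase()
# Directory inside user_base where "pip install --user" places modules.
user_site = site.getusersitepackages()


def module_location(name):
    """Import a module by name and return the file it was loaded from."""
    module = importlib.import_module(name)
    return getattr(module, "__file__", None)


print("user base: %s" % user_base)
print("user site: %s" % user_site)
# Replace "json" with your own module name to see which copy is picked up.
print("json loaded from: %s" % module_location("json"))
```

If the reported location for your module sits under the user site directory, your own install (or upgrade) is the one being used rather than the central copy.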
Please note that Jupyter Notebooks cannot be used on the HBSGrid due to security concerns. RCS is working with HBS IT and IT Security to investigate what options we have to provide this functionality. Although not ideal, the Python IDE Spyder, which is provided on the HBSGrid cluster, can be used via the "Run Selected Code" or "Run Current Line" menu items to emulate running a Jupyter Notebook code cell.
We'll update the community when we have more information.
One of Anaconda's most useful features is the ability to create virtual environments. This is particularly helpful if you have multiple projects that depend on different versions of packages. With a virtual environment you can update the packages for one project without disturbing the packages of your other projects. Conda's documentation on managing environments is a good place to learn about this feature.
In addition, we'd like to bring a few important items to your attention:
- Our version of Anaconda requires that you set the execution path (export PATH…) to point at the central install in order for all the components to work correctly. This is especially true when installing modules that may need to be compiled from source code.
- By default, new environments are placed in your home directory. Once you source activate your environment, the execution path should point to your local (home directory) install.
- Environments cannot be used with the wrapper scripts in NoMachine and from the command line, as the execution path points at and prioritizes the central Anaconda installations. We hope this will be fixed in the next couple of months as we roll out better ways to manage the software versions and environments on the HBSGrid.
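Because the points above all hinge on where the execution path points, it can help to check this from inside Python before and after activating an environment. The following is a minimal sketch, not an RCS-provided tool: environment_report is a hypothetical helper, and it assumes that conda sets the CONDA_DEFAULT_ENV variable when an environment is activated.

```python
import os
import sys


def environment_report(environ=None):
    """Summarize which Python install and PATH entry are currently in effect."""
    environ = os.environ if environ is None else environ
    path_entries = [p for p in environ.get("PATH", "").split(os.pathsep) if p]
    return {
        # Install prefix of the interpreter running this script.
        "interpreter_prefix": sys.prefix,
        # Name of the active conda environment, or None if none is active.
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
        # The directory searched first when a command is looked up.
        "first_path_entry": path_entries[0] if path_entries else None,
    }


report = environment_report()
for key in sorted(report):
    print("%s: %s" % (key, report[key]))
```

If the interpreter prefix or the first PATH entry still points at the central installation after activation, the environment is not the one in effect.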
Our base installations of R, Python2, and Python3 include the packages and modules listed below as part of the central installation. You can use the instructions above to install additional or updated ones, either in your home folder (the default location) or in another location that you must specify manually.