Python, R, Packages, and the Grid

If you think Python 2.6.6 or R 3.0.2 sounds old, I have good news for you. The Conda package manager makes it easy to install the latest and greatest Python and R packages in your home directory. With Conda you can install Python 2.7.11, Python 3.5.2, and R 3.3.1.


The first step to installing Python or R on the Grid is to install Miniconda, as outlined in the steps below.

  1. Log in to the grid:

  2. Set up an alias so it's easy to submit interactive jobs to back-end nodes:

    alias my_run="bsub -app generic-5g -q interactive -Is"

  3. Download the Miniconda installer:

    my_run wget

  4. Run the installer:

    chmod +x
    my_run ./

    Here's how I answered the questions when running the installer:

    1. Do you approve the license terms? yes
    2. I pressed enter to install Miniconda2 in ~/miniconda2
    3. Do you wish the installer to prepend the Miniconda2 install location to PATH in your ~/.bashrc? no

  5. Make sure the Miniconda bin directory is on your search path:

    export PATH="$HOME/miniconda2/bin:$PATH"

    If you skip this step, Conda commands will fail with errors like -bash: conda: command not found. It is also useful to put the same export command in your ~/.bash_profile file. That way, when you log in to the grid in the future and call python, the shell will find the python in ~/miniconda2/bin first rather than the old version in /usr/local/bin.

  6. Remove the Miniconda installer:

    rm -f
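Step 5 suggests putting the export line in ~/.bash_profile. Below is a minimal sketch of one way to do that without duplicating the line if you run it more than once. The function name add_miniconda_to_profile is hypothetical, and the sketch assumes Miniconda was installed in the default ~/miniconda2 location.

```shell
# Hypothetical helper: append the Miniconda PATH line to a profile file,
# but only if that exact line is not already present.
add_miniconda_to_profile() {
    # Default to ~/.bash_profile when no file is given.
    profile_file="${1:-$HOME/.bash_profile}"
    # Single quotes keep $HOME and $PATH unexpanded, so they are
    # evaluated at login time rather than when this function runs.
    path_line='export PATH="$HOME/miniconda2/bin:$PATH"'
    touch "$profile_file"
    # -x matches whole lines, -F treats the pattern as a fixed string.
    if ! grep -qxF "$path_line" "$profile_file"; then
        printf '%s\n' "$path_line" >> "$profile_file"
    fi
}
```

Calling add_miniconda_to_profile with no argument edits ~/.bash_profile. In a fresh login shell, command -v python should then point at ~/miniconda2/bin/python.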


Note: this step requires that you have already installed Miniconda. If you have not installed Miniconda yet, return to the installation steps above.

The following command installs the typical Python packages used in social science research:

my_run conda install anaconda

If the Python packages included in Anaconda are insufficient for your needs, Conda's documentation on managing packages has excellent information on how to install additional packages. The general approach to installing additional packages proceeds as follows:

  1. See if the package is available through Conda with conda search. If it is, install the package using conda install.

  2. See if the package is available on If it is, install the package using conda install being sure to specify the correct channel.

  3. To install a non-conda package, use pip to install the package.
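The three-step approach above can be sketched as a small shell function that tries each source in turn. This is only an illustration: the function name install_python_pkg is hypothetical, the channel argument is optional, and the sketch relies on conda install exiting with a non-zero status when it cannot find the package.

```shell
# Hypothetical helper: try the default Conda channels first, then an
# optional named channel, and fall back to pip as a last resort.
install_python_pkg() {
    pkg="$1"
    channel="${2:-}"   # optional; leave empty to skip the channel attempt
    if conda install --yes "$pkg"; then
        echo "installed $pkg with conda"
    elif [ -n "$channel" ] && conda install --yes -c "$channel" "$pkg"; then
        echo "installed $pkg from channel $channel"
    else
        pip install "$pkg" && echo "installed $pkg with pip"
    fi
}
```

On the Grid you would probably want to run the function inside an interactive job started with the my_run alias, so the installation work happens on a back-end node.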


Note: this step requires that you have already installed Miniconda. If you have not installed Miniconda yet, return to the installation steps above.

The following command installs the typical R packages used in social science research:

my_run conda install -c rdonnellyr r-essentials

Notice that the above command installs r-essentials from the rdonnellyr channel. This channel is managed by Ray Donnelly, who works at Continuum Analytics, the makers of Conda. This package contains the latest version of R. The version of R available through the r channel is slightly out of date.

If you need additional R packages, the general approach proceeds as follows:

  1. Search If the package is available use conda install specifying the appropriate channel.

  2. If the package is available through CRAN but not Conda, you can create a new Conda package from the CRAN repository. For documentation on this process see building conda packages and conda skeleton cran. This looks like a bit of work to do properly.

  3. This Stack Overflow answer provides a quick and dirty workaround if you don't want to build new Conda packages. The key insight is to open R and use install.packages, being sure to specify the correct path for where to install the package, something like:

    install.packages("rstan", lib = "~/miniconda2/lib/R/library")
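The install.packages workaround can also be run from the shell without opening an interactive R session. The sketch below is one way to do it; the function name install_cran_pkg and the CRAN mirror URL are assumptions, not part of the original instructions. A mirror is passed explicitly because a non-interactive R session typically has none selected.

```shell
# Hypothetical helper: install a CRAN package into the Miniconda R library
# from the command line.
install_cran_pkg() {
    pkg="$1"
    # Library path assumes the default ~/miniconda2 install location.
    lib_dir="${2:-$HOME/miniconda2/lib/R/library}"
    # The mirror is an assumption; any CRAN mirror should work.
    repo="${3:-https://cloud.r-project.org}"
    Rscript -e "install.packages('$pkg', lib = '$lib_dir', repos = '$repo')"
}
```

For example, install_cran_pkg rstan would mirror the rstan call shown above, provided the Rscript in ~/miniconda2/bin is first on your search path.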


One of Conda's most useful features is the ability to create virtual environments. This is particularly helpful if you have multiple projects that depend on different versions of packages. With a virtual environment you can update the packages for one project without disturbing the packages of your other projects. Conda's documentation on managing environments is a good place to learn about this feature.
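As a rough illustration of the idea, the sketch below creates one environment per project, each pinned to its own Python version. The function name create_project_env, the environment names, and the default version are hypothetical; check Conda's documentation for the options your version supports.

```shell
# Hypothetical helper: create a separate Conda environment for a project.
create_project_env() {
    env_name="$1"
    python_version="${2:-3.5}"   # default version is an arbitrary choice
    conda create --yes --name "$env_name" "python=$python_version"
}
```

For example, create_project_env proj_a 2.7 followed by create_project_env proj_b 3.5 would give two projects independent package sets. Conda releases from this era activate an environment with source activate proj_a; newer releases use conda activate instead.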


Now that you have installed Python and R in ~/miniconda2/bin/, you need to run these programs using bsub commands so your computationally intensive jobs run on back-end nodes rather than on front-end nodes. Below I give a quick introduction to submitting batch and interactive jobs through LSF.


To start, let's create an alias describing a bsub command for submitting batch jobs. If you want to learn more about bsub go to this page in the documentation.

alias my_batch="bsub -app generic-5g -q normal"

If you want to run a Python script named you would run:

my_batch ~/miniconda2/bin/python

Notice that it's important to give the full path to your installation of Python. Similarly, here is how to run an R script named your_file.R:

my_batch ~/miniconda2/bin/Rscript your_file.R
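If you submit jobs from a shell script rather than by hand, note that bash does not expand aliases in non-interactive shells by default, so a function can be more dependable than the my_batch alias. The sketch below picks the interpreter from the file extension and always uses the full path; the function name submit_script is hypothetical, and the bsub options are copied from the alias above.

```shell
# Hypothetical helper: submit a Python or R script as a batch job,
# always using the full path to the Miniconda interpreter.
submit_script() {
    script="$1"
    case "$script" in
        *.py) interpreter="$HOME/miniconda2/bin/python" ;;
        *.R)  interpreter="$HOME/miniconda2/bin/Rscript" ;;
        *)    echo "unsupported file type: $script" >&2; return 1 ;;
    esac
    bsub -app generic-5g -q normal "$interpreter" "$script"
}
```

With this in place, submit_script your_file.R submits the same job as the my_batch command above.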


Note: I define the my_run alias used below in step 2 of the Miniconda installation steps above.

There are a lot of ways to run interactive Python and R jobs on the Grid. I'm going to highlight the most enjoyable ways:

Jupyter Console

If you want to work at the command line, the Jupyter Console makes interactive work quite pleasant and it works with both Python and R. To run Python use:

my_run jupyter console

To run R use:

my_run jupyter console --kernel=ir

Jupyter Notebook

If you want to work in the browser and do interactive data visualization, the Jupyter Notebook is the way to go. It works with both Python and R; you just have to specify the language when creating a new notebook. To run the Jupyter Notebook use:

my_run jupyter notebook --browser=firefox

RStudio Desktop

It is possible to run RStudio Desktop on the grid. Here is how I run RStudio (note the spelling of the Rstudio command has a capital R and a lower-case s):

export RSTUDIO_WHICH_R=$HOME/miniconda2/bin/R

Unfortunately, installing RStudio on the Grid is quite challenging and the currently installed version is quite old. You're likely to have a more pleasant interactive experience using the Jupyter Notebook, which is easy to install.