Archiving Your Research Files

Archive your files to an external drive or DVD by doing the following:

  • Put the data or project files in a subdirectory
  • Run the make-archive script, which compresses your data
  • You will receive an email confirming the success of the operation
  • Check the data, then burn it to a DVD or copy it to an external drive

What files should be archived?

  • research project-related files after the project is done
  • data you are finished with but want to keep
  • files you rarely need to use

Example:  Jane Doe has three projects.  Each project is in a separate subdirectory on research.hbs.edu. In a terminal session:

researchgrid$ pwd
/export/home/faculty/jdoe/data/
researchgrid$ ls -l
drwx------ 5 jdoe faculty 96 Oct 5  2015  AcmeCase
drwx------ 16 jdoe faculty 8192 Oct 3  09:17 airlineDat
drwx------ 2 jdoe faculty 01 Oct 23 13:16 bank_proj
researchgrid$ 

To see how much data (in kilobytes) is in each directory:

researchgrid$ du -sk *
12976   AcmeCase
340093  airlineDat
880144  bank_proj

The listing above shows that the directories contain about 13, 340 and 880 megabytes, respectively.
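
If you prefer to read the sizes directly in megabytes, the output of du can be passed through a small filter. The following is a minimal sketch: kb_to_mb is a hypothetical helper name (it is not installed on the research grid), it treats 1 MB as 1000 KB to match the rounding used above, and it assumes directory names contain no spaces.

```shell
# kb_to_mb: hypothetical helper that reads "size_in_KB<TAB>name" lines
# (the format printed by du -sk) and prints each name with its size in MB.
kb_to_mb() {
    awk '{ printf "%s %.0f MB\n", $2, $1 / 1000 }'
}

# Sample input copied from the du listing above.
printf '12976\tAcmeCase\n340093\tairlineDat\n880144\tbank_proj\n' | kb_to_mb
# AcmeCase 13 MB
# airlineDat 340 MB
# bank_proj 880 MB
```

In a real session the same filter would be used as: du -sk * | kb_to_mb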

From the parent directory of the subdirectory you wish to archive, run the script by typing make-archive <dirname>.  In this example, we archive the contents of the AcmeCase directory:

researchgrid$ pwd
/export/home/faculty/jdoe/data/
researchgrid$ make-archive AcmeCase

The server shortly responds with this message:

Creating compressed archive file "/export/scratch/archive_jdoe/AcmeCase.tar.bz2" from all items in directory "AcmeCase"
The total data size (in KB) of the source is: 12976
Compression done.
Now testing archive and making table of contents.
Done.
Mail has been sent to jdoe@hbs.edu with details of this archive operation.
The size of the compressed archive is 2797 KB.

NOTE: If the size of the compressed archive is more than 3,500,000 KB (3.5 GB), the files should be divided across two or more subdirectories. The make-archive script is then run on each subdirectory.
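
The size check described in the note can be sketched in a few lines of shell. This is an illustration only: needs_split is a hypothetical helper name, and the size used here is the one reported for AcmeCase in the example above.

```shell
# Limit quoted in the note above: 3,500,000 KB (3.5 GB).
LIMIT_KB=3500000

# needs_split: hypothetical helper that succeeds when a size in KB
# is larger than the limit.
needs_split() {
    [ "$1" -gt "$LIMIT_KB" ]
}

# Compressed size reported by make-archive for AcmeCase in the example.
# For a real file, an approximate size could be read with:
#   size_kb=$(du -k "$archive" | cut -f1)
size_kb=2797

if needs_split "$size_kb"; then
    echo "$size_kb KB: over the limit, divide the files across subdirectories"
else
    echo "$size_kb KB: within the limit"
fi
```

For the AcmeCase example this reports that the archive is within the limit.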

Here is what takes place when the make-archive script is run:

  • The script creates an archive directory named archive_<username> in the /export/scratch directory, where <username> is your HBS intranet username (archive_jdoe in the example above).
  • In the above example, all of the files in the AcmeCase directory were compressed and copied to the archive directory.
  • You will receive an email confirming the operation.
  • A text file named <username>.info.txt, containing a list of the filenames and their uncompressed sizes, will also be included in your archive directory. Instructions for uncompressing the files at a future time are included in that same text file.
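
The make-archive script itself is not reproduced here, so the following is only a sketch of roughly equivalent steps using standard tar options. It assumes the archive is an ordinary bzip2-compressed tar file, as the .tar.bz2 name suggests. The function names and the .contents.txt file name are hypothetical; the instructions in your own <username>.info.txt file take precedence over anything shown here.

```shell
# Hypothetical stand-ins for the steps listed above; NOT the real make-archive script.

# make_archive_sketch SRC DEST
#   SRC  = name of a directory in the current directory (e.g. AcmeCase)
#   DEST = directory that receives the archive (e.g. /export/scratch/archive_jdoe)
make_archive_sketch() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Create a bzip2-compressed tar archive of the whole directory.
    tar -cjf "$dest/${src}.tar.bz2" "$src" || return 1
    # Test the archive by listing it, and keep the listing as a table of contents.
    tar -tjf "$dest/${src}.tar.bz2" > "$dest/${src}.contents.txt" || return 1
}

# restore_archive_sketch ARCHIVE TARGET
#   Extracts ARCHIVE into the TARGET directory, creating TARGET if needed.
restore_archive_sketch() {
    archive=$1
    target=$2
    mkdir -p "$target" && tar -xjf "$archive" -C "$target"
}
```

For example, restore_archive_sketch AcmeCase.tar.bz2 restored would recreate the AcmeCase directory under a new directory named restored.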

Burn the compressed file to a DVD or copy it to an external drive, along with a copy of the text file, since that file contains instructions on how to uncompress your data. Once you have copied the compressed file and the text file, you should delete the original data files and directory from your account area:

researchgrid$ pwd
/export/home/faculty/jdoe/data/
researchgrid$ rm -r AcmeCase
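
Before running the deletion shown above, it may be worth confirming that the archive lists as many entries as the original directory contains. The check below is a rough sketch, not part of make-archive: verify_archive is a hypothetical helper, it assumes a standard .tar.bz2 archive and file names without newlines, and it compares entry counts only, not file contents.

```shell
# verify_archive SRC ARCHIVE
#   Hypothetical helper: succeeds only if ARCHIVE lists the same number of
#   entries (files and directories) as the directory SRC contains.
verify_archive() {
    src=$1        # original directory, e.g. AcmeCase
    archive=$2    # e.g. /export/scratch/archive_jdoe/AcmeCase.tar.bz2
    # Refuse to report success if either input is missing.
    [ -d "$src" ] || return 1
    [ -f "$archive" ] || return 1
    n_src=$(find "$src" | wc -l | tr -d ' ')
    n_arc=$(tar -tjf "$archive" | wc -l | tr -d ' ')
    [ "$n_src" -eq "$n_arc" ]
}
```

A cautious session might then run: verify_archive AcmeCase /export/scratch/archive_jdoe/AcmeCase.tar.bz2 && echo "entry counts match"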

Any questions? Please contact Research Computing Services. And thank you for helping manage and conserve space on the research grid!