Accessing Spaces & Folders via Terminal

Similar to using an SFTP GUI client, you can access and transfer files and folders using Unix commands that run over the SSH protocol. These include SFTP, SCP, and rsync.

scp

SCP can be a simple, quick way to transfer files between two computers. Its usage is simple, but the order in which file locations are specified is crucial. SCP always expects the 'from' location first, then the 'to' destination. Depending on which is the remote system, you will prefix your username and server to one of the locations.

scp [username@server:][location of file] [destination of file]
or
scp [location of file] [username@server:][destination of file]

Below are some examples of the two most common uses of SCP to copy to and from various sources.

Note: We use "~" in the examples. The tilde "~" is Unix shorthand for "my home directory". So if user johnharvard uses ~/, this is the same as typing out the full path to his home directory (and easier to remember than /export/faculty/johnharvard/). You can, of course, specify other paths (e.g. /export/projects/johnharvard/output/files.zip).

Copying files from the Grid to another computer
From a terminal/shell on another computer (like your local machine), you'll issue your SCP command and enter your Grid password.

scp johnharvard@researchgrid.hbs.edu:~/files.zip /home/johnharvard/
Password:
Enter PASSCODE:
files.zip 100% 9664KB 508.6KB/s 00:19

This copies the file files.zip from your home directory on the Grid to the /home/johnharvard/ directory on the computer you issued the command from (your local computer).

Copying files from another computer to the Grid
From a terminal/shell on your non-Grid computer (like your local machine), you'll issue your SCP command and enter your Grid password.

scp /home/johnharvard/myfile.zip johnharvard@researchgrid.hbs.edu:~/
Password:
Enter PASSCODE:
myfile.zip 100% 9664KB 508.6KB/s 00:19

This copies the file myfile.zip from the /home/johnharvard/ directory on the computer you issued the command on (like your local machine) to your home directory on the Grid.

While it's probably best to compress all the files you intend to transfer into one file, this is not always an option. To copy the contents of an entire directory, you can use the -r (for recursive) flag.

scp -r johnharvard@researchgrid.hbs.edu:~/mydata/ /home/johnharvard/mydata/
Password:
Enter PASSCODE:
files.zip 100% 9664KB 508.6KB/s 00:19

This copies all the files from ~/mydata/ on the Grid to the /home/johnharvard/mydata/ directory on the computer you issued the command from (like your local machine).
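
When bundling everything into one file is an option, as suggested above, tar is one common way to do it. The following is a minimal sketch rather than a prescribed workflow: the sample files and the temporary working directory are hypothetical stand-ins for a real mydata/ directory, and the final scp command is shown only as a comment so that nothing is actually transferred.

```shell
# Work in a throwaway directory so the sketch does not touch real data.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Hypothetical sample data standing in for a real mydata/ directory.
mkdir -p "$workdir/mydata"
echo "alpha" > "$workdir/mydata/a.txt"
echo "beta"  > "$workdir/mydata/b.txt"

# Bundle and gzip-compress the directory into a single archive.
# -C switches into $workdir first, so the archive stores paths as mydata/...
tar -czf "$workdir/mydata.tar.gz" -C "$workdir" mydata

# List the archive contents to confirm what would be transferred.
archive_listing=$(tar -tzf "$workdir/mydata.tar.gz")
echo "$archive_listing"

# The single archive could then be copied with scp (not run here), e.g.:
#   scp "$workdir/mydata.tar.gz" johnharvard@researchgrid.hbs.edu:~/
```

On the receiving side, tar -xzf mydata.tar.gz unpacks the archive back into a mydata/ directory.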

We thank FASRC for the origin of this material.

rsync

Rsync is a fast, versatile, remote (and local) file-copying tool. It is famous for its delta-transfer algorithm, which reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination. It is available on most Unix-like systems, including the Grid and macOS. Windows implementations of rsync are also available.

The basic syntax is: rsync SOURCE DESTINATION where SOURCE and DESTINATION are filesystem paths.

They can be local, either absolute or relative to the current working directory, or they can be remote by prefixing something like USERNAME@HOSTNAME: to the front of them.

NOTE: Unlike cp and most shell commands, a trailing / character on a SOURCE directory name is significant: it means the contents of the directory as opposed to the directory itself.
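
The effect of the trailing slash is easiest to see on a throwaway local directory. The sketch below assumes rsync is installed (it does nothing otherwise); the directory and file names are hypothetical and everything lives in a temporary folder.

```shell
# Only run the demonstration if rsync is available on this machine.
if command -v rsync >/dev/null 2>&1; then
    demo=$(mktemp -d)
    trap 'rm -rf "$demo"' EXIT

    # Hypothetical source directory containing a single file.
    mkdir -p "$demo/foo" "$demo/with-slash" "$demo/without-slash"
    echo "hello" > "$demo/foo/myfile"

    # Trailing slash on the source: copy the CONTENTS of foo.
    rsync -a "$demo/foo/" "$demo/with-slash/"

    # No trailing slash: copy the directory foo ITSELF into the destination.
    rsync -a "$demo/foo" "$demo/without-slash/"

    # Show where myfile ended up in each case.
    find "$demo/with-slash" "$demo/without-slash" -type f
fi
```

The first copy produces with-slash/myfile, while the second produces without-slash/foo/myfile.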

Examples

  • As a replacement for cp — copying a single large file, but with a progress meter:
    rsync --progress bigfile bigfile-copy
  • Make a recursive copy of local directory foo as foo-copy:
    rsync -av foo/ foo-copy/

    The trailing slash on foo-copy/ is optional, but if it's not on foo/, the file foo/myfile will appear as foo-copy/foo/myfile instead of foo-copy/myfile.

  • Upload the directory foo on the local machine to your home directory on the Grid:
    rsync -avz foo/ MYUSERNAME@researchgrid.hbs.edu:~/foo/

    This works for individual files, too; just don't put trailing slashes on them.

  • Download the directory foo in your home directory on the Grid to the local machine:
    rsync -avz MYUSERNAME@researchgrid.hbs.edu:~/foo .
  • Update a previously made copy of foo on the Grid after you've made changes to the local copy:
    rsync -avz --delete foo/ MYUSERNAME@researchgrid.hbs.edu:~/foo/

    The --delete option has no effect when making a new copy, and therefore can be used in the upload example above, too (making the commands identical), but since it recursively deletes files, it's best to use it sparingly.

  • Update a previously made copy of foo on the Grid after you or someone else has already updated it from a different source:
    rsync -avz --update foo/ MYUSERNAME@researchgrid.hbs.edu:~/foo/

    The --update option has no effect when making a new copy, and can freely be specified in that case, also.

Compression

If the SOURCE and DESTINATION are on different machines with fast CPUs, especially if they're on different networks (e.g. your home computer and the Grid), it's recommended to add the -z option to compress the data that's transferred.

This will cause more CPU to be used on both ends, but it is usually faster.

File Attributes, Permissions, Ownership, etc.

By default, rsync does not copy recursively, preserve timestamps, preserve non-default permissions, etc.

There are individual options for all of these things, but the option -a, which is short for archive mode, sums up many of these (-rlptgoD) and is best for producing the most exact copy.
(-A (preserve ACLs), -X (preserve extended attributes), and -H (preserve hardlinks) may also be desired on rare occasions.)
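
To illustrate one of these attributes, the following sketch compares a default copy with an archive-mode copy of a file whose modification time has been set in the past. It assumes rsync is installed, and the file names and the 2020 timestamp are hypothetical values chosen for the demonstration.

```shell
# Only run the demonstration if rsync is available on this machine.
if command -v rsync >/dev/null 2>&1; then
    demo=$(mktemp -d)
    trap 'rm -rf "$demo"' EXIT
    mkdir -p "$demo/plain" "$demo/archive"

    # Hypothetical source file with a modification time set in the past.
    echo "data" > "$demo/source.txt"
    touch -t 202001010000 "$demo/source.txt"

    # Default copy: the destination file gets the time of the transfer.
    rsync "$demo/source.txt" "$demo/plain/"

    # Archive mode: the original modification time is carried over.
    rsync -a "$demo/source.txt" "$demo/archive/"

    # Compare the timestamps of the original and the two copies.
    ls -l "$demo/source.txt" "$demo/plain/source.txt" "$demo/archive/source.txt"
fi
```

The copy in plain/ shows the current date, while the copy in archive/ keeps the 2020 date of the original.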

Updating a Copy

Rsync's delta-transfer algorithm allows you to efficiently update copies you've previously made by only sending the differences needed to update the DESTINATION instead of re-copying it from scratch.
However, there are some additional options you will probably want to use depending on the type of copy you're trying to maintain.

If you want to maintain a mirror, i.e. the DESTINATION is to be an exact copy of the SOURCE, then you will want to add the --delete option.

This deletes files in the DESTINATION that are no longer in the SOURCE.

Be careful with this option!

If you incorrectly specify the DESTINATION you may accidentally delete many files.

See also the --delete-excluded option if you're adding --exclude options that were not used when making the original copy.

If you're updating a master copy, i.e. the DESTINATION may have files that are newer than the versions in SOURCE, you will want to add the --update option.
This will leave those files alone, not revert them to the older copy in SOURCE.
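
The behavior of --update can be checked locally before relying on it for a real transfer. This is a small sketch under the assumption that rsync is installed; the src/ and dst/ directories and their files are hypothetical and are created in a temporary folder.

```shell
# Only run the demonstration if rsync is available on this machine.
if command -v rsync >/dev/null 2>&1; then
    demo=$(mktemp -d)
    trap 'rm -rf "$demo"' EXIT
    mkdir -p "$demo/src" "$demo/dst"

    # SOURCE holds an older version; DESTINATION holds a newer edit.
    echo "old version" > "$demo/src/report.txt"
    touch -t 202001010000 "$demo/src/report.txt"
    echo "newer edit" > "$demo/dst/report.txt"

    # A file that exists only in SOURCE is still copied.
    echo "new file" > "$demo/src/extra.txt"

    # --update skips files that are newer in the DESTINATION.
    rsync -a --update "$demo/src/" "$demo/dst/"

    cat "$demo/dst/report.txt"   # still the newer edit
fi
```

Without --update, the same command would overwrite dst/report.txt with the older version from src/.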

Progress, Verbosity, Statistics

  • -v
    Verbose mode — list each file transferred.
    Adding more -v flags makes it more verbose.
  • --progress
    Show a progress meter for each file transfer (not a progress meter for the whole operation).
    If you have many small files, this can significantly slow down the transfer.
  • --stats
    Print a short paragraph of statistics at the end of the session, like average transfer rate, total numbers of files transferred, etc.

Other Useful Options

  • --dry-run
    Perform a dry-run of the session instead of actually modifying the DESTINATION.
    Most useful when combined with multiple -v options, especially for verifying that --delete is doing what you want.
  • --exclude PATTERN
    Skip some parts of the SOURCE.
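
The options above combine naturally: preview a mirror with --dry-run, then repeat the same command without it. The sketch below assumes rsync is installed and uses hypothetical file names in a temporary folder; the '*.tmp' pattern is only an example of something you might exclude.

```shell
# Only run the demonstration if rsync is available on this machine.
if command -v rsync >/dev/null 2>&1; then
    demo=$(mktemp -d)
    trap 'rm -rf "$demo"' EXIT
    mkdir -p "$demo/src" "$demo/dst"

    echo "keep" > "$demo/src/keep.txt"
    echo "scratch" > "$demo/src/scratch.tmp"
    # A file that exists only in the DESTINATION, so --delete would remove it.
    echo "stale" > "$demo/dst/stale.txt"

    # Preview: report what would be copied and deleted, but change nothing.
    rsync -av --delete --exclude '*.tmp' --dry-run "$demo/src/" "$demo/dst/"

    # Record whether the stale file survived the preview.
    survived_preview=no
    [ -f "$demo/dst/stale.txt" ] && survived_preview=yes
    echo "stale.txt present after dry run: $survived_preview"

    # Real run, once the preview looks right.
    rsync -a --delete --exclude '*.tmp' "$demo/src/" "$demo/dst/"
    ls "$demo/dst"
fi
```

After the real run, dst/ contains only keep.txt: stale.txt has been deleted and scratch.tmp was never copied.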

We thank FASRC for the origin of this material.