Share & transfer data using Globus
Globus is a fast and efficient tool for transferring data between institutions. You can use the Globus Online interface to initiate data transfer between institutions that have servers connected to Globus. Globus Online will work to complete the transfer without requiring further personal interaction, even if the transfer is interrupted.
If you need to transfer data directly to or from your personal computer, you can connect it to Globus by installing and running the Globus Connect Personal software. It is not possible to transfer files between two computers both running Globus Connect Personal clients.
For more information about Globus, visit https://www.globus.org/.
Getting started
Log in to Globus Online
Access Globus Online by visiting https://app.globus.org/file-manager. You can log in using your HawkID, or you can use your Globus account if you have one. To use your HawkID, select University of Iowa from the drop-down list of organizations. This will redirect you to the University of Iowa login page where you can log in with your HawkID.
Accessing data with Collections
Each source and destination for your data transfers is called a Collection.
At the University of Iowa, ITS-Research Services operates a collection called "uiowadata#data". This collection provides access to storage services at the following Paths:
Storage Service | Path |
---|---|
Your Argon HPC home directory | /hpchomes/argon/<hawkid>/ |
Dedicated Large Scale Storage shares | /Dedicated/<sharename>/ |
Shared Large Scale Storage shares | /Shared/<sharename>/ |
The NFS scratch filesystem | /nfsscratch/argon/ |
These paths are automounts, meaning they are not mounted or visible until something accesses them. You might therefore not see your directory when browsing, but entering its full path triggers the mount on demand, after which it behaves as if it had been present all along.
To transfer data to/from a personal computer, you will need to configure a private collection using the Globus Connect Personal software. This is described in Transfer data to your computer.
Sharing Data
You can share your data with collaborators by creating a Guest Collection. Please note that sharing data via a Guest Collection is not supported for all of the storage services accessible via the uiowadata#data collection.
Storage Service | Sharable via a Guest Collection |
---|---|
Your Argon HPC home directory | No |
Dedicated Large Scale Storage shares | Yes |
Shared Large Scale Storage shares | Yes |
The NFS scratch filesystem | No |
Detailed instructions for sharing data from the uiowadata#data collection are available at Sharing data.
You can also share a link to a specific file (rather than an entire folder), as detailed in Get a sharable link for a file.
More about sharing data with Globus, including assigning roles to allow others to manage the permissions for a collection you create, is available at https://docs.globus.org/how-to/share-files/.
Transferring Files
Globus offers two main interfaces for transferring files. The Globus Online website serves as the graphical user interface (GUI), so it's accessible using your normal web browser. Alternatively, if you're comfortable using the command line and ssh, or if you need to write scripts to transfer data, you can install and use the Globus command line interface.
With either method, you tell Globus which files to transfer, along with the source and destination. Globus records each transfer you specify and starts working on it in the background, and there are various ways to monitor or check on it later.
Unless specified, file transfers are NOT encrypted. If you need transfers to be encrypted, you must enable encryption using the correct graphical menu option or command line flag.
Online GUI
After you log in to the Globus Online website, you should land on the "File Manager" page. This is where you begin.
The sequence of steps for initiating a transfer is as follows:
1. Specify your first desired endpoint and path, then click "Transfer or Sync to" in the right menu. The window should now be split into two panes, one for each endpoint.
2. Specify the second endpoint and path.
3. Select the data you want to transfer.
4. Optionally, if you'd like a label for this transfer, enter it using the "Transfer & Sync Options" drop-down menu at the bottom of the screen. All transfers can be referenced by a unique ID, but you might find a label more convenient.
5. Optionally, if you need this transfer to be encrypted (or need other options enabled), select those settings in the same "Transfer & Sync Options" drop-down menu.
6. Finally, initiate the transfer by pressing the Start button with the arrow pointing from the source to the destination.
Globus command line tool
The Globus command line interface (CLI) provides an interface to Globus services from the shell, and is suited to both interactive and simple scripting use cases. More information about the CLI, including how to install it and full reference documentation for all CLI commands, can be found at https://docs.globus.org/cli/.
The command line client is available as a Python package named "globus-cli". The package is available from PyPI, so it's convenient to install (along with its dependencies) using Python's "pip" command. If you use a Python virtual environment, activate your environment and then install the package like so:
pip install globus-cli
Alternatively, if you don't use Python virtual environments, you can install the package into your home directory (but also see the next step to ensure you can access it after you install it):
pip install --user globus-cli
If you installed globus-cli using the --user option, also make sure the command is available by running the following command (which you might also want to put in your shell startup config for convenience):
which globus >/dev/null 2>&1 || export PATH=${PATH:+$(python -c 'import site; print(site.USER_BASE)')/bin:$PATH}
If you have a web browser on the system where you're using the CLI, you can simply invoke the login subcommand to bring up a web page where you can use your HawkID to log in:
globus login
If you don't have access to a web browser on the system where you're running the CLI, you can use the --no-local-server flag. You will be given a link to follow, and you must respond with the access code provided at that link. Note that this flag is implied if the CLI detects you are on a remote session.
globus login --no-local-server
After logging in, you can look up the ID of the uiowadata#data collection by searching for it by name:
globus endpoint search 'uiowadata'
ID | Owner | Display Name
------------------------------------ | ---------------------- | --------------
39dd0982-d784-11e6-9cd4-22000a1e3b52 | uiowadata@globusid.org | uiowadata#data
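With the collection ID in hand, the remaining CLI steps might look roughly like the sketch below. It is an illustration under stated assumptions, not a tested recipe for your storage: the share name, the directory names, and the destination collection ID are all hypothetical placeholders, and only the uiowadata#data ID comes from the search output above. The small `run` helper is also just a convenience for this example; it prints each command instead of executing it unless you set DRY_RUN=0, so you can review the commands before submitting anything.

```shell
#!/bin/sh
# Collection ID taken from the search output above.
UIOWA_EP="39dd0982-d784-11e6-9cd4-22000a1e3b52"

# Hypothetical values for illustration only; substitute your own.
SHARENAME="mylab"                                # hypothetical share name
DEST_EP="00000000-0000-0000-0000-000000000000"   # hypothetical destination collection ID
SRC_PATH="/Shared/${SHARENAME}/results/"         # hypothetical source directory
DEST_PATH="/~/results/"                          # hypothetical destination directory

# Print the command instead of running it, unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Listing by full path triggers the automount, even if the directory
# is not visible when browsing the collection.
run globus ls "${UIOWA_EP}:${SRC_PATH}"

# Submit a recursive transfer. Transfers are not encrypted unless
# requested, so --encrypt-data is passed explicitly; --label gives the
# transfer a name that is easier to recognize than its unique ID.
run globus transfer --recursive --encrypt-data --label "results-backup" \
    "${UIOWA_EP}:${SRC_PATH}" "${DEST_EP}:${DEST_PATH}"

# Check on submitted transfers later.
run globus task list
```

Once the printed commands look right, rerun the script with DRY_RUN=0 to actually submit the transfer. Globus then carries out the transfer in the background, so the script does not need to stay running.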