This page presents a list of Frequently Asked Questions (and answers) about using the HPC cluster systems at the University of Iowa.
- Why is my home account full even after I delete data from it? This occurs because the cluster uses snapshot backups of your home account. The snapshots protect against accidental deletion or corruption, but they also mean that the space is not actually freed when you delete a file. This isn't an issue for most users, but it can become a problem if you are running an application that writes and deletes a lot of data in your home account. There are a few potential solutions to this problem.
- If you think this is a problem that won't occur frequently, contact the sysadmin team and ask them to delete the offending snapshots.
- For applications that read and write a lot of data in a directory, it is preferable to run on the scratch file systems (/nfsscratch). The scratch file systems are not backed up, and they are also much faster than your home account. This is also highly preferable from a system citizenship perspective, because the bandwidth to the home accounts is less than the bandwidth available to the scratch file systems.
Another option, if a scratch volume does not work for your job, is to use /localscratch, which is the local hard drive of a compute node. This is a symlink to /state/partition1/localscratch, in case your jobs don't handle symlinks well. It will likely also be faster than your home account, and it does not have snapshot backups. Note: This will only work if the data does not need to be available to multiple nodes at once.
The amount of /localscratch available on Helium is about 500GB. On Neon, each node has about 2.5TB available for /localscratch.
- If you can't migrate to one of the cluster scratch volumes or /localscratch, the sysadmin team can decrease the frequency of the snapshots on your home account. This can mitigate the problem if you have an idea of the write intervals of temporary data for your jobs. For instance, if all of your jobs finish in a few hours, we can set things up so that we only snapshot your directory once per day, at a time that makes sense for your typical usage patterns.
If none of the above is viable, the sysadmin team can turn off snapshots completely on your home account.
This means that accidental deletion or corruption of files WILL NOT be recoverable by the sysadmin team!
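The scratch-based options above can be sketched as a small shell helper. This is only an illustration, not a script provided on the cluster: the function name run_in_scratch, the SCRATCH_ROOT variable, the per-user subdirectory layout, and the result.* file pattern are all assumptions made for the example. Check with the sysadmin team for the actual directory conventions on /nfsscratch and /localscratch.

```shell
# Hypothetical helper: run a command inside a private scratch directory and
# copy the results back to a destination such as your home account.
# SCRATCH_ROOT defaults to /nfsscratch; set it to /localscratch for
# node-local data. The per-user subdirectory is an assumed layout.
run_in_scratch() {
    local dest="$1"; shift                      # where results are copied
    local root="${SCRATCH_ROOT:-/nfsscratch}"
    local user="${USER:-$(id -un)}"
    local workdir

    # Create a unique working directory under the scratch root.
    mkdir -p "$root/$user" || return 1
    workdir="$(mktemp -d "$root/$user/job.XXXXXX")" || return 1

    # Run the command from the scratch directory so that temporary files
    # never land in the snapshotted home account.
    ( cd "$workdir" && "$@" )
    local status=$?

    # Keep only files named result.* (an illustrative pattern), then clean up.
    mkdir -p "$dest"
    cp -p "$workdir"/result.* "$dest"/ 2>/dev/null
    rm -rf "$workdir"
    return "$status"
}
```

A call might look like `run_in_scratch ~/results ./my_simulation`, where my_simulation is a placeholder for your own program. Only the files matching result.* are copied back, so intermediate data written and deleted during the run does not consume snapshot space.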
How do I access snapshot backups of my home account? Snapshot backups are accessible in ~/.zfs/snapshot. To restore a file, use the cp command to copy it from a dated subdirectory of the snapshot folder back to your home account. Here is an example of what you might expect to see during the restore process.
[brogers@helium-login-0-1:~]$ cd ~/.zfs/snapshot
[brogers@helium-login-0-1:snapshot]$ ls
zfs-auto-snap.daily-2011-02-10-00.00   zfs-auto-snap.monthly-2010-10-25-08.58  zfs-auto-snap.weekly-2011-01-22-00.00
zfs-auto-snap.daily-2011-02-11-00.00   zfs-auto-snap.monthly-2010-11-01-00.00  zfs-auto-snap.weekly-2011-01-29-00.00
zfs-auto-snap.daily-2011-02-12-00.00   zfs-auto-snap.monthly-2010-12-01-00.00  zfs-auto-snap.weekly-2011-02-01-00.00
zfs-auto-snap.hourly-2011-02-11-16.00  zfs-auto-snap.monthly-2011-01-01-00.00  zfs-auto-snap.weekly-2011-02-08-00.00
zfs-auto-snap.hourly-2011-02-11-20.00  zfs-auto-snap.monthly-2011-02-01-00.00
zfs-auto-snap.hourly-2011-02-12-00.00  zfs-auto-snap.weekly-2011-01-15-00.00
[brogers@helium-login-0-1:snapshot]$ cd zfs-auto-snap.monthly-2011-01-01-00.00/
[brogers@helium-login-0-1:zfs-auto-snap.monthly-2011-01-01-00.00]$ cp computeilos ~/
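If you are not sure which snapshot still holds the file you need, a short loop can list every snapshot that contains it. The sketch below assumes the ~/.zfs/snapshot layout described above; the function name find_in_snapshots and the SNAPSHOT_ROOT override are hypothetical and exist only for this example.

```shell
# Hypothetical helper: print each snapshot directory that contains the given
# path (relative to your home account), one per line.
find_in_snapshots() {
    local relpath="$1"
    local root="${SNAPSHOT_ROOT:-$HOME/.zfs/snapshot}"
    local snap
    for snap in "$root"/*/; do
        # -e is true for files and directories alike.
        if [ -e "$snap$relpath" ]; then
            printf '%s\n' "${snap%/}"
        fi
    done
}
```

For example, `find_in_snapshots computeilos` would print the snapshot directories that hold a copy of computeilos. You can then copy the file back from whichever one you choose with cp, as in the session above.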
How do I see the status of just my jobs with qstat? The qstat command on Helium now defaults to showing the status of all jobs. To view the status of just your own jobs or another user's jobs, pass the '-u' flag to qstat. For example, to see the status of jobs submitted by user jdoe:
qstat -u jdoe
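If you check your own jobs often, a small wrapper can save some typing. This is a sketch only: the function name myjobs is hypothetical, and it assumes that the USER environment variable holds your cluster login name.

```shell
# Hypothetical convenience wrapper: show jobs for the given user, or for the
# current user when no argument is supplied.
myjobs() {
    qstat -u "${1:-$USER}"
}
```

After adding the function to your ~/.bashrc, running `myjobs` lists your own jobs and `myjobs jdoe` lists those of user jdoe.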
Why is my ~/.bashrc not being sourced? User accounts created before May 2, 2011 did not have the template ~/.bash_profile installed at account creation time. This is the file that contains the statement to source the ~/.bashrc file if it exists. A standard ~/.bash_profile file was installed on May 2, 2011 in user accounts that did not already have one. If you had already created your own ~/.bash_profile file and did not include a statement to source ~/.bashrc then it will not be sourced. To fix this, add the following to your ~/.bash_profile file.
# ~/.bash_profile
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi
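If you would rather not edit the file by hand, the same change can be scripted so that it is applied only once. The helper below is a minimal sketch: the function name ensure_bashrc_sourced is hypothetical, and it relies on a plain text search for ".bashrc", which can be fooled by a commented-out line.

```shell
# Hypothetical helper: append the "source ~/.bashrc" stanza to a profile file
# only when the file does not already mention .bashrc.
ensure_bashrc_sourced() {
    local profile="${1:-$HOME/.bash_profile}"
    # Nothing to do if the file already refers to .bashrc.
    if grep -q '\.bashrc' "$profile" 2>/dev/null; then
        return 0
    fi
    # Append the stanza; the file is created if it does not exist yet.
    printf '%s\n' \
        'if [ -f ~/.bashrc ]; then' \
        '    . ~/.bashrc' \
        'fi' >> "$profile"
}
```

Calling `ensure_bashrc_sourced` with no argument targets ~/.bash_profile. Running it a second time leaves the file unchanged.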