...
The cleanup script currently runs every day looking for files that have not been accessed in the last 30 days. Basing this on access time works well in the case where files are being generated by computations on scratch and left there. However, in the case where files are moved to the scratch system from another source, the time stamps of the files from the source need to be considered.
As of July 1, 2016, files on the scratch filesystem are subject to deletion beginning 60 days after they were first written there. Home account storage and purchased storage are *not* subject to this policy. A file's age is the time elapsed since its creation timestamp ("crtime"), which is tracked on the fileserver. Note that this is distinct from the other timestamps on a file:
modification time (mtime): This is the time the contents of the file were last modified, for example, by editing it. The modification time can be seen with
...
access time (atime): This is the time the contents of the file were last accessed, for example, by viewing with 'less'. The access time can be seen with
ls -lu file
creation time (crtime): This is the time the contents of the file were first written to the filesystem. This attribute is part of the underlying ZFS filesystem and is not visible to an NFS-mounted filesystem, nor accessible via standard Linux tools.
It is possible for the 3 timestamps to be different for the file. The most important point to remember, in light of the cleanup script using access time, is that the timestamps (including access time) may be carried over from an archive if the files are being extracted from one.
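The timestamps that standard tools can see may be compared side by side with 'stat'. The following is a minimal sketch, assuming GNU coreutils ('stat -c' and 'touch -d' with an '@' epoch value); the temporary file is purely illustrative. It also prints the inode change time that Linux tracks, which is not the same thing as crtime.

```shell
# Create a throwaway file so that no real data is touched.
demo_file=$(mktemp)

# Set atime and mtime to a fixed point in the past (1 Jan 2016 00:00:00 UTC).
touch -d '@1451606400' "$demo_file"

# Read each timestamp as seconds since the epoch (GNU stat format codes).
atime=$(stat -c '%X' "$demo_file")   # last access
mtime=$(stat -c '%Y' "$demo_file")   # last modification of the contents
ctime=$(stat -c '%Z' "$demo_file")   # last inode status change (not crtime)

echo "atime: $atime"
echo "mtime: $mtime"
echo "ctime: $ctime"

# Human-readable form of the same three values.
stat -c 'access: %x  modify: %y  change: %z' "$demo_file"

rm -f "$demo_file"
```

Here 'touch' rewinds the access and modification times, while the change time still reflects the moment 'touch' ran, so the values differ for one and the same file.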
The cleanup script will run periodically on the server to delete files whose crtime is older than the policy allows. It is possible for all of these timestamps to be different for the file. Most archive utilities will maintain the first 3 timestamps, either by default or optionally. This includes using archive mode ('-a') with either 'cp' or 'rsync'. If the timestamps from the source are carried over, your extracted files could be deleted by the cleanup script before 30 days have elapsed, because the access timestamp (carried over from the archive) is older than you might expect.
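The effect of archive mode on timestamps can be seen with a small experiment. This is only a sketch, assuming GNU coreutils ('cp -a', 'stat -c', 'touch -d'); the temporary directory and file names are hypothetical, and nothing on scratch is touched.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
src="$workdir/source.dat"
echo "example data" > "$src"

# Give the source file an old access and modification time.
touch -d '2016-01-01 00:00:00' "$src"

# Archive-mode copy carries the old timestamps over; a plain copy does not.
cp -a "$src" "$workdir/archive_copy.dat"
cp "$src" "$workdir/plain_copy.dat"

# Compare modification times as seconds since the epoch.
src_mtime=$(stat -c '%Y' "$src")
archive_mtime=$(stat -c '%Y' "$workdir/archive_copy.dat")
plain_mtime=$(stat -c '%Y' "$workdir/plain_copy.dat")

echo "source:       $src_mtime"
echo "cp -a copy:   $archive_mtime"
echo "plain copy:   $plain_mtime"

rm -rf "$workdir"
```

The 'cp -a' copy reports the same old modification time as the source, while the plain copy is stamped with the time the copy was made.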
There are several ways to update the timestamps, but the easiest is to use the 'touch' command. Even then, there are several ways to call it, depending on the directory structure of the files you are extracting/moving/copying. One example is to run the 'find' command on the top-level directory of your newly extracted files.
Example of updating timestamps of extracted files
No Format |
---|
find /nfsscratch/Users/gpjohnsn/extracted/ -type f -print0 | xargs -0 touch |
After running the above, the files will have the current timestamp. It is possible, if desired, to update only the access timestamp by using the '-a' option to touch in the above command.
Panel | ||
---|---|---|
| ||
find /nfsscratch/Users/gpjohnsn/extracted/ -type f -print0 | xargs -0 touch -a |
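One way to confirm that the access times were refreshed is to list files with an old access time before and after running 'touch -a'. This is a sketch, assuming GNU findutils and coreutils and the 30-day access-time window mentioned above; the temporary directory and file name are hypothetical. It does not say anything about crtime, which these tools cannot see.

```shell
# Build a throwaway directory containing one file with an old timestamp.
demo_dir=$(mktemp -d)
touch -d '2016-01-01 00:00:00' "$demo_dir/old_file.dat"

# List regular files last accessed more than 30 days ago.
# find counts whole 24-hour periods, so +30 matches 31 days or more.
stale_before=$(find "$demo_dir" -type f -atime +30 -print)
echo "stale before touch: $stale_before"

# Update only the access time of every regular file.
find "$demo_dir" -type f -print0 | xargs -0 touch -a

# The same query should now return nothing.
stale_after=$(find "$demo_dir" -type f -atime +30 -print)
echo "stale after touch:  $stale_after"

rm -rf "$demo_dir"
```

Running the '-atime +30' query against a real directory before touching anything gives a preview of which files carry an old access time.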
...
However, note that an archive file's crtime is not affected at all by archive utilities over NFS.
Note |
---|
Duplicating files to update crtimes solely to circumvent the scratch cleanup process is against policy. |
Please contact hpc-sysadmins@iowa.uiowa.edu if you need assistance with this.