The HPC cluster systems use the Sun Grid Engine (SGE) queue scheduler system. The feature of a queue scheduler that users interact with most is job submission. The manual pages for SGE are very good and should be referred to for details. For this particular topic the qsub manual page is the authoritative source.
No Format |
---|
man qsub |
This document provides a brief introduction to the most common options that might be used to submit jobs to the SGE system. It focuses on single processor jobs, as that is the most basic case, though not necessarily the most common. Details on submission of parallel jobs are covered in Advanced Job Submission.
...
That will submit the job with all of the default options. For a parallel job, a parallel environment would need to be specified. This is covered in more detail in Advanced Job Submission, but an example would look like:
No Format |
---|
qsub -pe smp 12 myscript.job |
The default queue is the UI queue, which has a limit of 25 running jobs per user on Helium and 10 running jobs per user on Neon. In addition, Neon has a high memory queue (UI-HM) for large memory jobs, with a limit of 4 running jobs per user. If you have many single processor jobs to run, it may be better to submit them to the all.q queue, which has no limit, but is /wiki/spaces/hpcdocs/pages/76513448 to the other queues.
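As a minimal sketch of submitting several single processor jobs to all.q, the loop below passes the queue name with the `-q` option of qsub. The job script names (`job1.job` and so on) and the `submit` helper are hypothetical; the helper only prints the command when qsub is not installed, so the sketch can be tried on a machine without SGE.

```shell
#!/bin/sh
# Run qsub when it is installed; otherwise print the command (dry run).
# "submit" is a hypothetical helper, not part of SGE.
submit() {
    if command -v qsub >/dev/null 2>&1; then
        qsub "$@"
    else
        echo "qsub $*"
    fi
}

# Submit each (placeholder) job script to the all.q queue.
for script in job1.job job2.job job3.job; do
    submit -q all.q "$script"
done
```

On a cluster login node the helper can be dropped and `qsub -q all.q job1.job` used directly.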
...
qsub option | Description |
---|---|
-V | Imports your current environment into the job. This is set by default, so it does not have to be specified, but it is good to know about. |
-N [name] | The name of the job. Make sure this makes sense to you. |
-l h_rt=hr:min:sec | Maximum walltime for this job. You may want to specify this if you think your job may run out of control. |
-l h_vmem=bytes | Virtual memory limit for this job. You can specify a unit as well, for example -l h_vmem=2G. An appropriate value will be set for your job if an entire node is not requested. |
-r [y,n] | Whether this job should be re-runnable (default n). |
-cwd | Run the job from the current working directory. If not specified, the job will be run from the user's home directory. |
-S [shell] | Specify the shell to use when interpreting the job script. |
-e [path] | Name of a file or directory for standard error. |
-o [path] | Name of a file or directory for standard output. |
-j [y,n] | Merge the standard error stream into the standard output stream. |
-pe [name] [n] | Parallel environment name and number of slots (cores). |
-M email address | Set the email address to receive email about jobs. This will most likely be your University of Iowa email address. |
-m b\|e\|a\|s\|n,... | Specify when to send an email message. 'b': mail is sent at the beginning of the job. 'e': mail is sent at the end of the job. 'a': mail is sent when the job is aborted. 's': mail is sent when the job is suspended. 'n': no mail is sent. |
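The options in the table can also be embedded in the job script itself, on lines beginning with `#$`, which qsub reads as if they had been given on the command line. The following is a sketch only: the job name, the resource values, and the email address are placeholders chosen for illustration, and the body of the job is a stand-in for real work.

```shell
#!/bin/sh
# Lines beginning with "#$" are read by qsub as command-line options.
# Job name (hypothetical)
#$ -N example_job
# Interpret this script with /bin/sh
#$ -S /bin/sh
# Run from the directory the job was submitted from
#$ -cwd
# Stop the job after one hour of walltime
#$ -l h_rt=01:00:00
# Virtual memory limit
#$ -l h_vmem=2G
# Merge standard error into standard output
#$ -j y
# Send mail when the job ends or is aborted (placeholder address)
#$ -m ea
#$ -M user@example.com

# The actual work of the job goes below; this placeholder only reports
# where the job is running.
workdir=$(pwd)
echo "Job ${JOB_NAME:-example_job} running in $workdir on $(uname -n)"
```

Saved as, say, `example.job`, the script could then be submitted with a plain `qsub example.job`. Because the `#$` lines are ordinary comments to the shell, the script also runs unchanged outside of SGE.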
...
It is often necessary or desired to redirect the standard output and standard error streams to files. This can be accomplished with the typical shell redirection calls, but SGE also provides a mechanism for capturing the stdout and stderr streams. By default, the standard error is written to a file named $JOB_NAME.e$JOB_ID and the standard output is written to a file named $JOB_NAME.o$JOB_ID. These can be set via the -e and -o flags to qsub, respectively.
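To make the default naming concrete, the sketch below builds the two default file names from a job name and a job ID. The values `myscript.job` and `12345` are made up for illustration, and the two helper functions are hypothetical. The commented qsub lines show how the `-o`, `-e`, and `-j` options could override the defaults; the output file names there are placeholders as well.

```shell
#!/bin/sh
# Default output file names as described above:
# $JOB_NAME.o$JOB_ID for stdout and $JOB_NAME.e$JOB_ID for stderr.
default_stdout_name() { printf '%s.o%s\n' "$1" "$2"; }
default_stderr_name() { printf '%s.e%s\n' "$1" "$2"; }

# Hypothetical job name and job ID, for illustration only.
name=myscript.job
id=12345
echo "stdout -> $(default_stdout_name "$name" "$id")"
echo "stderr -> $(default_stderr_name "$name" "$id")"

# Overriding the defaults at submission time (not run here):
#   qsub -o myjob.out -e myjob.err myscript.job   # separate files
#   qsub -j y -o myjob.log myscript.job           # one merged file
```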
...