Basic Job Submission

The HPC cluster systems use the Sun Grid Engine (SGE) queue scheduler system. The feature of a queue scheduler that users interact with most is job submission. The SGE manual pages are very good and should be consulted for details. For this topic, the qsub manual page is the authoritative source.

man qsub

This document provides a brief introduction to the most common options used to submit jobs to the SGE system. It focuses on single processor jobs because that is the most basic case, though not necessarily the most common. Details on submitting parallel jobs are covered in Advanced Job Submission.

Job Script

While it is possible to submit programs directly to the SGE system (using the -b option flag) it is generally preferable to create a job script. The job script is just like any other script that you write in terms of specifying commands that you would like to run. The difference is that instead of a user running the script, it is submitted to SGE and SGE then runs the script. It does so after determining what nodes are available for it to run on. In its simplest form, the job script is simply a call to the program that performs the calculation.
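As a minimal sketch of such a job script, the file below stands in for a real calculation. The file name myjob.sh and the command it runs are hypothetical placeholders, not part of the cluster documentation.

```shell
#!/bin/sh
# Minimal job script sketch (hypothetical file name: myjob.sh).
# A real job script would call the program that performs the calculation;
# here a stand-in command reports which host ran the job.

result="Job ran on host: $(uname -n)"
echo "$result"
```

Under these assumptions the script would be handed to SGE with "qsub myjob.sh" rather than run directly by the user.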

...

That will submit the job with all of the default options. For a parallel job, a parallel environment would need to be specified. This is covered in more detail in Advanced Job Submission, but an example would look like

...

The default queue is the UI queue, which has a limit of 25 running jobs per user on Helium and 10 running jobs per user on Neon. In addition, Neon has a high memory queue (UI-HM) with a limit of 4 running jobs per user. If you have many single processor jobs to run, it is better to submit them to the all.q queue, which has no limit but is subordinate to the other queues.

...

An alternative to specifying a qsub option on the command line is to put it into the job script. This is accomplished with the special prefix string "#$" in the job script, which signifies that a qsub directive follows.

...

The "#$ -q all.q" line will look like a comment to the shell interpreter but will be treated as a qsub directive when the job script is interpreted by SGE. Any of the qsub options can be specified this way.
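A job script using embedded directives might be sketched as follows. The job name directive_demo is a hypothetical choice, and the body is a stand-in for a real program. Comments are kept on their own lines so that nothing extra is appended to a directive.

```shell
#!/bin/sh
# Job script sketch with embedded qsub directives.
# The shell treats every line starting with "#" as a comment,
# while SGE reads the lines starting with "#$" as qsub options.

# Submit to the all.q queue.
#$ -q all.q
# Hypothetical job name.
#$ -N directive_demo
# Run from the directory the job was submitted from.
#$ -cwd

message="Working directory: $(pwd)"
echo "$message"
```

Because the directives are ordinary comments to the shell, the same file can also be run by hand to check the commands before submitting it.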

...

The following are some common options for the qsub command.   For information on other options for more complex submissions, check the man pages with the command "man qsub".

qsub option            Description
-V                     Imports your current environment into the job. This is set by default, so it does not have to be specified, but it is good to know about.
-N [name]              The name of the job. Choose a name that makes sense to you.
-l h_rt=hr:min:sec     Maximum walltime for this job. You may want to specify this if you think your job may run out of control.
-l h_vmem=bytes        The value is in bytes; you can specify a unit as well, for example -l h_vmem=2G. An appropriate value will be set for your job if an entire node is not requested.
-r [y,n]               Whether this job should be re-runnable (default n).
-cwd                   Run the job from the current working directory. If not specified, the job will be run from the user's home directory.
-S                     Specify the shell to use when running the job script.
-e [path]              Name of a file or directory for standard error.
-o [path]              Name of a file or directory for standard output.
-j [y,n]               Merge the standard error stream into the standard output stream.
-pe [name] [n]         Parallel environment name and number of slots (cores).
-M email address       Set the email address to receive email about jobs. This will most likely be your University of Iowa email address.
-m b|e|a|s|n,...       Specify when to send an email message:
                       'b'     Mail is sent at the beginning of the job.
                       'e'     Mail is sent at the end of the job.
                       'a'     Mail is sent when the job is aborted.
                       's'     Mail is sent when the job is suspended.
                       'n'     No mail is sent.

Info

The "mail when job is suspended" option does not currently work.
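Several of the options above can be combined as directives in one job script. The following is a sketch only: the job name, time limit, memory value, and email address are hypothetical, and the fallback values exist solely so the script also runs outside SGE, where JOB_NAME and JOB_ID are not set.

```shell
#!/bin/sh
# Job script sketch combining common qsub options as directives.
# All values below are illustrative, not site recommendations.

#$ -N options_demo
#$ -l h_rt=01:00:00
#$ -l h_vmem=2G
#$ -cwd
#$ -S /bin/sh
#$ -j y
#$ -M user@example.com
#$ -m be

# SGE exports JOB_NAME and JOB_ID to the running job.
# The defaults after ":-" are used only when the script runs outside SGE.
job_label="${JOB_NAME:-options_demo}.${JOB_ID:-0}"
echo "Starting ${job_label}"
```

With "-j y" the error stream is merged into the output stream, so a single output file collects everything the job prints.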

...

Free Memory request

qsub -l mf=nG

or

qsub -l mf=nM

Info

Here n is the number of gigabytes or megabytes of memory you expect to use, respectively.

Info

The mem_free request is only applicable at scheduling time. It is not a limit.

Output Redirection

It is often necessary or desirable to redirect the standard output and standard error streams to files. This can be done with the usual shell redirection, but SGE also provides a mechanism for capturing the stdout and stderr streams. By default, standard error is written to a file named $JOB_NAME.e$JOB_ID and standard output is written to a file named $JOB_NAME.o$JOB_ID. These can be changed via the -e and -o flags to qsub, respectively.

...

Note

Shell redirection, i.e.

command > stdout.file

will override the 'qsub -o' setting.
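The difference between the two mechanisms can be sketched in one job script. The job name and the file names redirect_demo.out and redirect_demo.err are hypothetical, and the echo commands stand in for a real program.

```shell
#!/bin/sh
# Job script sketch contrasting SGE output capture with shell redirection.

#$ -N redirect_demo
#$ -cwd
#$ -o redirect_demo.out
#$ -e redirect_demo.err

# Under SGE, this line is captured in the file named by -o.
echo "captured in the SGE stdout file"

# Under SGE, this line is captured in the file named by -e.
echo "captured in the SGE stderr file" >&2

# Shell redirection takes precedence for this command:
# the text lands in stdout.file, not in the -o file.
echo "written by shell redirection" > stdout.file
```

Only the redirected command bypasses the -o file; output from the other commands is still collected by SGE.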

Array Jobs

...

Each job in the array inherits the same resource requests and attribute allocations as if it had been entered independently as a batch job. The jobs run concurrently, provided enough resources are available. Array jobs are created by adding the following option to the qsub command (or the #$ directive in the job script):

...
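As a sketch of how an array job script might use its task index, the example below assumes the standard qsub "-t" range option and the SGE_TASK_ID environment variable described in the qsub manual page. The range 1-10, the job name, and the input file naming scheme are hypothetical.

```shell
#!/bin/sh
# Array job script sketch. Assumes qsub's -t option and the
# SGE_TASK_ID variable as documented in "man qsub".

#$ -N array_demo
#$ -cwd
#$ -t 1-10

# SGE sets SGE_TASK_ID to the index of the current task.
# The default of 1 is used only when the script runs outside SGE.
task_id="${SGE_TASK_ID:-1}"

# Hypothetical naming scheme: one input file per task.
input_file="input.${task_id}.dat"
echo "Task ${task_id} would process ${input_file}"
```

Each task runs the same script, so the task index is typically the only thing that distinguishes one task's input from another's.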