Basic Job Submission


The Helium cluster uses the Sun Grid Engine (SGE) queue scheduler. Job submission is the scheduler feature that users interact with most. The SGE manual pages are very good and should be consulted for details; for this topic, the qsub manual page is the authoritative source.

No Format
man qsub

This document gives a brief introduction to the most common options for submitting jobs to SGE. It focuses on single-processor jobs, which are the most basic case though not necessarily the most common. Submission of parallel jobs is covered in Advanced Job Submission.

Job Script

While it is possible to submit programs directly to the SGE system (using the -b option flag) it is generally preferable to create a job script. The job script is just like any other script that you write in terms of specifying commands that you would like to run. The difference is that instead of a user running the script, it is submitted to SGE and SGE then runs the script. It does so after determining what nodes are available for it to run on. In its simplest form, the job script is simply a call to the program that performs the calculation.

Code Block

language	bash
title	Example job script, myscript.job

#!/bin/sh

# This is a very simple example job script

/Users/jdoe/my_program


Submitting the job

To run the above program on the cluster it must first be submitted to SGE. This is done with the qsub command.

No Format
qsub myscript.job

That submits the job with all of the default options. A parallel job also needs a parallel environment to be specified. This is covered in more detail in Advanced Job Submission, but an example would look like

No Format
qsub -pe smp1 12 myscript.job

The default queue is the UI queue, which limits each user to 25 running jobs. If you have many single-processor jobs to run, it is better to submit them to the all.q queue, which has no such limit but is subordinate to the other queues.

No Format
qsub -q all.q myscript.job

An alternative to specifying a qsub option on the command line is to put it in the job script. This is done with the special prefix string "#$", which signifies that a qsub directive follows.

Code Block

language	bash
title	Job script example with qsub directive

#!/bin/sh

# This is an example script showing how to specify qsub options

#$ -q all.q


The "#$ -q all.q" line will look like a comment to the shell interpreter but will be passed as a qsub directive when the job script is submitted to SGE. Any of the qsub options can be specified this way.
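The following sketch, which is not part of SGE itself, illustrates the point without submitting anything. It writes a small job script to a temporary file, counts the lines that qsub would read as directives, and then runs the script with plain sh to show that the shell skips those lines as comments. The job name demo_job and the script contents are hypothetical.

```shell
#!/bin/sh
# Write a small, hypothetical job script to a temporary file.
job=$(mktemp)
cat > "$job" <<'EOF'
#!/bin/sh
#$ -q all.q
#$ -N demo_job
echo "running"
EOF

# Lines beginning with "#$" are the ones qsub would treat as directives.
grep '^#\$' "$job"
count=$(grep -c '^#\$' "$job")

# To the shell they are ordinary comments, so only the echo is executed.
output=$(sh "$job")
echo "$output"

rm -f "$job"
```

Running the sketch prints the two directive lines followed by the word "running".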

Info
If an option is specified on the command line it will supersede the specification in the job script file.

Commonly Used Options

The following are some common options for the qsub command. For information on other options for more complex submissions, check the man pages with the command "man qsub".

qsub option	Description
-V	Imports your current environment into the job. This is set by default, so it does not have to be specified, but it is good to know about.
-N [name]	The name of the job. Choose a name that is meaningful to you.
-l h_rt=hr:min:sec	Maximum wall time for the job. You may want to specify this if you think your job may run out of control.
-r [y,n]	Whether the job is re-runnable (default n).
-pe [name] [n]	Parallel environment name and the number of slots (cores) to request.
-cwd	Run the job from the current working directory. If not specified, the job runs from the user's home directory.
-S [shell]	The shell to use when running the job script.
-e [path]	Name of a file or directory for standard error.
-o [path]	Name of a file or directory for standard output.
-j [y,n]	Merge the standard error stream into the standard output stream.
-M [email address]	The email address that receives mail about the job. This will most likely be your University of Iowa email address.
-m b|e|a|s|n,...	When to send an email message:
	'b' Mail is sent at the beginning of the job.
	'e' Mail is sent at the end of the job.
	'a' Mail is sent when the job is aborted.
	's' Mail is sent when the job is suspended.
	'n' No mail is sent.

Info
The "mail when job is suspended" option does not currently work.
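As an illustration of how several of these options combine, here is a minimal sketch of a job script that sets them as "#$" directives. The job name, log file name, time limit, and email address are hypothetical placeholders, and the final lines stand in for a real workload.

```shell
#!/bin/sh
# Hypothetical example: every value below is a placeholder to adapt.
#$ -N demo_job
#$ -q all.q
#$ -cwd
#$ -j y
#$ -o demo_job.log
#$ -l h_rt=01:00:00
#$ -m e
#$ -M jdoe@example.edu

# Placeholder workload: report the directory the job is running in.
workdir=$(pwd)
echo "Running in $workdir"
```

Because the directives are comments to the shell, the script can also be run directly with sh as a quick check before it is submitted with qsub.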

Memory request

If you need a certain amount of memory to be available for your computation to start, you can request that with a resource request.

To request a particular quantity of memory, use the option:

Panel

title	Free Memory request

qsub -l mf=nG

or

qsub -l mf=nM

Info
Where n is the number of gigabytes or megabytes of memory you expect to use, respectively.

Output Redirection

It is often necessary or desirable to redirect the standard output and standard error streams to files. This can be done with the usual shell redirection, but SGE also provides a mechanism for capturing the stdout and stderr streams. By default, standard error is written to a file named $JOB_NAME.e$JOB_ID and standard output is written to a file named $JOB_NAME.o$JOB_ID. These can be set via the -e and -o flags to qsub, respectively.

Info
The standard error stream can be merged into the standard output stream with the '-j y' option of qsub.

The standard error and output streams are appended to the respective files so reusing a file will not destroy the previous contents. Another handy way to use this facility is to specify a directory for the target standard error and output streams. This will create the error and output files with the default filenames in the specified directory.
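To make the naming concrete, the sketch below builds the default file names by hand. It assumes a job submitted from myscript.job with no -N option, so that the job name is the script name, and it uses a made-up job ID of 12345 and a made-up target directory called logs. Under SGE these names are generated for you; nothing here needs to go in a real job script.

```shell
#!/bin/sh
# Hypothetical values; SGE supplies the real ones for each job.
JOB_NAME=myscript.job   # assumed: job name defaults to the script name
JOB_ID=12345            # made-up job ID

# Default file names for standard output and standard error.
stdout_file="$JOB_NAME.o$JOB_ID"
stderr_file="$JOB_NAME.e$JOB_ID"

# With a directory target such as "-o logs/", the default name is kept
# and the file is placed inside that directory.
out_dir=logs
stdout_in_dir="$out_dir/$stdout_file"

echo "$stdout_file"
echo "$stderr_file"
echo "$stdout_in_dir"
```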

Note

Shell redirection, e.g.

command > stdout.file

will override the 'qsub -o' setting.
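One practical consequence is that a plain '>' redirection truncates its target on every run, whereas the SGE-managed output files are appended to. The sketch below runs in any POSIX shell without SGE and contrasts '>' with '>>'; the temporary file exists only for the illustration.

```shell
#!/bin/sh
out=$(mktemp)

# '>' truncates: the second write replaces the first.
echo "first run" > "$out"
echo "second run" > "$out"
truncated=$(cat "$out")

# '>>' appends: both lines are kept.
echo "first run" > "$out"
echo "second run" >> "$out"
appended_lines=$(grep -c '' "$out")

echo "after '>': $truncated"
echo "lines after '>>': $appended_lines"
rm -f "$out"
```

If a job script redirects output itself and the history of earlier runs matters, '>>' gives behavior closer to what SGE does with its own files.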

Array Jobs

An array job is a group of identical tasks that are differentiated only by an index number.

Each task in the array inherits the same resource requests and attribute allocations as if it had been submitted independently as a batch job. The tasks run concurrently, provided enough resources are available. Array jobs are created by adding the following option to the qsub command (or as a #$ directive in the job script):

No Format
-t n[-m[:s]]

Here n is the lowest index number, m is the highest index number, and s is the step size. m and s are optional, so you can enter a single number (n), a simple range (n-m), or a range with a step size (n-m:s).
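As a quick check of which index numbers a given range produces, the following sketch enumerates them in plain shell. It does not call SGE; the values of n, m, and s are example inputs corresponding to "-t 2-8:2".

```shell
#!/bin/sh
# Example inputs for "-t n-m:s"; change these to try other ranges.
n=2
m=8
s=2

# Walk from n to m in steps of s, collecting each index.
ids=""
i=$n
while [ "$i" -le "$m" ]; do
    ids="$ids $i"
    i=$((i + s))
done
ids=${ids# }   # drop the leading space

echo "$ids"
```

For these inputs the sketch prints 2 4 6 8, so the array would contain four tasks.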

The index number is available within the script as the variable SGE_TASK_ID. For this to be useful, something must map to the index number. For example, say there is a set of files named input1, input2, input3, input4, input5, input6, input7, input8 and render.job contains the following:

Code Block

language	bash
title	Example referencing task ID

#!/bin/sh

# Example showing how task arrays work

render input$SGE_TASK_ID


Each task renders the input file that corresponds to its task ID. So, for example, if you wanted to run the script render.job 4 times, processing the files input2, input4, input6, and input8, you would enter:

No Format
qsub -t 2-8:2 render.job

render.job would then be run 4 times, each with the default allocation of resources, and each with the input file named by the base name plus the index number.
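A local dry run can help confirm the mapping before anything is submitted. The sketch below mimics the four tasks by setting SGE_TASK_ID by hand and printing the command each task would execute. It only prints; it does not run render or contact SGE.

```shell
#!/bin/sh
# Dry run for "qsub -t 2-8:2 render.job": SGE would set SGE_TASK_ID
# for each task; here it is set manually for illustration.
commands=""
for SGE_TASK_ID in 2 4 6 8; do
    line="render input$SGE_TASK_ID"
    echo "$line"
    commands="$commands$line;"
done
```

Each printed line corresponds to one task in the array, for example "render input2" for the task with index 2.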
