Matlab is currently available centrally on Helium, the University of Iowa's HPC cluster system. To use the Matlab environment, load the appropriate module:
[user@helium:~]$ module load matlab_R2012b
While it is possible to run Matlab interactively, which can be especially useful for prototyping a job, please remember that login nodes are a shared resource intended for launching jobs or prototyping smaller versions of the jobs you intend to run. It is not advisable to run long, compute-intensive jobs interactively unless you do so in a qlogin session.
Batch Matlab Jobs
A batch or serial job is generally run on a single node. The following is a very simple example of a non-parallel batch job using Matlab functions.
Info |
---|
Matlab's Parallel Toolbox does not work well if you set a shell preference in your SGE request. If you otherwise use a #$ -S <shell> designation in your SGE qsub scripts, it is best to remove it or comment it out when submitting parallel Matlab jobs. |
Info |
---|
This example demonstrates the -nojvm option, which disables some of Matlab's features in order to start faster and use less memory. This helps maintain the cluster's performance if you have a large batch of jobs that do not require the disabled features. |
Code Block | ||||
---|---|---|---|---|
|
#!/bin/bash
# The name of the job:
#$ -N MatlabTest
# Name of the output log file:
#$ -o matjob.log
# Combining output/error messages into one file:
#$ -j y
# Specifying the queue:
#$ -q UI
# Tell the queue system to use the current directory as the working directory:
#$ -cwd
# The command(s) to be executed:
matlab -nodisplay -nodesktop -nojvm -r batch
# Note: the name after -r is the name of the routine or function to run
exit 0
Here is the Matlab function you are calling from within your job script:
Code Block | ||||
---|---|---|---|---|
|
A = fix(100*rand(5,6));
B = fix(100*rand(6,3));
C = A * B
This will produce the following output in the specified log file (matjob.log):
No Format |
---|
< M A T L A B (R) >
Copyright 1984-2012 The MathWorks, Inc.
R2012b (8.0.0.783) 64-bit (glnxa64)
August 22, 2012
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
C =
12893 12980 15469
15263 19329 17603
7885 16894 17448
14434 20490 27329
9235 18735 22408 |
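The batch example above might be submitted and checked roughly as follows. This is only a sketch: the file name matlab-batch.sh is a hypothetical name for the job script (the original does not name it), and submit_job is an illustrative helper, not part of SGE. The qsub and qstat commands are the standard SGE tools.

```shell
#!/bin/bash
# Sketch: submit the batch job script and check on it.
# "matlab-batch.sh" is a hypothetical file name for the job script shown above.
job_script="matlab-batch.sh"
log_file="matjob.log"   # matches the "#$ -o" line in the job script

# Illustrative helper (not an SGE command): submit a script and list your jobs.
submit_job() {
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found; run this on a cluster login node" >&2
        return 1
    fi
    qsub "$1"            # SGE prints the ID assigned to the job
    qstat -u "$USER"     # list your pending and running jobs
}

submit_job "$job_script" || true

# Once the job has finished, the Matlab output is in the log file.
if [ -f "$log_file" ]; then
    cat "$log_file"
fi
```

Because the job script uses -cwd, the log file appears in the directory from which the job was submitted.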
Parallel Matlab Jobs (or "pool jobs")
The current installation of Matlab on Helium uses the Parallel Toolbox, which allows parallel jobs that can use up to 12 cores per job. Information on the Parallel Toolbox may be found here: http://www.mathworks.com/help/distcomp/index.html. The following example script can be used to submit a Matlab job to SGE on Helium to run on 8 cores (you may use all the cores in a node, which may be more than 8). Note that lines starting with '#$' are SGE directives, other lines starting with '#' are comments, and the remaining lines are shell commands, here the command that launches Matlab.
There is some overhead to running jobs in parallel, so a small job without sufficiently large loops can run slower on multiple cores than on one. The easiest way to make a Matlab program run in parallel is to use parfor (parallel 'for') loops. A parfor loop can be used when each iteration of the loop is independent of all other iterations. Here is a link to Matlab documentation on parfor loops: http://www.mathworks.com/help/distcomp/getting-started-with-parfor.html
Submit the job using 'qsub parallel-test.sh', assuming the submission script is named parallel-test.sh.
Info |
---|
Matlab's Parallel Toolbox requires Java, so if your code uses features of the toolbox, you must omit the -nojvm option when invoking matlab. Otherwise your code will quit with an uninformative error. You can verify that Java is available, and deliberately quit with a helpful error message if it is not, by adding this command before code that uses a feature requiring Java: error(javachk('jvm')) |
Code Block | ||||
---|---|---|---|---|
|
#!/bin/bash
# The name of the job (replace 'test' with your job name):
#$ -N test
# Name of the output log file:
#$ -o matlabTest_parfor.log
# Combining output/error messages into one file (change y to n for separate files):
#$ -j y
# Specify the parallel environment (PE) and number of cores to use (8):
#$ -pe smp 8
# Tell the queue system to use the current directory as the working directory:
#$ -cwd
# The Matlab command to be executed; replace "parfor_test" with your function name.
# -r immediately runs a function without presenting an interactive prompt.
# The name after -r must be a valid Matlab identifier, so it cannot contain hyphens.
/opt/matlab/R2012b/bin/matlab -nodisplay -nodesktop -nosplash -r parfor_test
Within the Matlab script itself, specify the number of cores you wish to use, which should match the number requested with '#$ -pe smp' in the submission script. Do this by calling 'matlabpool('open', #)' at the beginning of the file, where # is the number of cores, and 'matlabpool('close')' at the end. If you only want to use one core, you can omit the 'matlabpool' commands from your file.
Code Block | ||||
---|---|---|---|---|
|
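% Open a pool of 8 workers (match the '#$ -pe smp' request)
matlabpool('open',8);
% Start a timer
start = tic;
% Preallocate the output so each iteration fills its own element
A = zeros(1,100000);
parfor i = 1:100000
    A(i) = i;
end
% Print the elapsed time in seconds
stop = toc(start)
matlabpool('close');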
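One way to keep the core count in the submission script and the pool size in Matlab from drifting apart is to pass the count in from the job script. The sketch below assumes that SGE sets the NSLOTS environment variable to the number of slots granted for a parallel-environment job, and that the Matlab script (here the hypothetical name parfor_test) does not call matlabpool itself. build_matlab_cmd is an illustrative helper, not part of Matlab or SGE.

```shell
#!/bin/bash
#$ -pe smp 8
#$ -cwd

# build_matlab_cmd SCRIPT [CORES]
# Illustrative helper: prints a Matlab command string that opens a pool,
# runs SCRIPT, closes the pool, and exits.
# CORES defaults to NSLOTS (set by SGE for parallel jobs), or 1 outside SGE.
build_matlab_cmd() {
    local script="$1"
    local cores="${2:-${NSLOTS:-1}}"
    printf "matlabpool('open', %s); %s; matlabpool('close'); exit" "$cores" "$script"
}

# "parfor_test" is a hypothetical script name; replace it with your own.
matlab_cmd="$(build_matlab_cmd parfor_test)"
echo "Running: $matlab_cmd"

# Only launch Matlab if it is on the PATH (for example after 'module load').
if command -v matlab >/dev/null 2>&1; then
    matlab -nodisplay -nodesktop -nosplash -r "$matlab_cmd"
fi
```

With this arrangement, changing the number after '-pe smp' is the only edit needed to run on a different number of cores.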
Matlab engine for Python
To install Matlab engine for Python in order to invoke Matlab using matlab.engine from your Python code, see the notes specific to using Python.