Job Submission
Wooki is a shared resource that is essential to the research projects of all group members. To help ensure Wooki's smooth operation, please read the following guidelines.
Most Important Rules and Guidelines
- Always use the queuing system for running jobs.
- Use your scratch directory when running jobs.
- Don't store large restart files or intermediate results in your home directory; they waste resources during backups.
- If you are using a standard package, such as VASP, Gaussian or NAMD, check the module help for specific information and use the submit scripts that have been written to submit jobs to the queue. There are submit scripts for each of the most commonly used code packages.
- Don’t run long background or interactive jobs on the frontend. The frontend is reserved for compiling, visualization, data analysis and job setup. If you need more than 15 minutes of CPU time, write a script to submit your job to the queue.
- Set the estimated job time so the queue can schedule effectively.
The Queuing System on Wooki
The queuing system controls the scheduling, distribution and execution of jobs on the cluster; on Wooki, all compute jobs should be run through it. Rocks is fully integrated with the Grid Engine queue system, previously known as Sun Grid Engine and Oracle Grid Engine; for historical reasons this document refers to it as SGE.
Basic Commands for Viewing the Status of the Queue
All commands are documented: add the -h option (e.g. qstat -h) or use the man command (e.g. man qstat) for help.
qstat
Displays the current status of your running and waiting jobs on the queue.
- qstat -u \* shows jobs for all users in priority order.
- qstat -f shows the nodes that each job is running on.
- qstat -g c shows the number of nodes free in the system.
We are currently developing some customised scripts to display usage on the cluster in a friendlier manner. For full information, or to request functionality, see the development page: https://bitbucket.org/tdaff/wsge/ where you can also open issues for feature requests. Current commands include:
- wstat gives a more user-friendly list of all jobs on the cluster;
- wstat -u username only shows username's jobs.
- cluster_stat shows a comprehensive overview of jobs and users on the cluster.
Submitting Jobs on Wooki
For most codes that come as a module there will be a $code-submit script that takes care of building an SGE script and submitting the job, for example vasp-submit or gaussian-submit. Please use these if they are available. These scripts ensure the proper usage of the local scratch disks and, for parallel jobs, they set up the parallel execution environments and message passing systems.
The submit scripts are located with their modules in /share/apps, but once you load the module they are available as commands in your shell.
If a commonly used program is missing a submit script, you may open a ticket to ask for one to be written.
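As an illustration of what these wrappers do, here is a minimal, hypothetical sketch of a submit script that assembles an SGE script from a job name and CPU count. The variable names and the exact directives are illustrative assumptions only; the real wrappers also handle scratch disks and the message passing setup.

```shell
#!/bin/bash
# Hypothetical sketch of a $code-submit wrapper: build an SGE script
# from the job name and CPU count, then hand it to qsub.
JOB_NAME=${1:-myjob}
NUM_CPU=${2:-1}

make_sge_script() {
    cat <<EOF
#!/bin/bash
#$ -cwd
#$ -V
#$ -j y
#$ -N ${JOB_NAME}
#$ -pe ppn12_* ${NUM_CPU}
mpirun -np \$NSLOTS /path/to/code
EOF
}

# On the cluster this would be: make_sge_script | qsub
# Here we just print the generated script (like a --debug option).
make_sge_script
```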
Basic Usage of Submit Scripts
To show the basic usage and behaviour of the submit scripts we will use the example of submitting a vasp job. Most submit scripts operate in a very similar way.
$ module help vasp

----------- Module Specific Help for 'vasp/5.3.5' -----------------

Provides VASP 5.3.5

Use `module load vasp` to load the package then
use the command `vasp-submit` to run jobs through
the queue.
$ module load vasp
$ vasp-submit --help
usage: vasp-submit [-h] [-i] [-r RUNTIME] [-t] [-g] [-d] job_name [num_cpu]

Submit a vasp job.

positional arguments:
  job_name              vasp job name
  num_cpu               number of processors to run on (default: 1)

optional arguments:
  -h, --help            show this help message and exit
  -i, --infiniband      run on infiniband for large parallel jobs
  -r RUNTIME, --runtime RUNTIME
                        maximum length of time for the job to run, specify as
                        number of seconds or hh:mm:ss
  -t, --threaded        keep the processes in a single node
  -g, --gamma           use the gamma point vasp executable
  -d, --debug           print script rather than submit

$ vasp-submit -g -r 24:00:00 surface 16
- This submits a gamma point version vasp job on 16 CPUs to run for no more than a day. Be aware that some scripts require the input file name, rather than just a job name.
- You can run the command module avail to see a list of all installed packages.
- Running module help ... will give you any other important instructions for that software.
- The --help option of the submission script will describe any extra requirements for those jobs.
- It is recommended that the running time be specified for all jobs so that they can be scheduled effectively. Be aware that jobs are killed after this time.
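Since the --runtime option accepts either plain seconds or hh:mm:ss, a submit script has to normalise the value before setting the h_rt limit. A minimal sketch of how that conversion might look; the function name is an assumption, not part of the real scripts:

```shell
# Hypothetical sketch: normalise a --runtime value to seconds for h_rt.
# Accepts either a plain number of seconds or the hh:mm:ss form.
runtime_to_seconds() {
    case "$1" in
        *:*:*)
            # hh:mm:ss form; 10# forces base-10 so leading zeros are safe
            IFS=: read -r h m s <<< "$1"
            echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
            ;;
        *)
            # already a number of seconds
            echo "$1"
            ;;
    esac
}

runtime_to_seconds 24:00:00   # prints 86400
runtime_to_seconds 3600       # prints 3600
```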
Submitting jobs for other software packages will be very similar, but there may be some subtle differences. For example, gaussian-submit takes the input filename as an argument and will check that your input file requests the correct number of CPUs.
$ module help gaussian

----------- Module Specific Help for 'gaussian/g09_D.01' ----------

Provides Gaussian g09_D.01 for Intel processors

To load the Gaussian package, use this command:
> module load gaussian
To run jobs through the queue, use this command:
> gaussian-submit input_file #CPU

$ module load gaussian
$ gaussian-submit --help
usage: gaussian-submit [-h] [-r RUNTIME] [-d] input_file [num_cpu]

Submit a gaussian job.

positional arguments:
  input_file            gaussian input file name
  num_cpu               number of processors to run on (default: 1)

optional arguments:
  -h, --help            show this help message and exit
  -r RUNTIME, --runtime RUNTIME
                        maximum length of time for the job to run, specify as
                        number of seconds or hh:mm:ss
  -d, --debug           print script rather than submit

$ gaussian-submit benzene.gjf 2
Input file requests too many CPUs (6), not submitting
$ gaussian-submit benzene.gjf 6
Output Files and Log Files
The output and/or log files are written in the directory where the job was submitted, named "jobname.out" or "jobname.log". For example, if the input file submitted was called "pd_test.in", the output file will be "pd_test.out". These files can be viewed from the headnode during job execution to examine the progress of the simulation.
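For example, to check the progress of a running job from the headnode (the log file is faked here so that the commands are self-contained):

```shell
# Create a stand-in for a job's log file; on the cluster this file
# would be written by the running job itself.
printf 'step 1\nstep 2\nstep 3\n' > pd_test.out

tail -n 2 pd_test.out      # show the most recent lines
# tail -f pd_test.out      # follow the file as the job writes it (Ctrl-C to stop)
grep -c step pd_test.out   # quick count of completed steps
```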
Terminating a Job on the Queue
Use the qdel command to terminate running or waiting jobs on the queue. For example:
$ qstat
job-ID   prior    name        user   state  submit/start at      queue                     slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
1200305  0.62499  vasp.N2_SI  tdaff  r      09/30/2013 13:37:50  all.q@compute-7-26.local  18
1207188  0.62499  gulp.C1Fre  tdaff  r      10/02/2013 22:42:22  all.q@compute-7-17.local  1
1209015  0.62499  fastmc.C3F  tdaff  r      10/02/2013 23:18:22  all.q@compute-1-4.local   1

$ qdel 1200305
When a job is terminated in this fashion, it is not a 'clean' termination and the job's working directory is left behind in the scratch directory on the execution node.
Occasionally, a job will not leave the queue even with repeated qdel commands issued. If a job is stubborn, you can use qdel -f to force the deletion (this takes a couple of minutes). If that does not work, one of the nodes that the job was running on has likely crashed. Please open a ticket to report the problem.
Location of Restart Files After Job Completion
By default, upon job completion the script will copy the contents of the execution directory back to the submission directory. If this has not occurred, there will be a file where_is_my_job containing the commands to take you to the files:
$ ls
input.in  output.out  where_is_my_job
$ cat where_is_my_job
ssh compute-5-12
cd /state/partition1/tdaff/1209015
$ ssh compute-5-12
$ cd /state/partition1/tdaff/1209015
$ ls
input.in  restart.file  intermediate.bin
Location of the Execution Directory
Jobs will usually run from the directory they are submitted from (known as $SGE_O_WORKDIR). Running jobs from the scratch drives places less load on the frontend drives, keeping the frontend responsive for interactive users. If you have trouble identifying your jobs, you can see extended information with the qstat -j command:
$ qstat -j 3233675
==============================================================
job_number:        3233675
exec_file:         job_scripts/3233675
submission_time:   Fri Apr 11 11:44:45 2014
...
sge_o_shell:       /bin/bash
sge_o_workdir:     /share/scratch/tdaff/simulations/production/structure_17
sge_o_host:        wooki
...
For jobs with heavy file operations, the job should be set up to run from the local drive of the node it runs on. If your code has a well-made submission script, a file called where_is_my_job will be created giving the commands to find the directory the job is running from.
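A hypothetical sketch of how such a script might write the where_is_my_job breadcrumb before switching to a node-local directory; the paths and variable names are illustrative, not Wooki's actual script:

```shell
# Hypothetical sketch: record where the job's files live before running
# in a node-local scratch directory. On the cluster the scratch path
# would be something like /state/partition1/$USER/$JOB_ID; here a
# temporary directory keeps the example self-contained.
NODE=$(hostname)
SCRATCH=$(mktemp -d)

cat > where_is_my_job <<EOF
ssh $NODE
cd $SCRATCH
EOF

cat where_is_my_job   # the commands a user runs to reach the job's files
```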
SGE Scripts
The script generated by the submission command has the elements detailed below:
#!/bin/bash

### Execute where the job was submitted
#$ -cwd
### Carry over current environment
#$ -V
### Put output and error in the same file
#$ -j y
### Name the output
#$ -o jobname.stdout
### Hard limit on wallclock time
#$ -l h_rt=168:00:00
### Make a reservation of nodes
#$ -R y
### Job name in the qstat list
#$ -N jobname
### Choose an appropriate parallel environment
#$ -pe ppn12_* 48

export OMP_NUM_THREADS=1

ulimit -s unlimited

echo "Job ID $JOB_ID"

echo "Running on hosts: "

cat $PE_HOSTFILE

mpirun -np $NSLOTS /path/to/my/exe >> job.out
Parallel environments
Parallel environments define the type of nodes that your job will run on. They are grouped so that jobs do not span different node types or switches, and you can select which to use with wildcards (like '*') in your job submission script. A set of environments with fixed numbers of processes per node are defined:
- ppn4
- ppn6
- ppn8
- ppn12
with different node types:
- opteron-8220
- xeon-ibm
- gpu-m2075
- xeon-e5645
- xeon-x5650
In most cases it is desirable to reduce spanning: use a high ppn and let the scheduler find the nodes for you. For example, to request 16 cores with 8 processes per node, include #$ -pe ppn8_* 16 in your submission script; to run only on Xeons with any number of processes per node, use #$ -pe ppn*_xeon* 16. There are a few special cases:
- threaded: Make all the processes run on the same node. The largest node has 16 cores; most have 12 or fewer. This will give the best performance if you can use it.
- ppn4_infiniband*: Use the Infiniband cluster; these nodes only have 4 cores but faster interconnects for better scaling of larger jobs.
- orte: Spread processes all over the cluster. Not recommended!
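The wildcard selection behaves like ordinary shell glob matching against the environment names. A small self-contained demonstration, using illustrative environment names of the ppnN_nodetype form:

```shell
# Demonstrate how PE wildcards select environments: shell glob matching
# of a pattern against an environment name. The names below are
# illustrative examples, not a list of real environments.
matches() {
    # usage: matches pattern name -> prints yes or no
    case "$2" in
        $1) echo yes ;;
        *)  echo no ;;
    esac
}

matches 'ppn8_*'     ppn8_xeon-e5645    # yes: any 8-per-node environment
matches 'ppn*_xeon*' ppn12_xeon-x5650   # yes: any Xeon environment
matches 'ppn*_xeon*' ppn4_opteron-8220  # no: Opterons are excluded
```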
Requesting Memory
Sometimes you need to run a job on a node with a large amount of memory. Say your job requires 24 GB of memory, but the basic qsub command keeps sending it to nodes that return memory allocation errors. If you use qsub -l mem_free=24G, your job will be sent to a node with at least 24 GB of free memory.