2.6 Using The Portable Batch System (PBS)
The PBS architecture consists of three major components:
- The PBS server. This runs on the OSCAR head node. It controls
the submission and running of jobs.
- The Maui scheduler. This takes care of scheduling jobs across
the cluster according to sophisticated algorithms.
- A ``mom'' daemon on each cluster node. The moms are responsible
for actually starting and stopping jobs on the client nodes.
All PBS commands can be found under /usr/local/pbs/bin on the
OSCAR head node. There are man pages available for these commands,
but here are the most commonly used ones, with some basic options:
- qsub: submits a job to PBS
- qdel: deletes a PBS job
- qstat [-n]: displays current job status and node
associations
- pbsnodes [-a]: displays node status
- pbsdsh: distributed process launcher
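As a minimal sketch of how these commands might be used together, the following script adds the PBS directory named above to the search path and then asks for job and node status. The helper name have_pbs is made up for this illustration, and the fallback message is only there so the script behaves sensibly on a machine without PBS.

```shell
#!/bin/sh
# Assumption: PBS commands are installed under /usr/local/pbs/bin,
# as stated in the text above.
PBS_BIN=/usr/local/pbs/bin
PATH="$PATH:$PBS_BIN"
export PATH

# have_pbs (hypothetical helper): succeed only if the named command
# can be found on the search path.
have_pbs() {
    command -v "$1" >/dev/null 2>&1
}

if have_pbs qstat && have_pbs pbsnodes; then
    qstat -n       # current job status with node associations
    pbsnodes -a    # status of all nodes
else
    echo "PBS commands not found; run this on the OSCAR head node"
fi
```

Adding the directory to PATH in your shell startup file avoids having to type the full path for every command.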
The qsub command is not necessarily intuitive. Here are some
handy tips to know:
- Be sure to read the qsub man page.
- qsub only accepts a script filename for a target.
- The target script cannot take any command line arguments.
- For parallel jobs, the target script is only launched on ONE
node. Therefore the script is responsible for launching all
processes in the parallel job.
- One method of launching processes is to use the pbsdsh
command within the script used as qsub's target.
pbsdsh will launch its target on all allocated processors and nodes
(specified as arguments to qsub). Other methods of parallel
launch exist, such as mpirun, included with each of the MPI packages.
- Job parameters can be specified to qsub on the command
line, or within the submitted script. You can get a good start by
looking at examples provided by the OSCAR test suite. Ask your
system administrator if you would like to see these. They can
likely be found in the home directory of the ``oscartst''
user.
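To illustrate the mpirun alternative mentioned in the tips above, here is a sketch of a target script that starts one process per allocated processor. It assumes that PBS sets the PBS_NODEFILE environment variable to a file listing one line per allocated processor, and that your mpirun accepts the MPICH-style -np and -machinefile options; other MPI packages may spell these differently, so check the mpirun man page. The helper name count_procs is made up for this example.

```shell
#!/bin/sh
# count_procs (hypothetical helper): print the number of non-empty
# lines in a node file, i.e. the number of allocated processors.
count_procs() {
    grep -c . "$1"
}

# PBS_NODEFILE is only set inside a running PBS job.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "${PBS_NODEFILE:-}" ]; then
    NP=`count_procs "$PBS_NODEFILE"`
    echo "Launching $NP processes"
    # Assumption: MPICH-style options; adjust for your MPI package.
    mpirun -np "$NP" -machinefile "$PBS_NODEFILE" /path/to/my_executable
else
    echo "PBS_NODEFILE is not set; submit this script with qsub"
fi
```

The script still runs on only one node; it is mpirun that starts the remaining processes on the other allocated nodes.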
Here is a sample qsub command line:
$ qsub -N my_jobname -e my_stderr.txt -o my_stdout.txt -q workq -l \
nodes=X:ppn=Y:all,walltime=1:00:00 my_script.sh
Here are the contents of the my_script.sh file:
#!/bin/sh
echo Launchnode is `hostname`
pbsdsh /path/to/my_executable
# All done
Alternatively, you can specify most of the qsub parameters in
script.sh itself:
$ qsub -l nodes=X:ppn=Y:all,walltime=1:00:00 script.sh
Here are the contents of the script.sh file:
#!/bin/sh
#PBS -N my_jobname
#PBS -o my_stdout.txt
#PBS -e my_stderr.txt
#PBS -q workq
echo Launchnode is `hostname`
pbsdsh /path/to/my_executable
# All done
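Since the target script cannot take command line arguments, one possible workaround is to pass values through the environment with qsub's -v option. The sketch below assumes a variable named INPUT_FILE and a default of default.dat; both are made up for this illustration.

```shell
#!/bin/sh
#PBS -N my_jobname
#PBS -q workq
# Assumption: the job was submitted with something like
#   qsub -v INPUT_FILE=/path/to/input -l nodes=X:ppn=Y script.sh
# INPUT_FILE is a hypothetical variable name chosen for this example.

# Fall back to a default when the variable was not passed in.
INPUT_FILE=${INPUT_FILE:-default.dat}
echo "Input file is $INPUT_FILE"
```

The default keeps the script usable when it is submitted without -v.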
Notes about the above examples:
- ``all'' is an optional specification of a node attribute,
or ``resource''.
- ``workq'' is a default queue name that is used in OSCAR
clusters.
- 1:00:00 is in HH:MM:SS format (although leading zeros are optional).
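If you think of run time in seconds, a small helper can produce the HH:MM:SS form used by the walltime parameter. This is only a sketch; the function name to_walltime is made up for this example.

```shell
#!/bin/sh
# to_walltime (hypothetical helper): convert a number of seconds
# to HH:MM:SS, with leading zeros.
to_walltime() {
    secs=$1
    printf '%02d:%02d:%02d\n' \
        $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}

to_walltime 3600    # prints 01:00:00
to_walltime 5400    # prints 01:30:00
```

The result can be used directly in a resource list, for example walltime=`to_walltime 5400`.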
root
2002-11-08