Every PBS script is just a shell script with special #PBS lines at the top. The scheduler reads those lines and decides where and how to run your job.
#!/bin/bash
#PBS -N MyJob
#PBS -q default
#PBS -l nodes=1:ppn=1
cd $PBS_O_WORKDIR
echo "Hello from the cluster!"
| Line | What it does | Can I skip it? |
|---|---|---|
| #!/bin/bash | Tells the system to use bash. Always the very first line. | ❌ No |
| #PBS -q default | Which queue to run in. If skipped, the scheduler picks one. | ⚠️ Not recommended |
| #PBS -l nodes=1:ppn=1 | How many nodes and CPU cores your job needs. | ⚠️ Not recommended |
| cd $PBS_O_WORKDIR | Goes to the folder where you ran qsub. Without this, your script won't find your files. | ❌ Always include this |
All #PBS lines must come before any real command. Any directive placed after a command is silently ignored by the scheduler — no error, it just won't work.
#!/bin/bash
#PBS -N MyJob ← ✅ this works
#PBS -q default ← ✅ this works
cd $PBS_O_WORKDIR ← first real command
#PBS -l nodes=1:ppn=4 ← ❌ too late, ignored!
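Because a misplaced directive fails silently, it can help to check a script before submitting it. The following is a minimal sketch, not a PBS tool: `check_pbs_order` is a hypothetical helper name, and it assumes that the first line that is neither blank nor a comment counts as the first real command.

```shell
# check_pbs_order FILE
# Prints every #PBS line that appears after the first real command
# and returns 1 if any were found (0 otherwise).
# NOTE: check_pbs_order is an illustrative helper, not part of PBS.
check_pbs_order() {
    awk '
        /^#PBS/ {                  # a directive line
            if (seen_cmd) {
                printf "line %d ignored: %s\n", NR, $0
                bad = 1
            }
            next
        }
        /^[ \t]*(#|$)/ { next }    # shebang, comments, blank lines
        { seen_cmd = 1 }           # anything else is a real command
        END { exit bad }
    ' "$1"
}
```

One way to use it is `check_pbs_order script.sh && qsub script.sh`, so that the job is only submitted when every directive is in a position where the scheduler will read it.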
Every possible PBS option is documented in the built-in manual. Run this on the login node:
man qsub
It lists every valid #PBS directive, all options, and environment variables available to your job.
A queue determines how long your job can run and how many cores you can use. Pick the right one — using a longer queue than you need wastes shared resources.
| Queue | Max Walltime | Max Cores per User | Use For |
|---|---|---|---|
| default | 8 hours | 200 cores | Quick jobs, testing, short calculations |
| short | 72 hours (3 days) | 100 cores | Medium length production jobs |
| long | 1080 hours (45 days) | 100 cores | Long running simulations |
| infinity | 4380 hours (~6 months) | 50 cores | Very long calculations; use sparingly |
#PBS -q default # quick tests
#PBS -q short # up to 3 days
#PBS -q long # up to 45 days
#PBS -q infinity # up to ~6 months
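To make the "shortest queue that fits" rule concrete, here is a small sketch that maps an estimated runtime in whole hours to a queue name. The limits are copied from the table above; `pick_queue` is a made-up helper name, and the live limits reported by the cluster take precedence if they differ.

```shell
# pick_queue HOURS
# Prints the shortest queue whose maximum walltime covers HOURS.
# Limits come from the queue table in this guide (assumed current).
pick_queue() {
    hours=$1
    if   [ "$hours" -le 8 ];    then echo default
    elif [ "$hours" -le 72 ];   then echo short
    elif [ "$hours" -le 1080 ]; then echo long
    elif [ "$hours" -le 4380 ]; then echo infinity
    else
        echo "no queue allows ${hours} hours" >&2
        return 1
    fi
}

pick_queue 2     # prints: default
pick_queue 48    # prints: short
```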
- Test in the default queue first before submitting long jobs.
- The default queue has the most cores available, which makes it a good fit for parallel work.
- Run qstat -q to see every queue and its limits.

Walltime is the maximum real-world time your job is allowed to run. When it runs out, your job is killed, even if it has not finished.
#PBS -l walltime=HH:MM:SS
# Examples:
#PBS -l walltime=00:30:00 # 30 minutes
#PBS -l walltime=02:00:00 # 2 hours
#PBS -l walltime=24:00:00 # 1 day
#PBS -l walltime=72:00:00 # 3 days (short queue max)
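If you think of runtimes in minutes, the HH:MM:SS form is easy to get wrong. A minimal sketch of the conversion follows; `to_walltime` is a hypothetical helper name, and it assumes a whole number of minutes.

```shell
# to_walltime MINUTES
# Prints MINUTES in the HH:MM:SS form that -l walltime= expects.
to_walltime() {
    minutes=$1
    printf '%02d:%02d:00\n' $((minutes / 60)) $((minutes % 60))
}

to_walltime 30      # prints: 00:30:00
to_walltime 90      # prints: 01:30:00
to_walltime 4320    # prints: 72:00:00
```

Hours are not capped at 24, which matches the examples above: three days is written as 72:00:00.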
#!/bin/bash
#PBS -N MyJob
#PBS -q default
#PBS -l nodes=1:ppn=1
#PBS -l walltime=02:00:00
cd $PBS_O_WORKDIR
echo "Start: $(date)"
# your work here
echo "End: $(date)"
| Situation | What happens |
|---|---|
| No walltime set | Scheduler uses the queue's default limit — you have no control |
| Walltime too short | Job gets killed before finishing — you lose all progress |
| Walltime too long | Job waits longer in the queue — others are prioritised |
| Walltime just right | Job runs, finishes, output is saved ✅ |
When your job runs, it produces two types of output:
| Type | What it is | Example |
|---|---|---|
| stdout (standard output) | Normal output from your program — results, progress messages, print statements | print("Hello"), echo "Done" |
| stderr (standard error) | Warning messages, errors, or diagnostic info — things that went wrong or need attention | FileNotFoundError, Warning: low memory |
By default, PBS saves both to the folder where you ran qsub. You can control whether they go to one file or two.
If you don't specify anything, PBS creates a file named <jobname>.o<jobid> containing both stdout and stderr merged together.
# You submit:
qsub myscript.sh
# → Job ID: 465118.iisermhpc1
# → Output file created: MyJob.o465118
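The default file name can be worked out from the job name and the job ID that qsub prints. The sketch below follows the example above, where only the numeric part of the ID ends up in the file name; `default_output_name` is an illustrative helper, not a PBS command.

```shell
# default_output_name JOBNAME JOBID
# Prints the auto-generated output file name, e.g. MyJob.o465118.
default_output_name() {
    jobname=$1
    jobid=$2
    # ${jobid%%.*} drops everything from the first dot onwards,
    # turning 465118.iisermhpc1 into 465118.
    echo "${jobname}.o${jobid%%.*}"
}

default_output_name MyJob 465118.iisermhpc1    # prints: MyJob.o465118
```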
Use -j oe to merge stdout and stderr into a single file. Add -o to give it a clean, readable name.
#!/bin/bash
#PBS -N MyJob
#PBS -q default
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -j oe # join stdout + stderr
#PBS -o MyJob.log # name the output file
cd $PBS_O_WORKDIR
echo "everything goes into MyJob.log"
Leave out -j oe and specify both -o and -e separately. Useful if you want to check errors without digging through all the output.
#!/bin/bash
#PBS -N MyJob
#PBS -q default
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -o MyJob_out.log # stdout only
#PBS -e MyJob_err.log # stderr only
cd $PBS_O_WORKDIR
echo "this goes to MyJob_out.log"
echo "this is an error" >&2 # goes to MyJob_err.log
| What you write | Files created |
|---|---|
| Nothing | MyJob.o465118 — auto-named, merged |
| -j oe -o job.log | job.log, one clean file with a name you chose |
| -o out.log -e err.log | Two files, stdout and stderr separated |
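You can see the difference between merged and separate output on the login node, without submitting anything, by using ordinary shell redirection. This is only an analogy for what `-j oe` and `-o`/`-e` do; `fake_job` and the file names are placeholders invented for the demonstration.

```shell
# A stand-in for a job that writes to both streams.
fake_job() {
    echo "result: 42"                 # stdout
    echo "warning: low memory" >&2    # stderr
}

workdir=$(mktemp -d)

# Comparable to "-j oe -o job.log": both streams land in one file.
fake_job > "$workdir/job.log" 2>&1

# Comparable to "-o out.log -e err.log": one file per stream.
fake_job > "$workdir/out.log" 2> "$workdir/err.log"

echo "Files written to $workdir"
```

Opening the three files shows that job.log holds both lines, while out.log and err.log hold one line each.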
While a job is running you can follow its output live:
tail -f MyJob.log
Press Ctrl + C to stop following.
Requesting CPUs tells the scheduler how many cores to give your job. There are two valid syntaxes on IISERM HPC — both work, they mean the same thing.
# Old style — Torque syntax
#PBS -l nodes=1:ppn=4
# New style — PBS Pro syntax
#PBS -l select=1:ncpus=4
| | Old style ppn | New style ncpus |
|---|---|---|
| Full form | Processors Per Node | Number of CPUs |
| Used with | nodes=1:ppn=4 | select=1:ncpus=4 |
| Works on IISERM? | ✅ Yes | ✅ Yes |
| Meaning | Exactly the same — CPUs per node | |
# Single core — default, simplest
#PBS -l nodes=1:ppn=1
# 4 cores on one node
#PBS -l nodes=1:ppn=4
# 16 cores on one node
#PBS -l nodes=1:ppn=16
# Same using new syntax
#PBS -l select=1:ncpus=16
Inside your running job you can check:
nproc # cores visible to this job (can be the whole node if the scheduler does not restrict CPUs)
cat $PBS_NODEFILE # lists the node(s) assigned
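The node file is plain text with one host name per line, so a short pipeline can summarise it. This is a sketch with a hypothetical helper name, `cores_per_node`; how many lines each host gets depends on how the scheduler is configured, so treat the counts as a hint rather than a guarantee.

```shell
# cores_per_node FILE
# Prints each host name in FILE followed by how many times it appears.
cores_per_node() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}
```

Inside a running job you would call it as `cores_per_node "$PBS_NODEFILE"`.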
If you request ppn=16 but your code only uses 1 core, the other 15 cores sit idle, wasting shared resources and making your job wait longer in the queue. Start with ppn=1. Only increase it if your code is written to run in parallel, for example with mpi4py, multiprocessing, OpenMP, or numba's parallel mode.
If you are not sure, keep ppn=1. You can always request more later once you know your code benefits from it.
Software on the cluster is not loaded by default. You use the module system to load what you need inside your job script.
On the login node, type module load then press Tab Tab. When prompted, press y:
module load # press Tab Tab, then y
Display all 131 possibilities? (y or n) y
anaconda3 codes/gromacs/2023 tools/root/6.28
codes/anaconda3/23.3.1 codes/orca/5.0.4 codes/python/3.11.4
codes/cp2k/8.1.0 codes/R/4.3.1 compilers/gnu/8.3.0
codes/gaussian16/G16 tools/deeptools/3.5.1 compilers/openmpi/4.1.1
codes/geant4/11.1 tools/espresso/7.0 libs/fftw/3.3.10
... and more
Then load whatever you need by typing its exact name:
module load anaconda3
module load codes/python/3.11.4
module load codes/R/4.3.1
module load codes/gromacs/2023
module load tools/root/6.28
module list # what is currently loaded
module show anaconda3 # preview what a module does
module unload anaconda3 # unload one module
module purge # unload everything
#!/bin/bash
#PBS -N PythonJob
#PBS -q default
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -j oe
#PBS -o PythonJob.log
cd $PBS_O_WORKDIR
module load anaconda3
echo "Start: $(date)"
python -u script.py
echo "End : $(date)"
Place module load after cd $PBS_O_WORKDIR and before your actual commands.
If your packages are inside a conda environment, activate it like this:
#!/bin/bash
#PBS -N CondaJob
#PBS -q default
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -j oe
#PBS -o CondaJob.log
cd $PBS_O_WORKDIR
source ~/.bashrc
conda activate myenv
echo "Start: $(date)"
python -u script.py
echo "End : $(date)"
source ~/.bashrc is required first so the conda command is available inside the PBS job environment.
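A job that starts without the right module or environment usually fails with a bare "command not found" somewhere in the log. One defensive pattern is to check for the command up front and stop early with a clear message. The sketch below uses only the standard `command -v` builtin; `require_cmd` is a hypothetical helper name.

```shell
# require_cmd NAME
# Returns 0 if NAME can be run from this shell; otherwise prints a
# hint to stderr and returns 1.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "error: '$1' not found; load its module or activate its environment first" >&2
        return 1
    fi
}
```

In a job script it would go right after the module load or conda activate line, for example `require_cmd python || exit 1`.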
Copy any of these, change the job name and your command, and submit.
#!/bin/bash
#PBS -N MyJob
#PBS -q default
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -j oe
#PBS -o MyJob.log
cd $PBS_O_WORKDIR
echo "Start: $(date)"
# your command here
echo "End: $(date)"
#!/bin/bash
#PBS -N PythonJob
#PBS -q default
#PBS -l nodes=1:ppn=1
#PBS -l walltime=02:00:00
#PBS -j oe
#PBS -o PythonJob.log
cd $PBS_O_WORKDIR
module load anaconda3
echo "Start: $(date)"
python -u script.py
echo "End : $(date)"
#!/bin/bash
#PBS -N InlineJob
#PBS -q default
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -j oe
#PBS -o InlineJob.log
cd $PBS_O_WORKDIR
module load anaconda3
echo "Start: $(date)"
python -u << 'PYEOF'
for i in range(1, 6):
    print(f"Count: {i}")
print("Done!")
PYEOF
echo "End: $(date)"
The -u flag makes Python output appear immediately in the log without buffering.
#!/bin/bash
#PBS -N ParallelJob
#PBS -q short
#PBS -l nodes=1:ppn=8
#PBS -l walltime=12:00:00
#PBS -j oe
#PBS -o ParallelJob.log
cd $PBS_O_WORKDIR
module load anaconda3
echo "Cores: $(nproc)"
echo "Start: $(date)"
python -u parallel.py --cores 8
echo "End : $(date)"
#!/bin/bash
#PBS -N MyJob
#PBS -q default
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -o MyJob_out.log
#PBS -e MyJob_err.log
cd $PBS_O_WORKDIR
module load anaconda3
python -u script.py
How to submit your script, check its status, and cancel it if needed.
qsub script.sh
# → 465118.iisermhpc1 (this is your job ID)
qstat # all jobs in the queue
qstat -u ms21080 # only your jobs
qstat -f 465118.iisermhpc1 # full details of one job
Example output from qstat -u ms21080:
Job ID Name User Time S Queue
--------------------- ---------- --------- ----- - -------
465118.iisermhpc1 count_job ms21080 00:05 R default
| Code | Meaning | What to do |
|---|---|---|
| Q | Queued | Waiting for a free node, just wait |
| R | Running | Job is running, check the log with tail -f |
| H | Held | Paused, use qrls to release |
| E | Exiting | Finishing up, output being written |
| C | Completed | Done, check your log file |
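For use in your own scripts, the state letter can be pulled out of `qstat -f` and translated with the table above. This is a sketch under two assumptions: that the full listing contains a line of the form `job_state = R`, and that the helper names `job_state` and `describe_state` are free to use (both are invented here).

```shell
# job_state
# Reads `qstat -f` output on stdin and prints the one-letter state.
job_state() {
    awk -F' = ' '/^[ \t]*job_state = / { print $2 }'
}

# describe_state CODE
# Translates a state letter using the status table in this guide.
describe_state() {
    case $1 in
        Q) echo "Queued" ;;
        R) echo "Running" ;;
        H) echo "Held" ;;
        E) echo "Exiting" ;;
        C) echo "Completed" ;;
        *) echo "Unknown state: $1" ;;
    esac
}
```

On the cluster the two would be combined as `describe_state "$(qstat -f 465118.iisermhpc1 | job_state)"`.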
qdel 465118.iisermhpc1 # cancel a specific job
Use the full job ID, including .iisermhpc1. The number alone may not work.
qhold 465118.iisermhpc1 # pause a queued job
qrls 465118.iisermhpc1 # release a held job
qstat -q # see all queues and their limits
pbsnodes -a # see all compute nodes and status
tail -f MyJob.log # watch log output live
watch -n 5 qstat -u ms21080 # refresh job list every 5 seconds
| Variable | What it contains |
|---|---|
| $PBS_JOBID | Your job's full ID, e.g. 465118.iisermhpc1 |
| $PBS_JOBNAME | Job name you set with -N |
| $PBS_O_WORKDIR | Directory where you ran qsub |
| $PBS_NODEFILE | Lists nodes assigned to your job |
| $PBS_QUEUE | Queue the job is running in |
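Printing these variables at the top of the log makes it easier to match a log file to a job later. The sketch below uses the `${VAR:-fallback}` form so that the same script also runs outside PBS, for example while testing on the login node; `job_summary` is a hypothetical helper name.

```shell
# job_summary
# Prints the main PBS variables, or "not set" when run outside a job.
job_summary() {
    echo "Job ID   : ${PBS_JOBID:-not set}"
    echo "Job name : ${PBS_JOBNAME:-not set}"
    echo "Queue    : ${PBS_QUEUE:-not set}"
    echo "Work dir : ${PBS_O_WORKDIR:-not set}"
}

job_summary
```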
On our cluster, some nodes may appear "up" but are actually down or overloaded. To save time and avoid jobs stuck in queue, you can add a helper function to your ~/.bashrc that shows free cores per node in any queue.
Add the checkfree() function to ~/.bashrc
Open your bash configuration file:
nano ~/.bashrc
Scroll to the bottom and paste this function:
Save with Ctrl+O → Enter, then exit with Ctrl+X.
Apply the changes by sourcing the file:
source ~/.bashrc
Now the checkfree command is available in your terminal.
Run checkfree followed by any queue name:
checkfree default
checkfree short
checkfree long
checkfree infinity
checkfree gpushort
checkfree gpulong
Example output:
If you see a node with many free cores (e.g., gpc11 has 52 free), you can target it directly in your PBS script:
#PBS -l nodes=gpc11:ppn=4 # request 4 cores on node gpc11 only
This can help your job start faster if that node is lightly loaded.
1. checkfree default (or your target queue)
2. #PBS -l nodes=gpc11:ppn=4
3. qsub script.sh
4. qstat -u $USER → look for status R
5. tail -f MyJob.log

This simple habit saves hours of waiting for jobs stuck on "dead" nodes.