(Edit: caveats are listed in a more recent post.)
Computer cluster job scheduling software is fantastic at managing resources
and permitting many jobs to run efficiently and simultaneously.
However, schedulers aren’t always great at giving end-users feedback
when things go wrong.
For example, on our ShARC cluster, which runs the (Son of) Grid Engine (SGE) scheduler,
if you request a longer run-time than is permitted by any of the cluster’s job queue configurations
then your job will sit there queueing indefinitely until you or someone else deletes it.
For example, let’s use qsub
to submit a job where we ask for 1000 hours of run time and 4 GiB of RAM:
[will@mysofa ~]$ ssh sharc
...
[te1st@sharc-login1 ~]$ qsub -l h_rt=1000:00:00 -l rmem=4G -m bea -M w.furnass@sheffield.ac.uk -N longtask myjobscript.sge
Your job 217834 ("longtask") has been submitted
[te1st@sharc-login1 ~]$ qstat -u $USER
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
217834 0.00000 longtask te1st qw 03/20/2017 10:48:39 1
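(For reference, the contents of myjobscript.sge aren’t shown above; a minimal sketch of such a batch script might look like the following, where the workload command is purely a placeholder and the resource requests are supplied on the qsub command line rather than in the script.)
#!/bin/bash
# myjobscript.sge -- hypothetical contents; the command below is a placeholder
# for the real workload. The h_rt and rmem requests are given on the qsub
# command line above, though they could equally be embedded here as '#$ -l ...' directives.
./my_long_running_analysis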
Job 217834 will now sit queueing forever.
Not only will you not be told why,
you won’t even be notified that the job will never run.
In situations like this it can be useful to ask the scheduler to validate a job.
One way of doing this is to run ‘qalter -w v <myjobid>’ after job submission
if, say, you suspect that a job has been queueing for longer
than previously-submitted jobs of a similar nature:
[te1st@sharc-login1 ~]$ qalter -w v 217834
Job 217834 (-l h_rt=3600000) cannot run in queue "flybrain.q" because of cluster queue
Job 217834 (-l h_rt=3600000) cannot run in queue "gpu.q" because of cluster queue
Job 217834 (-l h_rt=3600000) cannot run in queue "gen2reg.q" because of cluster queue
Job 217834 (-l h_rt=3600000) cannot run in queue "rse.q" because of cluster queue
Job 217834 (-l h_rt=3600000) cannot run in queue "gpu-vis.q" because of cluster queue
Job 217834 (-l h_rt=3600000) cannot run in queue "insigneo-polaris.q" because of cluster queue
Job 217834 (-l h_rt=3600000) cannot run in queue "interactive.q" because of cluster queue
Job 217834 (-l h_rt=3600000) cannot run in queue "shortint.q" because of cluster queue
Job 217834 (-l h_rt=3600000) cannot run in queue "all.q" because of cluster queue
Job 217834 (-l h_rt=3600000) cannot run in queue "evolgen.q" because of cluster queue
Job 217834 (-l h_rt=3600000) cannot run in queue "rse-training.q" because of cluster queue
Job 217834 (-l h_rt=3600000) cannot run in queue "cstest.q" because of cluster queue
verification: no suitable queues
What this ‘qalter -w v <myjobid>’ command does is check whether the job could run
in any of the job queues on the cluster
if the cluster were free of other jobs.
The last line of output is key:
our job will never be run given the current cluster configuration.
Looking above that, we can see that it cannot run in any of the general-purpose job queues
(such as all.q) and
there is specific mention of our 1000-hour (3600000 second) run-time resource request.
We can therefore deduce that our run-time resource request wasn’t satisfiable.
Once we know that our job can’t run, we could then delete our job…
[te1st@sharc-login1 ~]$ qdel 217834
te1st has deleted job 217834
…then consult the cluster’s documentation to discover the maximum possible run-time and resubmit using more sensible resource requests.
Alternatively we can use qalter to modify the resource requests associated with a queueing job:
qalter -l h_rt=96:00:00 -l rmem=4G 217834
Important: using qalter in this fashion will replace all resource requests for the job, so here we need to re-specify the rmem request as well.
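As an aside, you can query a queue’s configuration directly to find its run-time limit rather than searching the documentation; for example, for all.q (one of the queues listed in the validation output above):
qconf -sq all.q | grep h_rt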
You can also perform the same type of job validation at job submission time using -w v, e.g.
qsub -w v -l h_rt=1000:00:00 -l rmem=4G myjobscript.sge
This won’t actually submit your job; it just performs validation.
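If you want to catch unsatisfiable requests routinely, a small wrapper along the following lines could run the verification pass before submitting for real. This is only a sketch: validate_and_submit.sh is a hypothetical name, and it relies on grepping the human-readable ‘no suitable queues’ message, which may vary between Grid Engine versions.
#!/bin/bash
# validate_and_submit.sh -- hypothetical helper: verify a job, then submit it.
# Usage: ./validate_and_submit.sh [qsub options] myjobscript.sge
report=$(qsub -w v "$@" 2>&1)
if echo "$report" | grep -q "no suitable queues"; then
    echo "Job would never run under the current cluster configuration:" >&2
    echo "$report" >&2
    exit 1
fi
# Verification didn't flag the job as unrunnable, so submit it for real.
qsub "$@"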
You may ask why such validation is not enabled by default for all jobs; one reason is that it is believed it would place an undue burden on the scheduler.
Another is that sometimes a validation attempt results in a false negative that can be difficult to automatically identify (edit: see this more recent post for details).
If you repeat the experiment outlined above but, instead of requesting 1000 hours of run-time, ask for 100 GPUs, 9999 GB of RAM or 10000 cores, you’ll observe the same behaviour: jobs that make requests unsatisfiable under the current cluster configuration can be submitted but will never run.
Again, job validation can help here but depending on the type of resource the validation error messages can be more or less cryptic.
For example, if you try to validate a 100000-‘slot’ (core) MPI job using -w v, you get the following:
qsub -pe mpi 100000 -w v somejob.sge
...
Job 311838 cannot run in PE "mpi" because it only offers 0 slots
This is rather misleading, but the mention of ‘slots’ should prompt you to check that the number of cores you’ve requested is sensible.
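For comparison, validating a more modestly-sized request (16 slots here is just an illustrative number) should pass verification rather than produce the misleading ‘0 slots’ message:
qsub -pe mpi 16 -w v somejob.sge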
Another type of validation is ‘poke’ validation (-w p),
which checks whether a job could be run under the current cluster load,
i.e. with many of the cluster’s resources already in use.
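For example, to ask whether a job with the corrected resource requests from above could start given what is currently running on the cluster:
qsub -w p -l h_rt=96:00:00 -l rmem=4G myjobscript.sge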
See man qsub and search for -w for more information on the different types of validation.