Partitions (queues) and services
SLURM differs slightly from the previous Torque system with respect to
definitions of various parameters, and what was known as queues in Torque may
be covered by both
We have the following partitions:
- For testing and debugging. Up to 1 hour of walltime.
- The default partition. Up to 48 hours of walltime.
- If you ask for fewer resources than are available on a single node, your job will be placed in this partition. We may remove the single-user policy on this partition in the future. This partition is also for single-node jobs that run for longer than 48 hours.
- Request this partition if you need more resources than a single node provides and a walltime longer than 48 hours.
To figure out the walltime limits for the various partitions, type:
$ sinfo --format="%P %l"
As a service to users who need to submit short jobs for testing and debugging, we offer a service called devel. These jobs have higher priority, a maximum walltime of 4 hours, and no option for extending the runtime.
Jobs using the devel service get higher priority than any other jobs in the system and therefore have a shorter queue delay than regular jobs. To prevent misuse, the devel service has the following limitations:
- Only one running job per user.
- Maximum 4 hours walltime.
- Only one job queued at any time. Note that this limit applies to the whole queue.
You submit to the devel service by typing:
in your job script.
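As a hedged sketch: assuming devel is configured as a SLURM quality of service (QOS), which this page does not confirm, a job script would select it with the `--qos` option of `sbatch`. The walltime and task count below are illustrative values chosen to stay within the 4 hour limit, not site recommendations.

```shell
#!/bin/bash
# Assumption: the devel service is exposed as a SLURM QOS named "devel".
#SBATCH --qos=devel
# Illustrative walltime, well under the 4 hour devel maximum.
#SBATCH --time=00:30:00
# Illustrative size: a single task is enough for a quick test.
#SBATCH --ntasks=1

# Placeholder workload; replace with the program you want to test.
echo "devel test job running on $(hostname)"
```

Because only one devel job may run per user, it makes sense to wait for one test to finish before submitting the next.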
General job limitations
The following limits are the default per user in the batch system. Users can ask for increased limits by sending a request to email@example.com.
|Max number of running jobs|1024|
|Maximum cpus per job|2048|
|Maximum walltime|28 days|
|Maximum memory per job|No limit [1]|
[1] There is a practical limit of 128 GB per compute node used.
Remark: Even though we impose a 28 day run time limit on Stallo, we only give one week's warning before system maintenance. Jobs with more than 7 days of walltime will be terminated and restarted if possible.
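To make the 7 day threshold concrete, here is a minimal sketch of a shell helper that checks whether a requested walltime is longer than one week. The function names `walltime_seconds` and `exceeds_maintenance_window` are hypothetical and not part of SLURM, and the parser only handles the `D-HH:MM:SS` and `HH:MM:SS` forms, which is a subset of what `--time` accepts.

```shell
#!/bin/bash
# Hypothetical helper: convert D-HH:MM:SS or HH:MM:SS to whole seconds.
walltime_seconds() {
    local spec="$1" days=0 h m s
    # Split off the day count if the string has a "D-" prefix.
    if [[ "$spec" == *-* ]]; then
        days="${spec%%-*}"
        spec="${spec#*-}"
    fi
    IFS=: read -r h m s <<< "$spec"
    # The 10# prefix forces base 10 so that "08" is not read as octal.
    echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

WEEK=$(( 7 * 86400 ))

# Succeeds (exit status 0) when the walltime is strictly more than 7 days.
exceeds_maintenance_window() {
    [ "$(walltime_seconds "$1")" -gt "$WEEK" ]
}

for t in 48:00:00 7-00:00:00 10-12:00:00; do
    if exceeds_maintenance_window "$t"; then
        echo "$t: more than 7 days, may be terminated and restarted at maintenance"
    else
        echo "$t: within 7 days"
    fi
done
```

Of the three sample values, only `10-12:00:00` is reported as exceeding the window, since exactly 7 days does not count as "more than 7 days".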
See the About Stallo chapter of the documentation if you need more information on the system architecture.