This is the snap for the Slurm Workload Manager: "The Slurm Workload Manager (formerly known as Simple Linux Utility for Resource Management or SLURM), or Slurm, is a free and open-source job scheduler for Linux and Unix-like kernels, used by many of the world's supercomputers and computer clusters."
Slurm is available to download from the Snap Store. All Snaps installed from the Snap Store receive automatic updates via Snapd and are automatically aliased.
snap install slurm --classic
The Snap Store has multiple channels for different release stages (edge, beta, stable, etc.).
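As a rough sketch of how a channel might be selected, the helper below prints an install command that uses the standard `--channel` option of `snap install`. The helper name `channel_install_cmd` is hypothetical, and which channels actually carry a Slurm build is an assumption, so check `snap info slurm` first.

```shell
# Hypothetical helper: print (do not run) the install command for a channel.
# edge, beta and stable come from the list above; candidate is the fourth
# standard snap risk level. Whether each one is published is an assumption.
channel_install_cmd() {
    case "$1" in
        edge|beta|candidate|stable)
            echo "snap install slurm --classic --channel=$1" ;;
        *)
            echo "unknown channel: $1" >&2
            return 1 ;;
    esac
}

# Usage:
#   snap info slurm                              # lists published channels
#   channel_install_cmd edge                     # prints the install command
#   sudo snap refresh slurm --channel=stable     # switch an existing install
```

Printing the command instead of running it keeps the helper safe to source and lets you review the command before executing it with sudo.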
The Slurm Snap is also released nightly to GitHub Releases.
Keep in mind that if you install the Slurm Snap from a GitHub Release, you will not receive automatic updates or automatic Snap aliasing.
This snap supports running different components of Slurm, depending on which snap.mode has been configured.
The following snap.mode values are supported:
none
all
login
munged
slurmdbd
slurmdbd+mysql
slurmd
slurmrestd
To configure this snap to run a different set of daemons, set the snap.mode:
snap set slurm snap.mode=all
The above command sets snap.mode to all. This mode runs all of the Slurm daemons, including MySQL and Munged, in an all-in-one local development setup.
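Because an unrecognized mode is easy to mistype, a small guard around `snap set` may help. The sketch below assumes the mode list given earlier is complete; `is_supported_mode` and `set_slurm_mode` are hypothetical helper names, not part of the snap.

```shell
# Modes copied from the list of supported snap.mode values.
SUPPORTED_MODES="none all login munged slurmdbd slurmdbd+mysql slurmd slurmrestd"

# Succeeds only if $1 is one of the supported snap.mode values.
is_supported_mode() {
    for mode in $SUPPORTED_MODES; do
        [ "$mode" = "$1" ] && return 0
    done
    return 1
}

# Hypothetical wrapper: refuse unknown modes, otherwise call `snap set`.
set_slurm_mode() {
    if ! is_supported_mode "$1"; then
        echo "unsupported snap.mode: $1" >&2
        return 1
    fi
    snap set slurm snap.mode="$1"
}

# Usage (as root):
#   set_slurm_mode slurmd
```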
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug* up infinite 1 idle slurm-dev
$ scontrol ping
Slurmctld(primary) at slurm-dev is UP
$ srun -pdebug -n1 -l hostname
0: slurm-dev
The following example will only work with the classic Snap:
$ srun --uid 1000 -N1 -l uname -r
0: 5.4.0-31-generic
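The manual checks above could be scripted as a quick smoke test. This is a minimal sketch that assumes the output formats shown in the examples (a controller line ending in "is UP", and STATE as the fifth column of default `sinfo` output); both helper names are hypothetical.

```shell
# Succeeds if `scontrol ping` output (on stdin) reports a controller as UP.
controller_is_up() {
    grep -q ' is UP'
}

# Prints partitions that have a node group in the "idle" state,
# given default `sinfo` output on stdin (STATE is the fifth column).
idle_partitions() {
    awk 'NR > 1 && $5 == "idle" { print $1 }'
}

# Usage:
#   scontrol ping | controller_is_up && echo "controller OK"
#   sinfo | idle_partitions
```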
All services log their stdout to journald. The logs can be accessed with snap logs, for example:
$ snap logs slurm.slurmrestd
or by using the journal directly:
$ journalctl -eu snap.slurm.slurmrestd
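The two examples suggest that each service `slurm.<service>` maps to a systemd unit named `snap.slurm.<service>`. Assuming that pattern holds for the other services, a small helper (the name `journal_unit` is hypothetical) can build the unit name, and the usual `snap logs` and `journalctl` options can narrow the output.

```shell
# Map a service name (e.g. slurmrestd) to its systemd unit, following the
# snap.slurm.<service> pattern shown above. Assumed to hold for all services.
journal_unit() {
    echo "snap.slurm.$1"
}

# Usage:
#   snap logs -n=50 slurm.slurmctld       # last 50 lines
#   snap logs -f slurm.slurmd             # follow new output
#   journalctl -u "$(journal_unit slurmdbd)" --since "10 minutes ago"
```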
Certain services also write to log files, which are readable only by root for security purposes. The following services write to log files:
- nhc
- slurmd
- slurmdbd
- slurmctld
Log files are found under /var/snap/slurm/common/var/log/. For example, the log for slurmctld can be found at:
/var/snap/slurm/common/var/log/slurm/slurmctld.log
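Since the file is readable only by root, one way to skim it is to pipe it through a filter. The sketch below assumes that problem lines contain the word "error", which is a guess about the log format; `recent_errors` is a hypothetical helper name.

```shell
# Path of the slurmctld log named above.
SLURMCTLD_LOG="/var/snap/slurm/common/var/log/slurm/slurmctld.log"

# Print the last N (default 20) lines on stdin that mention "error",
# matched case-insensitively.
recent_errors() {
    grep -i 'error' | tail -n "${1:-20}"
}

# Usage:
#   sudo cat "$SLURMCTLD_LOG" | recent_errors 10
#   sudo tail -n 50 "$SLURMCTLD_LOG"      # or simply show the tail
```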
Configuration files can be found under /var/snap/slurm/common/etc.
For testing purposes, you can manually edit the .conf files located under /var/snap/slurm/common/etc/. However, any changes you make to slurm.conf or slurmdbd.conf will be overwritten when the snap.mode is changed.
Persistent changes to the Slurm configuration files are made using the .yaml files located under /var/snap/slurm/common/etc/slurm-configurator. For example, if you wanted to change the port slurmd runs on, you would edit the slurm.yaml file here:
/var/snap/slurm/common/etc/slurm-configurator/slurm.yaml
To apply configuration changes made to the above file, you need to restart the Slurm daemons that run inside the snap. Assuming snap.mode=all, run the following command:
snap set slurm snap.mode=all
This renders slurm.yaml into slurm.conf and restarts the appropriate daemons.
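The edit-and-reapply cycle might look like the sketch below. It assumes that `snap get slurm snap.mode` returns the currently configured mode, so the same mode can be set again to trigger the re-render; `backup_file` is a hypothetical helper that keeps a copy of the file before you edit it.

```shell
# Path of the persistent Slurm configuration named above.
SLURM_YAML="/var/snap/slurm/common/etc/slurm-configurator/slurm.yaml"

# Copy a file to <file>.bak (preserving mode and timestamps) and print
# the backup path.
backup_file() {
    cp -p "$1" "$1.bak" && echo "$1.bak"
}

# Usage (as root):
#   backup_file "$SLURM_YAML"
#   "${EDITOR:-vi}" "$SLURM_YAML"
#   snap set slurm snap.mode="$(snap get slurm snap.mode)"   # re-apply mode
```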
To modify the Node Healthcheck configuration, edit the file located here:
/var/snap/slurm/common/etc/nhc/nhc.conf
NHC is run automatically by slurmd, and changes to nhc.conf take effect immediately.
When configuring Slurm to run as part of a large-scale compute cluster, remember to adjust the system configuration files accordingly. More information about this can be found here.
You can interact with individual services using snap services. Example:
$ snap services slurm
Service Startup Current Notes
slurm.munged enabled active -
slurm.mysql enabled active -
slurm.slurmctld enabled active -
slurm.slurmd enabled active -
slurm.slurmdbd enabled active -
slurm.slurmrestd enabled active -
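To spot services that are not running, the table above can be filtered on its "Current" column. This is a small sketch that assumes the column layout shown in the example; `inactive_services` is a hypothetical helper name, while `snap restart`, `snap stop` and `snap start` are standard snap commands.

```shell
# Print services whose "Current" column (third field) is not "active",
# given `snap services slurm` output on stdin.
inactive_services() {
    awk 'NR > 1 && $3 != "active" { print $1 }'
}

# Usage:
#   snap services slurm | inactive_services
#   sudo snap restart slurm.slurmd        # restart one service
#   sudo snap stop slurm.slurmrestd       # stop one service
#   sudo snap start slurm.slurmrestd      # start it again
```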
The following commands are available from the snap:
munge
remunge
sacct
sacctmgr
salloc
sattach
sbatch
sbcast
scancel
scontrol
sdiag
sinfo
sprio
squeue
sreport
srun
sshare
sstat
strigger
version
If you are using the Slurm Snap installed from a GitHub Release, all commands must be namespaced with the slurm. prefix. Example:
$ slurm.srun -p debug -n 1 uname -a
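If the prefix becomes tedious, the standard `snap alias` command can create the short names manually. The sketch below only prints the alias commands for a handful of tools so they can be reviewed first; `alias_cmd` is a hypothetical helper, and the choice of tools is just an example.

```shell
# Print the `snap alias` command that maps slurm.<cmd> to a bare <cmd>.
alias_cmd() {
    echo "snap alias slurm.$1 $1"
}

# Print one command per tool; extend the list with other commands
# from the list of available commands as needed.
for cmd in sinfo squeue srun sbatch scancel scontrol; do
    alias_cmd "$cmd"
done

# Usage: run each printed command with sudo, for example:
#   sudo snap alias slurm.srun srun
```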
- OmniVector Solutions [email protected]