2. Slurm Deployment

2.1. Prerequisites

If AuthType=auth/munge is set, munge is needed for authentication. See “munge”.

Since version 23.11, Slurm has its own authentication plugin. The option is AuthType=auth/slurm.

Install packages on each node according to its role. See “Install Slurm Packages”.

2.2. Configure

2.2.1. Create slurm user & group

$ sudo useradd -Mlrc "SLURM workload manager" -d /nonexistent -s /usr/sbin/nologin slurm

Find the IDs of group slurm and user slurm:

$ getent group | grep slurm
slurm:x:998:
$ getent passwd | grep slurm
slurm:x:998:998:SLURM workload manager:/nonexistent:/usr/sbin/nologin

Add group & user slurm on the other hosts (the IDs must be the same on every host, whether auth/munge or auth/slurm is used):

$ sudo groupadd -g998 slurm
$ sudo useradd -Mlrc "SLURM workload manager" -g slurm -u998 slurm
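
Since the IDs must be identical on every host, it can help to compare them mechanically. The following is a minimal sketch, not part of the original procedure: passwd_ids is a hypothetical helper that extracts "uid:gid" from a passwd-format line, and the expected value 998:998 is taken from the example output above.

```shell
# Print "uid:gid" from a passwd-format line read on stdin.
# (passwd_ids is a hypothetical helper name used only in this sketch.)
passwd_ids() {
    awk -F: '{ print $3 ":" $4 }'
}

# Value seen on the first host in the example above.
expected="998:998"

# On a real host, replace the printf with: getent passwd slurm
actual=$(printf '%s\n' 'slurm:x:998:998:SLURM workload manager:/nonexistent:/usr/sbin/nologin' | passwd_ids)

if [ "$actual" = "$expected" ]; then
    echo "IDs match"
else
    echo "IDs differ: $actual"
fi
```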

2.2.2. Create configuration files

Create file /etc/slurm/slurm.conf:

ClusterName=las
SlurmctldHost=las0
SlurmctldParameters=enable_configless
MaxNodeCount=100
ProctrackType=proctrack/cgroup
ReturnToService=1
AuthType=auth/slurm
CredType=cred/slurm
AuthAltTypes=auth/jwt
AuthAltParameters=jwt_key=/etc/slurm/jwt_hs256.key
#
StateSaveLocation=/var/spool/slurm/ctld
SlurmdSpoolDir=/var/spool/slurm/d-%h
SlurmctldLogFile=/var/log/slurm/ctld.log
SlurmdLogFile=/var/log/slurm/d-%h.log
SlurmUser=slurm
TaskPlugin=task/affinity,task/cgroup
#
# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_tres
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=localhost
AccountingStorageTRES=gres/gpu
JobAcctGatherType=jobacct_gather/cgroup
#
# COMPUTE NODES
PartitionName=DEFAULT MaxTime=INFINITE State=UP
PartitionName=normal Nodes=ALL Default=YES PriorityTier=1
PartitionName=high Nodes=ALL Default=NO PriorityTier=2
#
# GRES
GresTypes=gpu
NodeName=DEFAULT Gres=gpu:_:1

Note

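  • MaxNodeCount must be set in this setup: no nodes are listed in slurm.conf, so it reserves room for the nodes that register dynamically (slurmd -Z) in configless mode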

  • SlurmdUser (not SlurmUser) should be left at its default, root; otherwise slurmd cannot run properly

  • If the log file paths are not set, Slurm writes its logs to syslog

  • %h in SlurmdPidFile, SlurmdSpoolDir and SlurmdLogFile is useful when these paths are located on storage shared by all compute nodes. %n is a worse choice because it stands for the node name, which is not known when slurmd starts

  • The GPU type must be a substring of the real GPU type, and an empty string is not treated as a substring of any other string (possibly a bug). A single _ therefore matches most GPU types

Create file /etc/slurm/cgroup.conf:

CgroupPlugin=autodetect

If GresTypes is configured (e.g. for GPUs), you can create the file /etc/slurm/gres.conf:

AutoDetect=nvml

To check whether Gres can be detected, run this on the compute node:

$ slurmd -C
NodeName=las3 CPUs=8 Boards=1 SocketsPerBoard=8 CoresPerSocket=1 ThreadsPerCore=1 RealMemory=7935 Gres=gpu:nvidia_l40s:1
Found gpu:nvidia_l40s:1 with Autodetect=nvml (Substring of gpu name may be used instead)
UpTime=1-02:27:56

If your slurmd doesn’t support nvml, Autodetect=nvidia will be used.
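
To use the detection result in a script, the Gres field can be pulled out of the slurmd -C output. This is a minimal sketch under the assumption that the output has the shape shown above; gres_from_slurmd_c is a hypothetical helper name.

```shell
# Print the value of the Gres= field from `slurmd -C` output read on stdin.
# Prints nothing if no Gres was detected.
gres_from_slurmd_c() {
    tr ' ' '\n' | sed -n 's/^Gres=//p'
}

# On a compute node (not run here):
#   slurmd -C | gres_from_slurmd_c
# Demonstration with the first line of the sample output above:
printf '%s\n' 'NodeName=las3 CPUs=8 Boards=1 SocketsPerBoard=8 CoresPerSocket=1 ThreadsPerCore=1 RealMemory=7935 Gres=gpu:nvidia_l40s:1' \
    | gres_from_slurmd_c
```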

Do not forget to change the owners of the configuration files to slurm:

$ sudo chown slurm:slurm /etc/slurm/*.conf

2.2.3. Create key for authentication

$ sudo dd if=/dev/random of=/etc/slurm/slurm.key bs=1024 count=1
1+0 records in
1+0 records out
1024 bytes (1.0 kB, 1.0 KiB) copied, 8.8143e-05 s, 11.6 MB/s
$ sudo chown slurm:slurm /etc/slurm/slurm.key
$ sudo chmod 600 /etc/slurm/slurm.key

The key must be distributed to every node in the cluster.
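
After copying the key, its presence and mode can be checked on each node. The sketch below is illustrative only: check_key is a hypothetical helper, and stat -c assumes GNU coreutils. To confirm that all nodes hold the same key, comparing the output of sha256sum on each node is one option.

```shell
# Verify that a key file exists, is not empty and has mode 600.
# (check_key is a hypothetical helper; `stat -c` requires GNU coreutils.)
check_key() {
    key=$1
    [ -f "$key" ] || { echo "missing: $key"; return 1; }
    mode=$(stat -c '%a' "$key")   # permission bits in octal, e.g. 600
    size=$(stat -c '%s' "$key")   # size in bytes
    [ "$mode" = "600" ] || { echo "bad mode $mode: $key"; return 1; }
    [ "$size" -gt 0 ] || { echo "empty: $key"; return 1; }
    echo "ok: $key ($size bytes)"
}

# As root on each node (not run here):
#   check_key /etc/slurm/slurm.key
#   sha256sum /etc/slurm/slurm.key   # compare the digest across nodes
```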

2.2.4. Create directories

$ sudo mkdir -p /var/spool/slurm && sudo chown slurm:slurm /var/spool/slurm
$ sudo mkdir -p /var/log/slurm && sudo chown slurm:slurm /var/log/slurm

2.2.5. Configure slurmd

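Because configless mode is enabled, slurm.conf is not needed on a compute node. Instead, slurmd must be started with --conf-server, so that it fetches the configuration from the controller, and with -Z, so that it registers as a dynamic node (no nodes are listed in slurm.conf). You can do this by editing the unit: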

$ sudo systemctl edit --full slurmd
 EnvironmentFile=-/etc/default/slurmd
 RuntimeDirectory=slurm
 RuntimeDirectoryMode=0755
-ExecStart=/usr/sbin/slurmd --systemd $SLURMD_OPTIONS
+ExecStart=/usr/sbin/slurmd --systemd $SLURMD_OPTIONS -Z --conf-server=las0:6817
 ExecReload=/bin/kill -HUP $MAINPID
 KillMode=process
 LimitNOFILE=131072
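
Since the unit above already reads /etc/default/slurmd and expands $SLURMD_OPTIONS, the same options could probably be set there instead of editing the unit. This is a sketch of that alternative, not the method used in this guide; write_slurmd_options is a hypothetical helper, and it overwrites the target file.

```shell
# Write the slurmd options for a dynamic, configless node to an environment file.
# $1: environment file named by EnvironmentFile= in the unit, e.g. /etc/default/slurmd
# $2: controller address passed to --conf-server, e.g. las0:6817
# Note: the file is overwritten, so merge by hand if it already has content.
write_slurmd_options() {
    printf 'SLURMD_OPTIONS="-Z --conf-server=%s"\n' "$2" > "$1"
}

# As root on a compute node (not run here):
#   write_slurmd_options /etc/default/slurmd las0:6817
#   systemctl restart slurmd
```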

2.2.6. Configure slurmdbd

slurmdbd can store its data in MySQL or MariaDB. If MariaDB is used, create the file /etc/mysql/conf.d/slurmdb.cnf to set the recommended parameters:

[mariadb]
  innodb_buffer_pool_size=4096M
  innodb_log_file_size=64M
  innodb_lock_wait_timeout=900
  max_allowed_packet=16M

Create the user and grant privileges for slurmdbd in the MySQL/MariaDB database:

CREATE USER slurmdbd IDENTIFIED BY 'slurmdbd-password';
GRANT ALL on `slurm_acct_db`.* TO `slurmdbd`@`%`;

Create slurmdbd configuration file /etc/slurm/slurmdbd.conf:

DbdHost=las0
AuthType=auth/slurm
SlurmUser=slurm
PidFile=/var/run/slurmdbd/dbd.pid
LogFile=/var/log/slurm/dbd.log
StorageType=accounting_storage/mysql
StorageHost=las0
StorageUser=slurmdbd
StoragePass=slurmdbd-password

Note

If SlurmUser is not set, slurmdbd will try to act as root.

Set the owner and mode of the configuration file:

$ sudo chown slurm:slurm /etc/slurm/slurmdbd.conf
$ sudo chmod 0600 /etc/slurm/slurmdbd.conf

Important

Because the database password is stored in this file, nobody except the owner should be able to read it.

2.3. JWT and REST API

Create JWT key:

$ sudo dd if=/dev/random of=/etc/slurm/jwt_hs256.key bs=32 count=1
1+0 records in
1+0 records out
32 bytes copied, 7.6925e-05 s, 416 kB/s
$ sudo chown slurm:slurm /etc/slurm/jwt_hs256.key
$ sudo chmod 0600 /etc/slurm/jwt_hs256.key

Create user/group for slurmrestd:

$ sudo useradd -Mlrc "SLURM REST API Server" -d /nonexistent -s /usr/sbin/nologin slurmrestd
$ getent group | grep slurmrestd
slurmrestd:x:997:
$ getent passwd | grep slurmrestd
slurmrestd:x:997:997:SLURM REST API Server:/nonexistent:/usr/sbin/nologin

Install slurmrestd first, then configure the service:

$ sudo systemctl edit --full slurmrestd
 # Please either use the -u and -g options in /etc/sysconfig/slurmrestd or
 # /etc/default/slurmrestd, or explicitly set the User and Group in this file
 # an unpriviledged user to run as.
-# User=
-# Group=
+User=slurmrestd
+Group=slurmrestd
 ExecStart=/usr/sbin/slurmrestd $SLURMRESTD_OPTIONS
 # Enable auth/jwt be default, comment out the line to disable it for slurmrestd
 Environment=SLURM_JWT=daemon
 # Listen on TCP socket by default.
-Environment=SLURMRESTD_LISTEN=:6820
+Environment=SLURMRESTD_LISTEN=0.0.0.0:6820
 ExecReload=/bin/kill -HUP $MAINPID
 
 [Install]

Note

The slurmrestd service cannot be run as root or as SlurmUser.
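
Once the services are running, the REST API can be exercised with a token from scontrol token, which prints a line of the form SLURM_JWT=<token>. The sketch below only shows the parsing step as runnable code; jwt_from_scontrol is a hypothetical helper, and the API version in the URL (v0.0.41) is an assumption that must be adjusted to a version your slurmrestd actually serves.

```shell
# Extract the bare token from the "SLURM_JWT=<token>" line printed by `scontrol token`.
# (jwt_from_scontrol is a hypothetical helper name used only in this sketch.)
jwt_from_scontrol() {
    sed -n 's/^SLURM_JWT=//p'
}

# On a live cluster (not run here; the API version in the path is an assumption):
#   token=$(scontrol token lifespan=600 | jwt_from_scontrol)
#   curl -s -H "X-SLURM-USER-NAME: $USER" -H "X-SLURM-USER-TOKEN: $token" \
#        http://las0:6820/slurm/v0.0.41/ping

# Demonstration with a placeholder value instead of a real token:
printf 'SLURM_JWT=abc.def.ghi\n' | jwt_from_scontrol
```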

2.4. Run

Start services in the following order:

  • slurmdbd if it is configured

  • slurmctld on the controller node

  • slurmd on the worker nodes

  • slurmrestd if it is needed

Start them in the foreground when running in containers, for example:

$ sudo slurmctld -D

Check the version:

$ sinfo -V
slurm 24.11.5

Check the cluster status:

$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
normal*      up   infinite      4   idle las[0-3]
high         up   infinite      4   idle las[0-3]
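
For an automated check, the sinfo output can be reduced to three fields and filtered for problem states. This is a minimal sketch, assuming that states beginning with down, drain or fail are the ones of interest; unhealthy_partitions is a hypothetical helper name.

```shell
# Read "<partition> <state> <node count>" lines on stdin and print those whose
# state starts with down, drain or fail. Exit status is 1 if any were found.
unhealthy_partitions() {
    awk '$2 ~ /^(down|drain|fail)/ { print; bad = 1 } END { exit bad + 0 }'
}

# On a live cluster (not run here):
#   sinfo -h -o '%P %T %D' | unhealthy_partitions && echo "all nodes healthy"
```

Because every node here belongs to both partitions, a problem node is reported once per partition.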