snodes is a useful command for determining what resources are installed in the clusters and what is currently available to run jobs on. For every node, snodes shows its state (idle, allocated, drained/offline), how many CPUs it has and how many of those are allocated, the current load on the machine, how much RAM it has, GPU information (where applicable), which partition it is in, and the Slurm features associated with it.
[ccruser@vortex1:~]$ snodes --help
==============================================================
Display information about one or more nodes, possibly filtered
by partition and/or state.
If no node arg or 'all' is provided, all nodes will be
summarized. Similar behavior exists for the partition and
state(s) args
Usage: snodes [node1,node2,etc.] [cluster/partition] [state(s)]
==============================================================
See everything in the default cluster:
[ccruser@vortex1:~]$ snodes
HOSTNAMES STATE CPUS S:C:T CPUS(A/I/O/T) CPU_LOAD MEMORY GRES PARTITION AVAIL_FEATURES
cpn-d07-04-01 idle 8 2:4:1 0/8/0/8 0.01 23000 (null) general-compute* UBHPC,CPU-L5630,INTEL
cpn-d07-04-02 idle 8 2:4:1 0/8/0/8 0.05 23000 (null) general-compute* UBHPC,CPU-L5630,INTEL
cpn-d07-05-01 idle 8 2:4:1 0/8/0/8 0.03 23000 (null) general-compute* UBHPC,CPU-L5630,INTEL
cpn-d07-05-02 idle 8 2:4:1 0/8/0/8 0.01 23000 (null) general-compute* UBHPC,CPU-L5630,INTEL
cpn-d07-06-01 drain 8 2:4:1 0/0/8/8 0.01 23000 (null) general-compute* UBHPC,CPU-L5630,INTEL
cpn-d07-06-02 idle 8 2:4:1 0/8/0/8 0.01 23000 (null) general-compute* UBHPC,CPU-L5630,INTEL
...
See everything in a specific partition:
[ccruser@vortex1:~]$ snodes all ub-hpc/debug
HOSTNAMES STATE CPUS S:C:T CPUS(A/I/O/T) CPU_LOAD MEMORY GRES PARTITION AVAIL_FEATURES
cpn-k05-22 idle 16 2:8:1 0/16/0/16 0.02 128000 (null) debug CPU-E5-2660,INTEL
cpn-k05-26 idle 16 2:8:1 0/16/0/16 0.01 128000 (null) debug UBHPC,CPU-E5-2660,INTEL
cpn-k08-34-01 idle 12 2:6:1 0/12/0/12 0.15 48000 (null) debug UBHPC,IB,CPU-E5645,INTEL
cpn-k08-34-02 idle 12 2:6:1 0/12/0/12 0.02 48000 (null) debug UBHPC,IB,CPU-E5645,INTEL
cpn-k08-40-01 idle 12 2:6:1 0/12/0/12 0.08 48000 (null) debug UBHPC,IB,CPU-E5645,INTEL
cpn-k08-41-01 idle 12 2:6:1 0/12/0/12 0.04 48000 (null) debug UBHPC,IB,CPU-E5645,INTEL
cpn-k08-41-02 idle 12 2:6:1 0/12/0/12 0.02 48000 (null) debug UBHPC,IB,CPU-E5645,INTEL
cpn-u28-38 mix 32 2:16:1 28/4/0/32 14.07 187000 gpu:tesla_v1 debug UBHPC,CPU-Gold-6130,INTE
Search for a specific Slurm feature (in this example, Infiniband (IB) in the academic cluster):
[ccruser@vortex1:~]$ snodes all ub-hpc/general-compute | grep IB
cpn-f16-03 alloc 16 2:8:1 16/0/0/16 0.02 128000 (null) general-compute* UBHPC,IB,CPU-E5-2660,INTEL
cpn-f16-04 alloc 16 2:8:1 16/0/0/16 1.15 128000 (null) general-compute* UBHPC,IB,CPU-E5-2660,INTEL
cpn-f16-05 alloc 16 2:8:1 16/0/0/16 0.37 128000 (null) general-compute* UBHPC,IB,CPU-E5-2660,INTEL
cpn-f16-06 alloc 16 2:8:1 16/0/0/16 0.97 128000 (null) general-compute* UBHPC,IB,CPU-E5-2660,INTEL
cpn-f16-07 alloc 16 2:8:1 16/0/0/16 1.09 128000 (null) general-compute* UBHPC,IB,CPU-E5-2660,INTEL
cpn-f16-08 alloc 16 2:8:1 16/0/0/16 15.98 128000 (null) general-compute* UBHPC,IB,CPU-E5-2660,INTEL
cpn-f16-09 alloc 16 2:8:1 16/0/0/16 1.11 128000 (null) general-compute* UBHPC,IB,CPU-E5-2660,INTEL
...
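Because the fifth column is always CPUS(A/I/O/T), a small awk filter can narrow output like the above to nodes that still have free CPUs. This is a sketch over sample lines copied from the tables in this page; in practice you would pipe the snodes output straight into awk:

```shell
# Print nodes that still have idle CPUs. Field 5 is CPUS(A/I/O/T);
# after splitting on "/", c[2] is the Idle count.
sample='cpn-f16-03 alloc 16 2:8:1 16/0/0/16 0.02 128000 (null) general-compute* UBHPC,IB,CPU-E5-2660,INTEL
cpn-h22-10 mix 56 2:28:1 24/32/0/56 2.01 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB'
printf '%s\n' "$sample" \
  | awk '{ split($5, c, "/"); if (c[2] > 0) print $1, c[2] " idle CPUs" }'
# → cpn-h22-10 32 idle CPUs
```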
See everything in a partition of a cluster other than the default:
[ccruser@vortex1:~]$ snodes all faculty/scavenger
HOSTNAMES STATE CPUS S:C:T CPUS(A/I/O/T) CPU_LOAD MEMORY GRES PARTITION AVAIL_FEATURES
cpn-d11-01 alloc 8 2:4:1 8/0/0/8 0.39 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d11-02 alloc 8 2:4:1 8/0/0/8 0.41 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d11-03 alloc 8 2:4:1 8/0/0/8 0.01 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d11-04 alloc 8 2:4:1 8/0/0/8 0.46 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d11-05 alloc 8 2:4:1 8/0/0/8 0.14 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d11-07 alloc 8 2:4:1 8/0/0/8 0.97 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d11-08 alloc 8 2:4:1 8/0/0/8 0.69 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d11-09 alloc 8 2:4:1 8/0/0/8 0.04 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d11-10 mix 8 2:4:1 1/7/0/8 1.19 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d11-11 mix 8 2:4:1 1/7/0/8 1.24 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d11-12 alloc 8 2:4:1 8/0/0/8 0.04 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d11-13 alloc 8 2:4:1 8/0/0/8 3.99 48000 (null) scavenger FACULTY,CPU-E5520,INTEL
cpn-d11-18 alloc 8 2:4:1 8/0/0/8 8.01 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d12-02 mix 8 2:4:1 1/7/0/8 1.03 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d12-03 alloc 8 2:4:1 8/0/0/8 0.27 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
...
Show all idle nodes in a specified cluster and partition (filtered here with grep; per the usage summary above, a state can also be passed as the third argument):
[ccruser@vortex1:~]$ snodes all faculty/scavenger | grep idle
cpn-d12-09 idle 8 2:4:1 0/8/0/8 0.19 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d12-11 idle 8 2:4:1 0/8/0/8 0.02 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-d12-12 idle 8 2:4:1 0/8/0/8 0.04 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-f11-05 idle 24 2:12:1 0/24/0/24 0.04 256000 (null) scavenger FACULTY,CPU-E5-2650v4,INTEL
cpn-f11-06 idle 24 2:12:1 0/24/0/24 0.03 256000 (null) scavenger FACULTY,CPU-E5-2650v4,INTEL
cpn-f11-07 idle 24 2:12:1 0/24/0/24 0.03 256000 (null) scavenger FACULTY,CPU-E5-2650v4,INTEL
cpn-f11-08 idle 24 2:12:1 0/24/0/24 0.01 256000 (null) scavenger FACULTY,CPU-E5-2650v4,INTEL
cpn-f11-09 idle 24 2:12:1 0/24/0/24 0.01 256000 (null) scavenger FACULTY,CPU-E5-2650v4,INTEL
...
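The same CPUS(A/I/O/T) column makes quick capacity sums easy, for example totaling how many idle CPUs a partition has left. A sketch, again using sample lines that mimic the table above (in practice, pipe the snodes output into awk directly):

```shell
# Sum the Idle field (c[2]) of CPUS(A/I/O/T) across all listed nodes.
snodes_output='cpn-d12-09 idle 8 2:4:1 0/8/0/8 0.19 23000 (null) scavenger FACULTY,CPU-L5520,INTEL
cpn-f11-05 idle 24 2:12:1 0/24/0/24 0.04 256000 (null) scavenger FACULTY,CPU-E5-2650v4,INTEL'
printf '%s\n' "$snodes_output" \
  | awk '{ split($5, c, "/"); total += c[2] } END { print total " idle CPUs" }'
# → 32 idle CPUs
```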
Show all nodes in the industry partition:
[ccruser@vortex1:~]$ snodes all ub-hpc/industry
HOSTNAMES STATE CPUS S:C:T CPUS(A/I/O/T) CPU_LOAD MEMORY GRES PARTITION AVAIL_FEATURES
cpn-h22-04 alloc 56 2:28:1 56/0/0/56 56.17 1000000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-05 mix 56 2:28:1 48/8/0/56 4.01 1000000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-06 alloc 56 2:28:1 56/0/0/56 56.03 1000000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-07 alloc 56 2:28:1 56/0/0/56 56.03 1000000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-09 mix 56 2:28:1 48/8/0/56 4.01 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-10 mix 56 2:28:1 24/32/0/56 2.01 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-11 alloc 56 2:28:1 56/0/0/56 56.02 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-12 mix 56 2:28:1 49/7/0/56 49.01 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-13 mix 56 2:28:1 49/7/0/56 49.08 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-14 alloc 56 2:28:1 56/0/0/56 56.32 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-15 alloc 56 2:28:1 56/0/0/56 56.55 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-16 alloc 56 2:28:1 56/0/0/56 56.28 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-17 alloc 56 2:28:1 56/0/0/56 56.29 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-18 plnd 56 2:28:1 0/56/0/56 0.01 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-19 mix 56 2:28:1 12/44/0/56 1.01 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-20 alloc 56 2:28:1 56/0/0/56 56.23 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-24 mix 56 2:28:1 49/7/0/56 49.01 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-25 alloc 56 2:28:1 56/0/0/56 56.01 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-26 plnd 56 2:28:1 0/56/0/56 1.25 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-27 mix 56 2:28:1 48/8/0/56 4.01 512000 (null) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB
cpn-h22-29 mix 56 2:28:1 48/8/0/56 4.01 512000 gpu:a100-pcie-40gb:2(S:0-1) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB,A100
cpn-h22-31 alloc 56 2:28:1 56/0/0/56 56.01 512000 gpu:a100-pcie-40gb:2(S:0-1) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB,A100
cpn-h22-33 mix 56 2:28:1 48/8/0/56 4.01 512000 gpu:a100-pcie-40gb:2(S:0-1) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB,A100
cpn-h22-35 alloc 56 2:28:1 56/0/0/56 56.12 512000 gpu:a100-pcie-40gb:2(S:0-1) industry* INDUSTRY,CPU-Gold-6330,INTEL,h22,IB,A100
The format of the Slurm features shown in snodes output is:
CLUSTER, CPU_MODEL, CPU_MANUFACTURER, RACK, [FUNDING_SOURCE, INTERCONNECT, GPU_MODEL]
Anything in [ ] is optional and depends on what hardware is in the node.
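These features are what jobs request with Slurm's standard --constraint option. A minimal sketch of a batch script asking for an Infiniband-connected node (the partition and feature names are taken from the examples above; adjust them for your own allocation):

```shell
#!/bin/bash
#SBATCH --partition=general-compute
#SBATCH --nodes=1
# --constraint matches entries in the AVAIL_FEATURES column; features
# can also be combined, e.g. --constraint="IB&CPU-E5-2660".
#SBATCH --constraint=IB

srun hostname
```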
Here are more specifics about the different snodes columns:
HOSTNAMES STATE CPUS S:C:T CPUS(A/I/O/T) CPU_LOAD MEMORY GRES PARTITION AVAIL_FEATURES
HOSTNAMES - the node's hostname
STATE - the node's Slurm state (e.g. idle, alloc, mix, drain, plnd)
CPUS - total CPUs on the node
S:C:T - sockets : cores per socket : threads per core
CPUS(A/I/O/T) - counts of Allocated / Idle / Other (e.g. offline) / Total CPUs
CPU_LOAD - the node's current load average
MEMORY - RAM available to jobs, in MB
GRES - generic resources such as GPUs ((null) means none)
PARTITION - the partition the node is in (* marks the default partition)
AVAIL_FEATURES - the Slurm features set on the node, in the format described above