SLURM tutorial: Basic commands

Main website for learning SLURM

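Submit a job with a job name and an output file name (options given on the command line override the #SBATCH directives in the script header). Here myjob.sh is a placeholder for your job script.

sbatch   -J   job1  -o   job1.out  --partition=batch  myjob.sh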


Basic shell script for a job

#!/bin/bash
#SBATCH --job-name=testJob
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --partition=dragon-default
# Display all variables set by slurm
env | grep "^SLURM" | sort

cd /projects/dragon/FANTOM5/processed_data_feature

# All commands for the job go here

mkdir t1

How to submit a batch job
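
A batch job is submitted by passing the job script to sbatch. The sketch below is only an illustration: the file name submit_demo.sh and the tiny script body are placeholders, not part of the original tutorial. It writes a small script, checks it for shell syntax errors, and submits it only when sbatch is actually available on the machine.

```shell
#!/bin/bash
# Placeholder file name for the demo job script
script=submit_demo.sh

# Write a minimal batch script (the directives mirror the basic script above)
cat > "$script" <<'EOF'
#!/bin/bash
#SBATCH --job-name=testJob
#SBATCH --time=00:05:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
hostname
EOF

# Catch shell syntax errors before the job ever reaches the queue
bash -n "$script" && echo "syntax OK: $script"

if command -v sbatch >/dev/null 2>&1; then
    # --parsable prints just the job ID (followed by ";cluster" on multi-cluster setups)
    jobid=$(sbatch --parsable "$script")
    echo "submitted job ${jobid%%;*}"
else
    echo "sbatch not found; run this on a cluster login node"
fi
```

The printed job ID is the value to pass to squeue, scontrol, and scancel in the sections that follow.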


How to check the list of jobs of a user

squeue -u user1
squeue -u user1 -l  # -l (--long) shows more detail for each job
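
When a user has many jobs, a per-state count is often easier to read than the full list. The helper below is a small sketch (count_states is a hypothetical name, not a SLURM command): it reads one state name per line, which is what squeue prints with -h (no header) and -o "%T" (state only).

```shell
#!/bin/bash
# Count jobs per state from a list of state names read on stdin (one per line).
count_states() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# On a cluster (user1 is the example user from above):
#   squeue -h -u user1 -o "%T" | count_states

# Local demonstration with made-up states:
printf 'RUNNING\nPENDING\nPENDING\n' | count_states
# prints:
#   PENDING 2
#   RUNNING 1
```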

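How to check the full details and status of a job

scontrol show job=JOBID

scontrol only reports pending, running, and recently finished jobs; for older jobs use sacct (if accounting is enabled on the cluster).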
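
The output of scontrol show job is a block of Key=Value pairs, so a single field such as JobState can be pulled out with standard text tools. The function below is a sketch under that assumption; job_field is a hypothetical helper name, and the sample text is made up to resemble the real output. Values that contain spaces would be cut at the first space.

```shell
#!/bin/bash
# Extract one Key=Value field from `scontrol show job` output read on stdin.
job_field() {
    # Put every whitespace-separated Key=Value pair on its own line,
    # then print the value of the first pair whose key matches.
    tr -s ' \t' '\n\n' |
        awk -F= -v key="$1" '$1 == key { print substr($0, length(key) + 2); exit }'
}

# On a cluster:
#   scontrol show job=JOBID | job_field JobState

# Local demonstration with made-up output:
sample='JobId=1234 JobName=testJob
   UserId=user1(1001) GroupId=users(100)
   JobState=RUNNING Reason=None Dependency=(null)'
printf '%s\n' "$sample" | job_field JobState   # prints RUNNING
```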


How to use a node in interactive mode. This is useful when all your batch jobs are pending and you need to run something right away.

srun --pty --time=5:00:00 bash

How to kill a job

  • Cancel job 1234 along with all of its steps:
        scancel 1234
  • Send SIGKILL to all steps of job 1235, but do not cancel the job itself:
        scancel --signal=KILL 1235
  • Send SIGUSR1 to the batch shell processes of job 1236:
        scancel --signal=USR1 --batch 1236
  • Cancel all pending jobs belonging to user "bob" in partition "debug":
        scancel --state=PENDING --user=bob --partition=debug
  • Cancel only array ID 4 of job array 1237:
        scancel 1237_4

How to start a node in interactive mode

srun --pty --nodes=1 --exclusive --partition=interactive bash -l

How to start a GUI on the cluster

You need an X server running on your local machine to display the GUI.

1. Log in to the cluster with X11 forwarding enabled

 ssh -Y

2. Open an interactive node

Next, you need to start an interactive job, for example to get a whole node in the interactive queue:

srun --pty --nodes=1 --exclusive --partition=interactive bash -l

3. Suppose you want to use the MATLAB GUI; then run the following two commands

module load matlab
matlab &
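
If the GUI does not appear, a common cause is that X11 forwarding did not reach the shell where the program was started. The check below is a minimal sketch (x11_ready is a hypothetical helper name): it only tests whether the DISPLAY variable is set, which is a necessary but not sufficient condition. On some clusters the interactive job itself must also forward X11, for example through srun's --x11 option where the site has enabled it.

```shell
#!/bin/bash
# Return success if an X display appears to be available in this shell.
x11_ready() {
    [ -n "${DISPLAY:-}" ]
}

if x11_ready; then
    echo "DISPLAY is set to $DISPLAY; GUI programs should be able to open windows"
else
    echo "DISPLAY is not set; reconnect with ssh -Y before starting a GUI" >&2
fi
```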

Full list of SLURM commands

Man pages exist for all SLURM daemons, commands, and API functions. The command option --help also provides a brief summary of options. Note that the command options are all case sensitive.
  • sacct is used to report job or job step accounting information about active or completed jobs.
  • salloc is used to allocate resources for a job in real time. Typically this is used to allocate resources and spawn a shell. The shell is then used to execute srun commands to launch parallel tasks.
  • sattach is used to attach standard input, output, and error plus signal capabilities to a currently running job or job step. One can attach to and detach from jobs multiple times.
  • sbatch is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.
  • sbcast is used to transfer a file from local disk to local disk on the nodes allocated to a job. This can be used to effectively use diskless compute nodes or provide improved performance relative to a shared file system.
  • scancel is used to cancel a pending or running job or job step. It can also be used to send an arbitrary signal to all processes associated with a running job or job step.
  • scontrol is the administrative tool used to view and/or modify SLURM state. Note that many scontrol commands can only be executed as user root.
  • sinfo reports the state of partitions and nodes managed by SLURM. It has a wide variety of filtering, sorting, and formatting options.
  • smap reports state information for jobs, partitions, and nodes managed by SLURM, but graphically displays the information to reflect network topology.
  • squeue reports the state of jobs or job steps. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order.
  • srun is used to submit a job for execution or initiate job steps in real time. srun has a wide variety of options to specify resource requirements, including: minimum and maximum node count, processor count, specific nodes to use or not use, and specific node characteristics (so much memory, disk space, certain required features, etc.). A job can contain multiple job steps executing sequentially or in parallel on independent or shared nodes within the job's node allocation.
  • strigger is used to set, get or view event triggers. Event triggers include things such as nodes going down or jobs approaching their time limit.
  • sview is a graphical user interface to get and update state information for jobs, partitions, and nodes managed by SLURM.

Handling multiple jobs with a job array
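
A job array submits many similar jobs from one script. The --array option sets the index range, and each task reads its own index from the SLURM_ARRAY_TASK_ID environment variable. The script below is a minimal sketch: the job name, the index range 1-4, and the input_N.txt file names are hypothetical, chosen only for illustration. In the output file name, %A expands to the array's job ID and %a to the task index.

```shell
#!/bin/bash
#SBATCH --job-name=arrayJob
#SBATCH --array=1-4
#SBATCH --output=arrayJob_%A_%a.out
#SBATCH --time=00:10:00
#SBATCH --ntasks=1

# SLURM sets SLURM_ARRAY_TASK_ID for each array task.
# Fall back to 1 so the script can also be tried outside SLURM.
task_id=${SLURM_ARRAY_TASK_ID:-1}

# Hypothetical naming scheme: one input file per array task
input="input_${task_id}.txt"

echo "task ${task_id} would process ${input}"
```

Submit the script with sbatch as usual. Each task then appears in squeue as JOBID_INDEX, which is the form used by scancel 1237_4 above.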

