
SLURM tutorial: Basic commands


Main website for learning SLURM


http://slurm.schedmd.com/tutorials.html

Submit a job with a job name and an output file name (options given on the command line override the #SBATCH parameters in the script header)

sbatch -J job1 -o job1.out --partition=batch myscript.sh
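
When a submission like the one above is scripted, it helps to capture the job ID so it can be passed to squeue or scancel later. The sketch below assumes the `--parsable` option of sbatch, which prints the job ID, optionally followed by `;` and the cluster name. The helper name `job_id_from_parsable` is made up for this example, and the submission only runs when sbatch is installed and myscript.sh exists.

```shell
#!/bin/sh
# Extract the job ID from `sbatch --parsable` output.
# The output is either "JOBID" or "JOBID;CLUSTER"; keep the part before ';'.
job_id_from_parsable() {
    printf '%s\n' "${1%%;*}"
}

# Submit only on a machine where sbatch is available and the script exists.
if command -v sbatch >/dev/null 2>&1 && [ -f myscript.sh ]; then
    if out=$(sbatch --parsable -J job1 -o job1.out --partition=batch myscript.sh); then
        jobid=$(job_id_from_parsable "$out")
        echo "Submitted job $jobid"
    else
        echo "sbatch failed" >&2
    fi
fi
```

The captured `$jobid` can then be used with the squeue, scontrol, and scancel commands described in the rest of this post.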

 

Basic shell script for job

#!/bin/sh
#
#SBATCH --job-name=testJob
#SBATCH --time=01:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --partition=dragon-default
#
# Display all variables set by slurm
env | grep "^SLURM" | sort

#
cd /projects/dragon/FANTOM5/processed_data_feature

## All my commands for job will go here

# Print the start time of the job
date
mkdir t1

How to submit a batch job

sbatch myscript.sh

How to check the list of jobs of a user

squeue -u user1
squeue -u user1 -l  # long format, shows more detail per job
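
For a quick summary instead of a full listing, the squeue output can be reduced to a count of jobs per state. This is a minimal sketch: it assumes the `-h` (no header) and `-o "%T"` (job state in long form) options of squeue, and the helper name `count_states` is made up for this example. The squeue call only runs where squeue is installed.

```shell
#!/bin/sh
# Read one job state per line from stdin and print "COUNT STATE" per state.
count_states() {
    sort | uniq -c | awk '{print $1, $2}'
}

if command -v squeue >/dev/null 2>&1; then
    # -h: suppress the header; -o "%T": print only the job state
    squeue -u "$(id -un)" -h -o "%T" | count_states
fi
```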
 

How to check the detailed status of a job (scontrol only knows about pending, running, and recently finished jobs)

 scontrol show job=JOBID
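
For jobs that finished some time ago, scontrol usually no longer has a record, and the accounting command sacct is the place to look. The sketch below assumes that job accounting is enabled on the cluster; the wrapper name `job_history` is made up for this example, and the chosen columns are only one reasonable selection.

```shell
#!/bin/sh
# Print a short accounting summary for one job ID.
# Returns 2 with a usage message when no job ID is given.
job_history() {
    if [ -z "${1:-}" ]; then
        echo "usage: job_history JOBID" >&2
        return 2
    fi
    sacct -j "$1" --format=JobID,JobName,Partition,State,Elapsed,ExitCode
}
```

Usage on a login node would be `job_history 1234`, which prints one line for the job and one line per job step.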

 

How to get an interactive shell on a compute node. Useful when your batch jobs are pending and you need to run something right away



srun --pty --time=5:00:00 bash


How to kill job


  •  Cancel job 1234 along with all of its steps:
    •               scancel 1234
  •  Send SIGKILL to all steps of job 1235, but do not cancel the job itself:
    •               scancel --signal=KILL 1235
  •  Send SIGUSR1 to the batch shell processes of job 1236:
    •               scancel --signal=USR1 --batch 1236
  •  Cancel all pending jobs belonging to user "bob" in partition "debug":
    •               scancel --state=PENDING --user=bob --partition=debug
  •  Cancel only array ID 4 of job array 1237:
    •               scancel 1237_4

How to start a node in interactive mode


srun --pty --nodes=1 --exclusive --partition=interactive bash -l

How to start a GUI application on the cluster


You need an X server running on your local machine to display the GUI

1. Log in to the cluster with X11 forwarding enabled

 ssh -Y username@login.cbrc.kaust.edu.sa

2. Start an interactive job, for example to get a whole node in the interactive queue

srun --pty --nodes=1 --exclusive --partition=interactive bash -l

3. Suppose you want to use the MATLAB GUI; then run the following two commands

module load matlab
matlab &
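
Before launching a GUI program it can save time to check that X11 forwarding is actually active in the current shell. The sketch below only tests whether the DISPLAY variable is set, which is a necessary but not sufficient condition; the function name `x11_ready` is made up for this example.

```shell
#!/bin/sh
# Succeeds when DISPLAY is set and non-empty, i.e. X11 forwarding looks active.
x11_ready() {
    [ -n "${DISPLAY:-}" ]
}

if x11_ready; then
    echo "DISPLAY is $DISPLAY, GUI programs can be started"
else
    echo "DISPLAY is not set, reconnect with 'ssh -Y' before starting a GUI"
fi
```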

Full list of SLURM commands [ Source: http://www.tchpc.tcd.ie/node/129 ]

Man pages exist for all SLURM daemons, commands, and API functions. The command option --help also provides a brief summary of options. Note that the command options are all case sensitive.
  • sacct is used to report job or job step accounting information about active or completed jobs.
  • salloc is used to allocate resources for a job in real time. Typically this is used to allocate resources and spawn a shell. The shell is then used to execute srun commands to launch parallel tasks.
  • sattach is used to attach standard input, output, and error plus signal capabilities to a currently running job or job step. One can attach to and detach from jobs multiple times.
  • sbatch is used to submit a job script for later execution. The script will typically contain one or more srun commands to launch parallel tasks.
  • sbcast is used to transfer a file from local disk to local disk on the nodes allocated to a job. This can be used to effectively use diskless compute nodes or provide improved performance relative to a shared file system.
  • scancel is used to cancel a pending or running job or job step. It can also be used to send an arbitrary signal to all processes associated with a running job or job step.
  • scontrol is the administrative tool used to view and/or modify SLURM state. Note that many scontrol commands can only be executed as user root.
  • sinfo reports the state of partitions and nodes managed by SLURM. It has a wide variety of filtering, sorting, and formatting options.
  • smap reports state information for jobs, partitions, and nodes managed by SLURM, but graphically displays the information to reflect network topology.
  • squeue reports the state of jobs or job steps. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order.
  • srun is used to submit a job for execution or initiate job steps in real time. srun has a wide variety of options to specify resource requirements, including: minimum and maximum node count, processor count, specific nodes to use or not use, and specific node characteristics (so much memory, disk space, certain required features, etc.). A job can contain multiple job steps executing sequentially or in parallel on independent or shared nodes within the job's node allocation.
  • strigger is used to set, get or view event triggers. Event triggers include things such as nodes going down or jobs approaching their time limit.
  • sview is a graphical user interface to get and update state information for jobs, partitions, and nodes managed by SLURM.

 Handling multiple jobs with a job array:


 http://slurm.schedmd.com/job_array.html
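
A job array runs the same script many times, with each task receiving its own index in SLURM_ARRAY_TASK_ID. The script below is a minimal sketch of that pattern: the job name, the array range, the `sample_N.txt` file naming, and the helper `input_for_task` are all assumptions made for illustration, not part of any real project layout.

```shell
#!/bin/sh
#SBATCH --job-name=arrayJob
#SBATCH --array=1-10
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
# In the output file name, %A is the array job ID and %a is the task index.
#SBATCH --output=arrayJob_%A_%a.out

# Map a task index to an input file name (hypothetical naming scheme).
input_for_task() {
    printf 'sample_%s.txt\n' "$1"
}

# Slurm sets SLURM_ARRAY_TASK_ID for each array task;
# default to 1 so the script can also be tried by hand.
task_id=${SLURM_ARRAY_TASK_ID:-1}
echo "Task $task_id would process $(input_for_task "$task_id")"
```

Submitted with sbatch, this creates ten tasks that show up in squeue as JOBID_1 to JOBID_10, and a single task can be cancelled with the JOBID_INDEX form of scancel shown earlier.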



