1. Text Processing
cut: select only the listed columns (1-based index)
// count the unique values in the 7th column
cut -f 7 amel_OGSv3.2.gff3 | sort -u | wc -l
the default delimiter is a tab
cut -f 1,3 fName
if another character is the delimiter, pass it with -d
cut -f 1,3 -d ':' fName
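A small sketch of the cut commands above, run on made-up data rather than the GFF3 file. The sample rows and the variable names (tmpfile, cols, uniqueCount, colonCols) are assumptions for illustration only.

```shell
# Build a small tab-separated sample file (hypothetical data).
tmpfile=$(mktemp)
printf 'chr1\tgene\t100\nchr1\texon\t150\nchr2\tgene\t300\n' > "$tmpfile"

# Columns 1 and 3, using the default tab delimiter.
cols=$(cut -f 1,3 "$tmpfile")
echo "$cols"

# Number of distinct values in column 2 (tr strips padding some wc versions add).
uniqueCount=$(cut -f 2 "$tmpfile" | sort -u | wc -l | tr -d ' ')
echo "$uniqueCount"

# Same idea with ':' as the delimiter.
colonCols=$(printf 'a:b:c\n' | cut -d ':' -f 1,3)
echo "$colonCols"

rm -f "$tmpfile"
```

Note that cut joins the selected fields with the same delimiter it split on.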
GREP
// by default grep matches the pattern anywhere in a line, even inside a longer word
// to match only the exact word, use -w (-i ignores case)
grep -iw "gene" amel_OGSv3.2.gff3 > ./amel_OGSv3.2.gene.gff3
grep -w "[-]9" fname   # find the word -9 in the file
grep for any of several words in a file
grep 'good\|bad' test.txt
The following command prints from 1 line before each match through 1000 lines after it.
grep "^AC P0001" factor.table -B1 -A1000 > out.txt
"^AC P0001" is a regular expression. The caret (^) anchors the match to the start of a line, so the quoted text finds lines that start with the dealer name AC P0001. -B1 prints 1 line before the match, and -A1000 prints 1000 lines after it.
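The grep options above can be checked on a throwaway file. This is only a sketch: the sample lines are invented, and -E 'good|bad' is shown as an alternative spelling of 'good\|bad' that does not rely on the backslash form.

```shell
tmpfile=$(mktemp)
printf 'gene\ngenes\npseudogene\nGENE model\n' > "$tmpfile"

# Without -w, "gene" also matches inside longer words (-c counts matching lines).
substringHits=$(grep -c "gene" "$tmpfile")
# With -w only whole-word matches count; -i ignores case.
wordHits=$(grep -ciw "gene" "$tmpfile")
echo "$substringHits $wordHits"

# Context lines: 1 line before and 1 line after the match.
printf 'header\nAC P0001\nline1\nline2\n' > "$tmpfile"
context=$(grep -B1 -A1 "^AC P0001" "$tmpfile")
echo "$context"

# Lines containing either word, using extended regular expressions.
eitherHits=$(printf 'good day\nbad day\nplain day\n' | grep -cE 'good|bad')
echo "$eitherHits"

rm -f "$tmpfile"
```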
SORT , 1 BASED INDEX
sort on multiple columns (first by column 1 as text, then by column 2 as a number)
sort -k1,1 -k2,2n input.txt
to sort a key in descending order, add r to it (the default is ascending)
sort -k1,1r -k2,2nr input.txt
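A quick way to see the effect of the two key specifications is to run them on a few invented rows. The data and variable names here are assumptions, not part of the original notes.

```shell
tmpfile=$(mktemp)
printf 'b 10\na 2\na 10\nb 9\n' > "$tmpfile"

# Column 1 ascending as text, then column 2 ascending as a number.
ascending=$(sort -k1,1 -k2,2n "$tmpfile")
echo "$ascending"

# Both keys reversed.
descending=$(sort -k1,1r -k2,2nr "$tmpfile")
echo "$descending"

rm -f "$tmpfile"
```

Without the n flag, column 2 would be compared as text and 10 would sort before 2.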
SED
print (copy) a range of lines, here lines 10 to 20
sed -n '10,20p' input.txt
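To confirm which lines the range selects, the same sed command can be fed numbered input. A minimal sketch, assuming seq is available to generate the numbers 1 to 30:

```shell
# -n suppresses automatic printing; '10,20p' prints only lines 10 through 20.
printed=$(seq 1 30 | sed -n '10,20p')

firstLine=$(printf '%s\n' "$printed" | head -n 1)
lastLine=$(printf '%s\n' "$printed" | tail -n 1)
lineCount=$(printf '%s\n' "$printed" | wc -l | tr -d ' ')
echo "$firstLine $lastLine $lineCount"
```

Both ends of the range are inclusive, so 10,20 yields 11 lines.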
AWK , 1 based index (by default awk uses whitespace as the delimiter; to change it, set FS=YourDelimiter)
For CSV file,
( if second column is 420 and third column is 1 in a CSV file, then select those lines )
awk '$2==420 && $3==1' FS=, input.csv > output.csv
awk '$2 > 0' input.txt > output.txt
// if field 3 equals exon, then output field 20
awk '{ if($3=="exon" ){print $20 }}' input.txt | sort -u > output.txt
// if field 3 equals exon, then output all fields (so nothing needs to follow print)
awk '{ if($3=="exon" ){print }}' input.txt > output.txt
// passing a shell variable into awk [ close the single quotes and expand the variable inside double quotes ]
// if field 5 is greater than or equal to the variable, print the line
awk '{ if($5 >= '"$variable"' ){print }}' input.txt > output.txt
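An alternative to splicing the shell variable into the awk program is awk's -v option, which assigns it to an awk variable before the program runs. This is a sketch on invented rows; the name threshold and the sample data are assumptions.

```shell
variable=50
tmpfile=$(mktemp)
printf 'a b exon d 40\na b gene d 60\na b exon d 75\n' > "$tmpfile"

# -v copies the shell value into the awk variable "threshold".
selected=$(awk -v threshold="$variable" '$5 >= threshold' "$tmpfile")
echo "$selected"

# Conditions can be combined; print only fields 1 and 5 for exon rows.
exonOnly=$(awk -v threshold="$variable" '$3 == "exon" && $5 >= threshold { print $1, $5 }' "$tmpfile")
echo "$exonOnly"

rm -f "$tmpfile"
```

With -v the awk program stays inside one pair of single quotes, which avoids quoting mistakes when the value contains spaces.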
Loop in shell
# Loop in array using index
declare -a myarray=("GM12878" "K562" "H9ES")
for i in `seq 0 2`   # indices 0..2 for the 3-element array
do
echo $i
echo ${myarray[$i]}
done
# Loop over all files inside a directory
totCount=0
# let the shell expand the glob straight into an array instead of parsing ls output
arrInput=("$dirOntologizerInputTarget"/*.input)
for filename in ${arrInput[@]}
do
namePrefix=$(basename $filename ".input")
totCount=$(($totCount+1))
echo $namePrefix
done
echo $totCount
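The same file loop can also be written without an array by looping over the glob directly. A minimal sketch, assuming a temporary directory (dirInput, a hypothetical stand-in for $dirOntologizerInputTarget) filled with invented file names:

```shell
dirInput=$(mktemp -d)   # hypothetical stand-in for the real input directory
touch "$dirInput/E003.input" "$dirInput/E008.input" "$dirInput/notes.txt"

totCount=0
prefixes=""
for filename in "$dirInput"/*.input
do
    [ -e "$filename" ] || continue   # skip the literal pattern when nothing matches
    namePrefix=$(basename "$filename" .input)
    prefixes="$prefixes$namePrefix "
    totCount=$((totCount + 1))
done
echo "$prefixes"
echo "$totCount"

rm -rf "$dirInput"
```

Quoting "$filename" keeps the loop correct for names that contain spaces.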
# From a range of numbers
i=1;
echo $i;
max=999999
for i in `seq 2 $max`
do
echo "$i"
done
# From a list
declare -a myarr=("element1" "element2" "element3")
## now loop through the above array
for i in "${myarr[@]}"
do
echo $i
done
# From a list filled in by index
allStim[0]="mouse_macrophage_TB_infection_IFNg.counts.csv"
allStim[1]="mouse_macrophage_TB_infection_IL4.counts.csv"
allStim[2]="mouse_macrophage_TB_infection_IL13.counts.csv"
allStim[3]="mouse_macrophage_TB_infection_IL4-IL13.counts.csv"
# for test in $allStim
for i in `seq 0 3`
do
# echo $i;
echo ${allStim[$i]}
done
# From an array
declare -a arr=("GM12878" "K562" "H9ES")
for i in ${arr[@]}
do
# echo $i
cellLine=$i
done
Break
if [ "$totCount" -eq 3 ] ; then
# $arrInput=""
break;
fi
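break only has an effect inside a loop, so the fragment above belongs in a loop body. A small sketch with an invented list of names, stopping after the third item:

```shell
totCount=0
lastSeen=""
for name in E003 E008 E010 E012 E020
do
    totCount=$((totCount + 1))
    lastSeen=$name
    if [ "$totCount" -eq 3 ] ; then
        break   # leave the loop; the remaining names are never visited
    fi
done
echo "$totCount $lastSeen"
```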
If else in shell
checkOntotlogizer=false
if [ "$checkOntotlogizer" = true ] ; then
# do your work
:   # placeholder: a branch that holds only a comment is a syntax error
fi
Example 2
# Prompt for the user's age
echo "Please enter your age:"
read AGE
if [ "$AGE" -lt 20 ] || [ "$AGE" -ge 50 ]; then
    echo "Sorry, you are out of the age range."
elif [ "$AGE" -ge 20 ] && [ "$AGE" -lt 30 ]; then
    echo "You are in your 20s"
elif [ "$AGE" -ge 30 ] && [ "$AGE" -lt 40 ]; then
    echo "You are in your 30s"
elif [ "$AGE" -ge 40 ] && [ "$AGE" -lt 50 ]; then
    echo "You are in your 40s"
fi
Rename all files with one extension to another extension
dirCheck="/home/alamt/F5shortRNA/result_v3/resultOntologizer/functionBasedOnTarget/backgroundHuman/"
extTSV=".tsv"
newExt=".xls"
# let the shell expand the glob straight into an array instead of parsing ls output
arrInput=("$dirCheck"/*"$extTSV")
totCount=0
for filename in ${arrInput[@]}
do
namePrefix=$(basename $filename $extTSV)
totCount=$(($totCount+1))
mv $dirCheck/$namePrefix$extTSV $dirCheck/$namePrefix$newExt
#if [ "$totCount" -ge 2 ] ; then
# break;
#fi
done
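A shorter variant of the rename loop strips the old extension with the shell's ${name%suffix} expansion instead of basename. This sketch runs in a temporary directory with invented file names, so it can be tried without touching real results.

```shell
dirCheck=$(mktemp -d)   # temporary directory, not the real result folder
touch "$dirCheck/a.tsv" "$dirCheck/b.tsv" "$dirCheck/keep.txt"
extTSV=".tsv"
newExt=".xls"

for filename in "$dirCheck"/*"$extTSV"
do
    [ -e "$filename" ] || continue    # nothing matched the pattern
    base=${filename%"$extTSV"}        # path with the old extension removed
    mv "$filename" "$base$newExt"
done

renamed=$(ls "$dirCheck")
echo "$renamed"

rm -rf "$dirCheck"
```

Files with other extensions, such as keep.txt here, are left alone.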
2. Soft Link (use a soft link for a big file when you do not want to copy it to multiple places)
ln -s basic.file softlink.file
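To see what the link does, a sketch in a temporary directory: the link stores only the path of the target, and reading the link reads the target's content. The directory and the file content are made up; readlink is assumed to be available, as it is on typical Linux systems.

```shell
workDir=$(mktemp -d)
printf 'big data\n' > "$workDir/basic.file"

# Create the soft link; no data is copied.
ln -s "$workDir/basic.file" "$workDir/softlink.file"

linkTarget=$(readlink "$workDir/softlink.file")   # path the link points to
linkContent=$(cat "$workDir/softlink.file")       # content read through the link

# -L is true when the path is a symbolic link.
if [ -L "$workDir/softlink.file" ]; then isLink=yes; else isLink=no; fi
echo "$linkTarget $linkContent $isLink"

rm -rf "$workDir"
```

If the target is moved or deleted, the link stays behind but no longer resolves.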
3. vi editor
http://www.guckes.net/vi/substitute.html
a. Replace all occurrences of one word with another
: %s/old/new/g
Remove "> " (including the trailing space), e.g. "> ab" becomes "ab"
: %s/> //g
Always write grouping parentheses () as \( \) when enclosing a pattern
: %s/_\(.\)*/_HUMAN/g ==> e.g. abc_EXT --> abc_HUMAN
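These substitutions can be tried outside the editor with sed, which accepts the same patterns in this case. A sketch on invented strings; vi and sed regular expressions differ in other details, so this only covers the two patterns shown above.

```shell
# Replace everything from the underscore onward with _HUMAN.
humanNames=$(printf 'abc_EXT\nxyz_MOUSE\n' | sed 's/_\(.\)*/_HUMAN/g')
echo "$humanNames"

# Remove the "> " prefix.
stripped=$(printf '> ab\n' | sed 's/> //g')
echo "$stripped"
```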
FIND : search for a file by exact name or by pattern
locate : simplest, only reports the location; less powerful
http://www.codecoffee.com/tipsforlinux/articles/20.html
find : robust, many options; more powerful
http://www.codecoffee.com/tipsforlinux/articles/21.html
find /home/alamt/ -name file.txt
find /home/alamt/ -name 'file*.txt'
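A sketch of both searches in a temporary directory tree with invented file names. The pattern is quoted so that find receives it unchanged; unquoted, the shell may expand it against the current directory first.

```shell
searchDir=$(mktemp -d)
mkdir "$searchDir/sub"
touch "$searchDir/file.txt" "$searchDir/sub/file1.txt" "$searchDir/sub/other.txt"

# Exact name: only file.txt matches.
exactCount=$(find "$searchDir" -name 'file.txt' | wc -l | tr -d ' ')

# Pattern: file.txt and sub/file1.txt match, other.txt does not.
patternCount=$(find "$searchDir" -name 'file*.txt' | wc -l | tr -d ' ')
echo "$exactCount $patternCount"

rm -rf "$searchDir"
```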
Shell multi line comment using vi editor
Comment from line 50 to 100
:50,100s/^/#
Uncomment from line 50 to 100
:50,100s/^#/
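Outside vi, the same line-range substitutions can be tried with sed. A small sketch on a four-line sample, using lines 2 to 3 instead of 50 to 100:

```shell
# Comment out lines 2 to 3 by inserting # at the start of each.
commented=$(printf 'a\nb\nc\nd\n' | sed '2,3s/^/#/')
echo "$commented"

# Uncomment the same lines by removing the leading #.
uncommented=$(printf '%s\n' "$commented" | sed '2,3s/^#//')
echo "$uncommented"
```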
4. WGET to download files recursively
cd /projects/dragon/FANTOM5/
wget -r --user myid --password mypass --force-html -i https://fantom5-collaboration.gsc.riken.jp/webdav/lncRNAome_draft/Human/data/catalog/
5. Background and foreground process management
NOHUP: Linux process management in the foreground and background using nohup
http://stackoverflow.com/questions/9190151/how-to-run-a-shell-script-in-the-backgroung-and-get-no-output
http://linuxg.net/how-to-manage-background-and-foreground-processes/
http://unix.stackexchange.com/questions/45025/how-to-suspend-and-bring-a-background-process-to-foreground
How to run a job in background
======================
nohup /path/to/your/script.sh > /dev/null 2>&1 &
How to view background process
======================
jobs
How to move job from foreground to background and vice versa
=======================================
fg %jobid
bg %jobid
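How to kill background jobs
======================
kill %job_id      # sends SIGTERM; use kill -9 %job_id if the job ignores it
# kill -19 only suspends the job (signal 19 is SIGSTOP on most Linux systems); it does not terminate it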
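Inside a script, where the %job_id notation of the interactive shell is less convenient, a background job can be tracked by its process ID, which the shell stores in $! right after the job starts. A minimal sketch using sleep as a stand-in for a long-running script:

```shell
sleep 30 &          # stand-in for a long-running background job
bgPid=$!            # process ID of the most recent background job

# kill -0 sends no signal; it only tests whether the process exists.
if kill -0 "$bgPid" 2>/dev/null; then stateBefore=running; else stateBefore=gone; fi

kill "$bgPid"       # default signal is SIGTERM

# wait collects the job and reports how it ended (128 + signal number).
exitStatus=0
wait "$bgPid" 2>/dev/null || exitStatus=$?

if kill -0 "$bgPid" 2>/dev/null; then stateAfter=running; else stateAfter=gone; fi
echo "$stateBefore $stateAfter $exitStatus"
```

An exit status of 143 corresponds to 128 + 15, i.e. the job ended because of SIGTERM.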
RSYNC ( copy from local/remote to local/remote with synchronization)
rsync -avH --progress alamt@10.70.58.115:/home/alamt/CNC /destfolder/
Screen
http://news.softpedia.com/news/GNU-Screen-Tutorial-44274.shtml
- Keep in mind that screen's default command character is Ctrl+a (press the Ctrl key, hold it and press a, then release them both =====> Ctrl-a).
- Moreover, the command (letter) entered after Ctrl+a is case sensitive, so for example Ctrl+a n is a different command from Ctrl+a N.
1. open
===============
screen
OR screen -S screenname
2. list of screen
===============
screen -ls
3. Enter into a screen
================
screen -r screenID
4. Detach from a screen
==============
ctrl-a d
5. Kill permanently
==============
ctrl-a k
OR
screen -X -S screenname kill (tested killing)
6. Create new terminal/shell under a screen
============================
If a program is already running in one terminal, create additional terminals inside the same screen.
// to create new command terminal
ctrl-a c
// switch among terminals; 10 terminals can be created ( 0 - 9 )
ctrl-a SHIFT ( 0 ==> 1)
ctrl-a SHIFT ( 1 ==> 2)
ctrl-a SHIFT ( 2 ==> 3)
ctrl-a SHIFT ( 8 ==> 9)
ctrl-a SHIFT ( 9 ==> 0)
// Detach from a terminal and the screen
Go to a terminal where no software is running, i.e. a plain shell prompt.
Then press ctrl-a d to detach from the screen.
6. System related commands: Linux Processor and memory information
To get the number of processors
====================
grep -c ^processor /proc/cpuinfo
To get the details of each processor and the number of cores
=======================
# very detailed information
less /proc/cpuinfo
# summary: number of cores, processors, etc.
lscpu
To get the RAM size
=============
1. In human readable format
free -g
free -m
2. In more details
less /proc/meminfo
To get the disk size
=============
1. In human readable format
df -h
Total size of a folder
===============
Change into the folder, then run
du -ch | grep total
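du can also report a single summary line for a folder given by path, without changing into it. A sketch on a temporary folder; the reported size depends on the filesystem, so only the layout of the output (size, tab, path) is relied on here.

```shell
workDir=$(mktemp -d)
printf 'hello\n' > "$workDir/a.txt"

# -s prints one summary line for the folder, -h uses human-readable units.
summary=$(du -sh "$workDir")
echo "$summary"

# The line has two tab-separated fields: size and path.
sizeField=$(printf '%s\n' "$summary" | cut -f 1)
pathField=$(printf '%s\n' "$summary" | cut -f 2)

rm -rf "$workDir"
```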