0. BIOPYTHON mafIO
==================
def mafParsing(self,chrName ):
idx = MafIndex(chrName + ".mafindex", chrName + ".maf", "hg19." + chrName )
multiple_alignment = idx.get_spliced([60000] , [60020] , strand = "-")
AlignIO.write(multiple_alignment, "hg19_All.fa", "fasta")
total_bases_tupBel1 = 0;
for seqrec in multiple_alignment:
if seqrec.id.startswith("tupBel1"): # mm4 tupBel1 tanvir
# don't count gaps as bases
total_bases_tupBel1 += len( str(seqrec.seq).replace("-", "") )
print "total of %s base align" %total_bases_tupBel1 + " %s" %total_bases_tupBel1
1. mafsInRegion
=====================================
desc: Find multi alignment output in different species
usage: ./mafsInRegion -keepInitialGaps -outDir test.bed out_dir chr22.maf
precaution:
1. use -outDir if you want gene name specific output in seperate file (1 file per 1 sample).
2. liftOver
=====================================
desc: Lift co-ordinate of one assembly over another assembly. Give output according to similarity match.
usage: ./liftOver old.chain new.chain mapped.out unMapped.out
precaution:
for single species use similaritythreshold 0.95
for multiple species use similarity thr 0.10
It is not optimized for multiple species
3. match@ TRANSFAC professional
=====================================
desc: Find hit location of matrix over sequence
usage: ./match matrix.dat promoter.seq map.out matrix.profile
precaution:
if a matrix's profile is not available in profile file, this matrix will not be mapped over sequence.
4.OrthoMCL
==================================
desc: Find orthologous gene from >=2 species
installation:
a. download the unsupported version. If you download full version you have to install mysql and others.
http://orthomcl.org/common/downloads/software/unsupported/v1.4/
There are 2 zip files. One for OrthoMCL and second one for MCL.
b. compile MCL: (Read the install file for help)
unzip
./configure
make
make install
c. change the path name in OrthoMCL folder file orthomcl_module.pm
for variable $BLASTALL , $FORMATDB , $MCL, $$PATH_TO_ORTHOMCL, $ORTHOMCL_DATA_DIR
d. Keep all the fasta file in $ORTHOMCL_DATA_DIR folder.
usage: (See README under OrthoMCLv1.4)
perl orthomcl.pl --mode 1 --fa_files Ath.fa,Hsa.fab
precaution
5. blastp
==========================
blastp -evalue 1e-05 -query testBee.prot.fa -subject testBee.prot.fa -outfmt 6 -out myout.txt
6. apriori algorithm
===========================
// min support 90% , maximal subset , unlimited #item in a itemset.
apriori.exe -tm -s90 gene_tf.input gene_tf_90.out
Comments
Post a Comment