
Posts

JAVA set operation union intersection difference

// Create an insertion-ordered set
Set<String> mySet = new LinkedHashSet<>();
mySet.add(str);

// Union
Set<String> union = new HashSet<>(s1);
union.addAll(s2);

// Intersection
Set<String> intersection = new HashSet<>(s1);
intersection.retainAll(s2);

// Difference (elements of s1 that are not in s2)
Set<String> difference = new HashSet<>(s1);
difference.removeAll(s2);

// Iterate over the set elements through an array:
// comma after each element, newline after the last one
String[] arr = mySet.toArray(new String[mySet.size()]);
int setSize = arr.length;
for (int c = 0; c < setSize; c++) {
    if (c == setSize - 1)
        bufCell_Uniprot.append(arr[c] + "\n");
    else
        bufCell_Uniprot.append(arr[c] + ",");
}

// Iterate over the set elements with an Iterator
Iterator<String> itr = mySet.iterator();
while (itr.hasNext()) {
    bout.write(itr.next() + "\n");
}
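The snippets above depend on variables (s1, s2, str, bufCell_Uniprot, bout) defined elsewhere. As a self-contained sketch of the same three operations, the class below wraps them in helper methods. The class name, method names, and sample values are hypothetical, and LinkedHashSet is used instead of HashSet only so that the printed order is predictable.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical demo class; not part of the original post.
class SetOperationsDemo {

    // Elements of a, followed by the elements of b that a does not contain.
    static <T> Set<T> union(Set<T> a, Set<T> b) {
        Set<T> result = new LinkedHashSet<>(a);
        result.addAll(b);
        return result;
    }

    // Elements of a that are also in b.
    static <T> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> result = new LinkedHashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Elements of a that are not in b.
    static <T> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> result = new LinkedHashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample identifiers.
        Set<String> s1 = new LinkedHashSet<>(Arrays.asList("P1", "P2", "P3"));
        Set<String> s2 = new LinkedHashSet<>(Arrays.asList("P2", "P3", "P4"));

        System.out.println(String.join(",", union(s1, s2)));        // P1,P2,P3,P4
        System.out.println(String.join(",", intersection(s1, s2))); // P2,P3
        System.out.println(String.join(",", difference(s1, s2)));   // P1
    }
}
```

Each helper copies its first argument before modifying it, so the input sets stay unchanged. String.join also replaces the manual comma/newline loop when the elements are strings.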

LINUX shell script tutorial

1. Text Processing

Cut: select only certain columns (1-based index).

Count the unique values in the 7th column:
cut -f 7 amel_OGSv3.2.gff3 | sort -u | wc -l

The default delimiter is tab:
cut -f 1,3 fName

If another delimiter is used, pass it with -d:
cut -f 1,3 -d ':' fName

GREP: copy the lines containing "gene*". By default grep matches the pattern anywhere in a line, as a substring or as a word; to match only the exact word, use -w. Note that in a regular expression "gene*" means "gen" followed by zero or more "e" characters.
grep -iw "gene*" amel_OGSv3.2.gff3 > ./amel_OGSv3.2.gene.gff3

Find the word -9 in a file:
grep -w "[-]9" fname

Grep multiple words in a file:
grep 'good\|bad' test.txt

The following command line will grep from 1 line before the match through 1000 lines after the match:
grep "^AC P0001" factor.table -B1 -A1000 > out.txt

"^AC P0001" is a regular expression. The caret (^) means the start of a line. So, the quoted text means to find the line that starts with the dealer name AC P0001. The ...

Java string manipulation

StringTokenizer
==============
// Read one line and split it on spaces and tabs
strLine = brAllrna.readLine(); // A
StringTokenizer stringTokenizer = new StringTokenizer(strLine, " \t");
Vector<Double> pwmA = new Vector<>();
while (stringTokenizer.hasMoreTokens()) {
    // Parse each token as a double and collect it
    Double val = Double.parseDouble(stringTokenizer.nextToken());
    pwmA.add(val);
}
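The fragment above relies on a reader (brAllrna) opened elsewhere. A minimal self-contained sketch of the same tokenizing step follows, assuming the input is one line of numbers separated by spaces or tabs. The class name, method name, and sample row are hypothetical, and ArrayList stands in for Vector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class; not part of the original post.
class PwmRowParser {

    // Splits a line on spaces and tabs and parses every token as a double.
    static List<Double> parseRow(String line) {
        List<Double> values = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, " \t");
        while (tokenizer.hasMoreTokens()) {
            values.add(Double.parseDouble(tokenizer.nextToken()));
        }
        return values;
    }

    public static void main(String[] args) {
        // Made-up sample row mixing tabs and repeated spaces.
        System.out.println(parseRow("0.25\t0.10  0.40 0.25")); // [0.25, 0.1, 0.4, 0.25]
    }
}
```

StringTokenizer treats a run of consecutive delimiters as a single separator, so repeated spaces do not produce empty tokens. Double.parseDouble throws NumberFormatException if a token is not a valid number.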

matlab ROC curve

function makeROC()
    [X,Y,T,AUC] = perfcurve(trueTestLabel, predictedScore, positiveClassLabel);
    AUC
    plot(X,Y)
    xlabel('1-specificity'); ylabel('sensitivity');
    figure
    plot(T, [Y 1-X]);
    legend('sensitivity','specificity');
    xlabel('Threshold'); ylabel('change in sensitivity and specificity')
end

function perfBasedOnROCthreshold
    thr = .41
    rocInfo = load('roc20.info');
    perfOrigLabel = rocInfo(:, 1);
    perfPredLabel = rocInfo(:, 2);
    perfPredScore = rocInfo(:, 3);
    noTeseCase = size(perfPredScore, 1);
    newPredictedLabel = zeros(noTeseCase, 1);
    [X,Y,T,AUC] = perfcurve(perfOrigLabel, perfPredScore, 1);
    plot(X,Y)
    xlabel('1-specificity'); ylabel('sensitivity');
    figure
    plot(T,Y)
    xlabel('threshold'); ylabel('sensitivity');
    figure
    plot(T,1-X)
    xlabel('threshold'); ylabel('specificity');
    newPos = find(perfPredScore >= thr);
    newNeg = find(perfPredScor...

Feature Selection

1. Matlab using TreeBagger (it is actually like Random Forest)
==========================================
load ionosphere;
noBag = 5;
myBag = TreeBagger(noBag, X, Y, 'OOBPred','on', 'oobvarimp','on');
% Increase in prediction error if the values of that variable are permuted across OOB observations.
% The larger the increase in prediction error, the more important the variable is.
oobVarImp = myBag.OOBPermutedVarDeltaError
% re-substitution error
varImp = zeros(noBag, noFeature)
for i=1:noBag
    varimportance(myBag.Trees{i})
end

========== COMPLETE CODE==========
function fromRF
load ionosphere;
noBag = 5;
myBag = TreeBagger(noBag, X, Y, 'OOBPred','on');
varRanking = zeros(noBag, size(X,2));
for i=1:noBag
    [val, varRanking(i,:)] = sort(varimportance(myBag.Trees{i}), 'descend')
end
% suppose finally taking top rank...

JAVA CLASSPATH setting

Source: http://weka.wikispaces.com/CLASSPATH

Win32 (2k and XP)

We assume that the mysql-connector-java-3.1.8-bin.jar archive is located in the following directory:

C:\Program Files\Weka-3-4

In the Control Panel, click on System (or right-click on My Computer and select Properties) and then go to the Advanced tab. There you will find a button called Environment Variables; click it. Depending on whether you are the only person using this computer or it is a lab computer shared by many, you can create either a new system-wide environment variable (you are the only user) or a user-dependent one (recommended for multi-user machines). Enter the following name for the variable:

CLASSPATH

and add this value:

C:\Program Files\Weka-3-4\mysql-connector-java-3.1.8-bin.jar

If you want to add additional jars, you will have to separate them with the path separator, the semicolon ; (no spaces!).

Unix/Linux

I assume that the mysql jar is located in ...

java memory allocation memory heap size control -Xmx -Xms

Taken from: http://javahowto.blogspot.com/2006/06/6-common-errors-in-setting-java-heap.html

Two JVM options are often used to tune JVM heap size: -Xmx for maximum heap size, and -Xms for initial heap size. Here are some common mistakes I have seen when using them:

Missing m, M, g or G at the end (they are case insensitive). For example,

java -Xmx128 BigApp
java.lang.OutOfMemoryError: Java heap space

The correct command should be: java -Xmx128m BigApp. To be precise, -Xmx128 is a valid setting for very small apps, like HelloWorld. But in real life, I guess you really mean -Xmx128m.

Extra space in JVM options, or incorrect use of =. For example,

java -Xmx 128m BigApp
Invalid maximum heap size: -Xmx
Could not create the Java virtual machine.

java -Xmx=512m HelloWorld
Invalid maximum heap size: -Xmx=512m
Could not create the Java virtual machine.

The correct command should be java -Xmx128m BigApp, with no whitespace or =. -X options are different than -Dkey...