
libsvm usage


FAQ
=====
http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html#/Q4:_Training_and_prediction


DOWNLOAD
========

You only need to download one zip file from the main page. That's all.

http://www.csie.ntu.edu.tw/~cjlin/libsvm/

INSTALL
==========
make

If you want to use parameter estimation (the eval patch described below), you need to change the code a bit and then rebuild:

make clean;
make install;


DATAFORMAT
==============
label  1:feat#1  2:feat#2  3:feat#3  ...  N:feat#N
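As a sketch of this sparse format (the sample values here are made up): each row is a label followed by ascending index:value pairs, and zero-valued features may simply be omitted.

```python
# Write a few samples in libsvm sparse format: "label idx:value ...".
# Feature indices start at 1 and must be in ascending order;
# zero-valued features can be omitted entirely.
samples = [
    (+1, {1: 0.5, 3: -1.2}),   # feature 2 is zero, so it is skipped
    (-1, {2: 0.7}),
]

lines = []
for label, feats in samples:
    pairs = " ".join(f"{i}:{v}" for i, v in sorted(feats.items()))
    lines.append(f"{label} {pairs}")

print("\n".join(lines))
# 1 1:0.5 3:-1.2
# -1 2:0.7
```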


Some available data
===============
http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/

Use the heart_scale data. It works for all of plotting, cross validation, and parameter estimation.

2-class CLASSIFICATION with RBF kernel and 5-fold CV
=================================

train (a model file containing the support vectors is generated):
./svm-train -s 0 -t 2   -g  0.03125 -c 0.25   train.dat  train.model

train with CV (no support vectors or model file are produced; only the cross-validation score is printed: accuracy by default, or AUC/F-score with the eval patch below)

./svm-train -s 0 -t 2 -v 5  train.dat  > train.cv

testing:
./svm-predict test.dat train.model test.output

Now parse the output file (containing the predicted labels) to calculate sensitivity, specificity, and accuracy.
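A minimal sketch of that parsing step, assuming binary +1/-1 labels (the file names in the comment are the ones used above):

```python
# Compare the true labels (first column of test.dat) with svm-predict's
# output (one predicted label per line) to get sen, spe and accuracy.
def confusion_stats(true_labels, pred_labels, pos=1):
    """Return (sensitivity, specificity, accuracy), treating `pos` as the positive class."""
    tp = sum(t == pos and p == pos for t, p in zip(true_labels, pred_labels))
    tn = sum(t != pos and p != pos for t, p in zip(true_labels, pred_labels))
    fp = sum(t != pos and p == pos for t, p in zip(true_labels, pred_labels))
    fn = sum(t == pos and p != pos for t, p in zip(true_labels, pred_labels))
    sen = tp / (tp + fn) if tp + fn else 0.0
    spe = tn / (tn + fp) if tn + fp else 0.0
    acc = (tp + tn) / len(true_labels)
    return sen, spe, acc

# In practice the inputs would come from the files, e.g.:
#   true_labels = [int(line.split()[0]) for line in open("test.dat")]
#   pred_labels = [int(line.split()[0]) for line in open("test.output")]
```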

Regression
================

train:
./svm-train -s 3  engine.train  engine.train.model

testing (svm-predict takes the test file, then the model, then the output file):
./svm-predict  engine.test  engine.train.model  engine.output

Now parse the output file (containing the predicted values) to calculate RMSE, etc.
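A minimal RMSE sketch for that step (the true values would come from the first column of the test file, the predictions from engine.output):

```python
import math

def rmse(true_vals, pred_vals):
    """Root-mean-square error between true and predicted values."""
    n = len(true_vals)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true_vals, pred_vals)) / n)
```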

Parameter Searching for the RBF kernel (the only kernel grid.py supports)
====================================

http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/eval/index.html

a. Change the source files as described in the link above
b. Use grid.py in the tools folder
c. Read the README file inside tools to select the range of parameters
d. Use the following command:

 python grid.py -log2c -5,5,1 -log2g -4,0,1 -v 10  ../../data/heart_scale

- It searches log2 of the "c" parameter over the range [-5, 5] with step 1, and log2 of the "g" parameter over [-4, 0] with step 1, using 10-fold CV on the heart_scale data.

e. Select the maximum score (e.g. accuracy, or AUC/F-score with the eval patch) from the output file. It is reported together with log2 of each parameter, so raise 2 to that power to recover the final parameter values.

f. If you want to use a criterion other than AUC (the default), change

 double (*validation_function)(const dvec_t&, const ivec_t&) = auc;

in eval.cpp to the evaluation function you prefer. You can also assign "precision", "recall", "fscore", or "bac" here.
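Step e above amounts to base-2 exponentiation. A minimal sketch (the best log2 values here are made up for illustration):

```python
# grid.py reports the best (log2c, log2g); the actual SVM parameters
# are recovered by raising 2 to those powers.
best_log2c, best_log2g = 3, -2   # hypothetical best values from grid.py output

C = 2.0 ** best_log2c            # the -c value to pass to svm-train
gamma = 2.0 ** best_log2g        # the -g value to pass to svm-train
print(C, gamma)
# 8.0 0.25
```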

FEATURE SELECTION
===================



Windows
=============

svm-train -s 3  D:\matlabWorkspace\fuelPerfHeavyNapthaSVM\engine.train.svm D:\matlabWorkspace\fuelPerfHeavyNapthaSVM\engine.train.svm.model
svm-predict    D:\matlabWorkspace\fuelPerfHeavyNapthaSVM\engine.test.svm  D:\matlabWorkspace\fuelPerfHeavyNapthaSVM\engine.train.svm.model D:\matlabWorkspace\fuelPerfHeavyNapthaSVM\predicted

Install R in linux ============ In CRAN home page, the latest version is not available. So, in fedora, Open the terminal yum list R  --> To check the latest available version of r yum install R --> install R version yum update R --> update current version to latest one 0 find help ============ ?exact topic name (  i.e.   ?mean ) 0.0 INSTALL 3rd party package  ==================== install.packages('mvtnorm' , dependencies = TRUE , lib='/home/alamt/myRlibrary/')   #  install new package BED file parsing (Always use read.delim it is the best) library(MASS) #library(ggplot2) dirRoot="D:/research/F5shortRNA/TestRIKEN/Rscripts/" dirData="D:/research/F5shortRNA/TestRIKEN/" setwd(dirRoot) getwd() myBed="test.bed" fnmBed=paste(dirData, myBed, sep="") # ccdsHh19.bed   tmp.bed ## Read bed use read.delim - it is the  best mybed=read.delim(fnmBed, header = FALSE, sep = "\t", quote = &q