
MATLAB matrix to Weka .arff format conversion


Input format (the matrix is tab-separated and the last column holds the class label. This version handles a two-class problem;
a multiclass problem needs changes to a few lines of the code)
======================================================================
5.0  6.5  7.9 +1
6.6  8.9  6.1 -1

Code
=======
function matlabToarff


% convert a matrix into ARFF (Attribute-Relation File Format)
clc;
fNameData = 'seqLabel';
fNameARFF = 'seqLabel.arff';

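matrix = load(fNameData);   % read the numeric matrix from the text file
fidARFF = fopen( fNameARFF ,'w');
if fidARFF == -1
    error('Cannot open %s for writing.', fNameARFF);
end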
feature = matrix ( : , 1:end-1);
label = matrix (: , end) ;
noFeature = size(feature,2);
noSample = size(feature,1);

%%%%%%%%%% header

fprintf(fidARFF,'%s\n\n','@RELATION LNCRNAsequence');
for i=1:noFeature % noFeature
         fprintf(fidARFF,'%s\t%d\t%s\n' ,'@ATTRIBUTE' , i, 'NUMERIC' );
end
fprintf(fidARFF,'%s\n\n','@ATTRIBUTE class {+1,-1 }');

%%%%%%%%%%  data
fprintf(fidARFF,'%s\n','@DATA');
for r=1:noSample
     for c=1:noFeature
          fprintf(fidARFF,'%f,',matrix(r,c) );
     end
     if label(r)==1
            fprintf(fidARFF,'%s\n', '+1');
     else
            fprintf(fidARFF,'%s\n', '-1');
     end
end

fclose(fidARFF);

end
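
The note above says a multiclass problem needs a few lines changed. A minimal sketch of one way to do that follows. It is not the original code: the function name matrixToArffMulticlass and its two arguments are hypothetical, and it assumes the last column holds integer class labels (for example 1, 2, 3). The class list in the header is built from the labels found in the data instead of being hard-coded as {+1,-1 }.

```matlab
function matrixToArffMulticlass(matrix, fNameARFF)
% Hypothetical multiclass variant of matlabToarff.
% Assumption: the last column of matrix holds integer class labels.

feature = matrix(:, 1:end-1);
label   = matrix(:, end);
classes = unique(label);               % sorted distinct labels

fid = fopen(fNameARFF, 'w');
if fid == -1
    error('Cannot open %s for writing.', fNameARFF);
end

%%%%%%%%%% header
fprintf(fid, '%s\n\n', '@RELATION LNCRNAsequence');
for i = 1:size(feature, 2)
    fprintf(fid, '%s\t%d\t%s\n', '@ATTRIBUTE', i, 'NUMERIC');
end
% build the nominal class list from the data, e.g. 1,2,3
classList = sprintf('%d,', classes);
classList(end) = [];                   % drop the trailing comma
fprintf(fid, '@ATTRIBUTE class {%s}\n\n', classList);

%%%%%%%%%% data
fprintf(fid, '%s\n', '@DATA');
for r = 1:size(feature, 1)
    fprintf(fid, '%f,', feature(r, :)); % format is reused for every feature value
    fprintf(fid, '%d\n', label(r));     % write the label as it appears in the data
end

fclose(fid);
end
```

Example call, assuming the same tab-separated input file as above: matrixToArffMulticlass(load('seqLabel'), 'seqLabel.arff');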

Comments

  1. Is it possible to convert the Weka ARFF format to a matrix in MATLAB?

