Posts

Showing posts from November, 2013

Feature subset selection Using Genetic Algorithm in MATLAB

function callGeneticAlgo
    global mat
    global trainInd
    global testInd
    % 70% training / 0% validation / 30% test split over 1420 samples
    [trainInd,~,testInd] = dividerand(1420,0.7,0,0.3);
    global counter
    global errList
    counter = 1;
    errList = [];
    fileName = '../features/alltopPNPDMF.feature';
    mat = load(fileName);
    % param1 = number of features, excluding the label
    % param2 = population size
    % param3 = time limit in seconds (3 hours = 10800 sec)
    [x,fval,exitflag,output,population,score] = gaFeaSelection(1588,100,10800);
    dlmwrite('selected.GA',x,'delimiter','\n');
    display('Done');
end

function [x,fval,exitflag,output,population,score] = gaFeaSelection(nvars,PopulationSize_Data,TimeLimit_Data)
    % This is an auto generated MATLAB file from Optimization Tool.
    % Start with the default options
    options = gaoptimset;
    % Modify options setting
    options = gaoptimset(options,'PopulationType', 'bitString');
    options = gaoptimset(options,'PopulationSize', PopulationSize_Data);
    options = gaoptimset(options,'TimeLimit', T
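The fitness function that the GA minimizes is not shown in this excerpt. As a minimal sketch of what a bit-string fitness function could look like, the script below scores a feature mask by the test error of a nearest-centroid classifier. Everything in it is an assumption made for illustration: the synthetic data, the interleaved train/test split, the classifier, and the names `fitness`, `d0`, and `d1` are hypothetical and are not taken from the post. It relies on implicit expansion, so it assumes MATLAB R2016b or later.

```matlab
% Hypothetical bit-string fitness for feature selection (illustration only).
rng(0);                                  % reproducible toy data
n = 200;
y = [zeros(n/2,1); ones(n/2,1)];         % two classes, labels 0 and 1
informative = randn(n,2) + 3*[y y];      % columns 1-2 separate the classes
noise = randn(n,3);                      % columns 3-5 carry no class signal
X = [informative noise];

% Simple interleaved split (assumption; the post uses dividerand instead).
Xtr = X(1:2:n,:);  ytr = y(1:2:n);
Xte = X(2:2:n,:);  yte = y(2:2:n);

% Squared distance of each test row to each class centroid,
% computed only over the columns selected by the logical mask m.
d0 = @(m) sum((Xte(:,m) - mean(Xtr(ytr==0,m),1)).^2, 2);
d1 = @(m) sum((Xte(:,m) - mean(Xtr(ytr==1,m),1)).^2, 2);

% Fitness = test error rate of the nearest-centroid rule (lower is better).
fitness = @(bits) mean((d1(logical(bits)) < d0(logical(bits))) ~= yte);

errInformative = fitness([1 1 0 0 0]);   % mask keeping the informative columns
errNoise       = fitness([0 0 1 1 1]);   % mask keeping only the noise columns
fprintf('informative: %.3f, noise only: %.3f\n', errInformative, errNoise);
```

A function handle of this shape, taking a row of bits and returning a scalar to minimize, is what `ga` expects as its first argument when `PopulationType` is `'bitString'`.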

Feature subset selection toolbox collection

0. DEAP: Evolutionary Algorithms Made Easy
   Genetic algorithm based multi-objective feature selection techniques.
   http://jmlr.org/papers/volume13/fortin12a/fortin12a.pdf
1. Weka
   Filter, Wrapper
2. Java-ML: A Machine Learning Library
   http://jmlr.org/papers/volume10/abeel09a/abeel09a.pdf
   Entropy based methods (4)
   Stepwise addition/removal (2)
   SVMRFE
   Random forests
   Ensemble feature selection
3. MATLAB
   Sequential feature selection: http://www.mathworks.com/help/stats/feature-selection.html
   Genetic Algorithm based: http://www.mathworks.com/matlabcentral/fileexchange/29553-feature-selector-based-on-genetic-algorithms-and-information-theory/content/GA_feature_selector.m
4. KEEL
   http://sci2s.ugr.es/keel/algorithms.php#featureselection

Imbalanced set problems: Tools review to solve

1. Weka (Java based)
   You can subsample the majority class (try the filter SpreadSubsample, GSVM-RU).
   You can oversample the minority class, creating synthetic examples (try SMOTE).
   You can make your classifier cost sensitive (try the metaclassifier CostSensitiveClassifier).
   http://weka.wikispaces.com/space/content?tag=cost-sensitive
   Each of these methods has its own strengths and weaknesses; refer to the papers referenced in the documentation of each one.
   If you use any of these and you need accurate probability estimates, you can use isotonic regression to calibrate the output.
2. MATLAB
   The RUSBoost algorithm is available from the fitensemble function. An example is shown here: http://www.mathworks.com/help/stats/ensemble-methods.html#btgw1m1
   Additional suggestions on imbalanced data are here: http://www.mathworks.com/matlabcentral/answers/11549-leraning-classification-with-most-training-samples-in-one-category
   This advice is a
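To make the idea of subsampling the majority class concrete, here is a minimal sketch of plain random undersampling in MATLAB. It is not the SpreadSubsample filter or RUSBoost, only the basic resampling step they build on; the toy labels and the names `minorityIdx`, `majorityIdx`, and `balancedIdx` are hypothetical.

```matlab
% Random undersampling of the majority class (illustration only).
rng(1);                                   % reproducible sampling
labels = [zeros(90,1); ones(10,1)];       % toy data: 90 majority, 10 minority

minorityIdx = find(labels == 1);
majorityIdx = find(labels == 0);

% Draw as many majority rows as there are minority rows, without replacement.
pick = randperm(numel(majorityIdx), numel(minorityIdx));
keptMajority = majorityIdx(pick);

% Row indices of the balanced subset, in their original order.
balancedIdx = sort([minorityIdx; keptMajority]);
balancedLabels = labels(balancedIdx);

fprintf('majority kept: %d, minority kept: %d\n', ...
    sum(balancedLabels == 0), sum(balancedLabels == 1));
```

The same `balancedIdx` would be used to index the feature matrix, so that features and labels stay aligned. Undersampling discards data, which is one reason cost-sensitive learning or oversampling may be preferable on small sets.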

MATLAB optimization toolbox usage with genetic algorithm

Useful tutorial:
http://www.mathworks.com/products/global-optimization/description3.html

Best example of an implementation with constraints and an objective function:
http://www.mathworks.com/help/gads/examples/constrained-minimization-using-the-genetic-algorithm.html

More about how to use multi-objective optimization:
http://www.mathworks.com/discovery/multiobjective-optimization.html
http://www.mathworks.com/help/gads/examples/performing-a-multiobjective-optimization-using-the-genetic-algorithm.html
http://www.mathworks.com/help/gads/examples/multiobjective-genetic-algorithm-options.html

Example
GAMULTOBJ (can handle multiple objectives)
GA (can handle 1 objective)

Constrained Minimization Problem
We want to minimize a simple fitness function of two variables x1 and x2:

   min_x f(x) = 100 * (x1^2 - x2)^2 + (1 - x1)^2
   min_x f(x) = 100 * (x1^2 + x2)^2 + (1 + x1)^2

such that the following two nonlinear constraints and bounds are satisfied:

   x1*x2 + x1 - x2 + 1.5 <= 0,
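As a rough sketch of how the first objective and the one constraint visible in this excerpt translate into MATLAB, both can be written as anonymous functions and checked by hand at a couple of points before any solver is involved. The names `f`, `c`, and `isFeasible`, and the two test points, are chosen here for illustration; the remaining constraint and the bounds are cut off in the excerpt and are deliberately left out.

```matlab
% Objective and the single visible nonlinear constraint (illustration only).
f = @(x) 100 * (x(1)^2 - x(2))^2 + (1 - x(1))^2;   % fitness to minimize
c = @(x) x(1)*x(2) + x(1) - x(2) + 1.5;            % feasible when c(x) <= 0
isFeasible = @(x) c(x) <= 0;

xA = [1 1];    % unconstrained minimizer of f
xB = [0 2];    % a point that satisfies the visible constraint

fprintf('xA: f = %g, c = %g, feasible = %d\n', f(xA), c(xA), isFeasible(xA));
fprintf('xB: f = %g, c = %g, feasible = %d\n', f(xB), c(xB), isFeasible(xB));
```

The point `[1 1]` gives the smallest possible objective value but violates the constraint, which is why the constrained optimum has to lie elsewhere. To hand this to `ga`, the constraint function must return two outputs (inequalities and equalities), for example `nonlcon = @(x) deal(c(x), [])`, and the call would take the form `ga(f, 2, [], [], [], [], [], [], nonlcon)`, with the bound arguments left empty here only because the excerpt does not state them.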