function featureVector = rfFeatureSelection(data, labels, numFeatures, iterMethod, numTrees, featureVector)
%% rfFeatureSelection(data, labels, numFeatures, iterMethod, numTrees, featureVector)
%
% Uses random forests to perform feature selection for a given data set.
%   data has size (x,y), where x is the number of samples (one label per
%       row) and y is the number of features.
%   labels is the set of labels for the data, one per row of data.
%   numFeatures is the dimension of the output vector (default 5).
%   iterMethod is the method by which the features are cut down
%       (default 'onePass'):
%       * 'onePass' simply selects the top (numFeatures) features and
%           reports them.
%       * 'cutX' iteratively cuts the bottom X percent of features (at
%           least one per pass) and reruns random forest feature
%           selection on the new set, until the desired number of
%           features has been reached, e.g. 'cut10'.
%       * 'featureDeltaErr' iteratively cuts the features that negatively
%           impact the results, as given by OOBPermutedVarDeltaError
%           (at least one per pass, so the recursion terminates).
%   numTrees is the number of trees grown per forest (default 200).
%   featureVector is a list of the features (column indices of data) to
%       use, for recursive purposes (default: all columns).

    % Check the argument count first, so a missing labels argument gives
    % the intended message rather than an undefined-variable error.
    if(nargin < 2)
        error('must pass data and labels into function')
    end
    if(length(labels) ~= size(data,1))
        error('labels and data do not match up');
    end

    if(nargin < 3)
        numFeatures = 5;
    end
    if(nargin < 4)
        iterMethod = 'onePass';
    end
    if(nargin < 5)
        numTrees = 200;
    end
    % featureVector is the sixth argument, so its default applies when
    % fewer than six arguments are given (the original tested nargin < 5).
    if(nargin < 6)
        featureVector = 1:size(data,2);
    end

    if(length(featureVector) > numFeatures)
        options = statset('UseParallel', true);
        b = TreeBagger(numTrees, data(:,featureVector), labels,'OOBVarImp','On',...
            'SampleWithReplacement', 'Off','FBoot', 0.632,'Options', options);
        % Rank the features from most to least important.
        [FI,I] = sort(b.OOBPermutedVarDeltaError,'descend');
        featureVector = featureVector(I);

        if(strcmp(iterMethod,'onePass'))
            featureVector = featureVector(1:numFeatures);
        elseif(strncmp(iterMethod,'cut',3))
            % strncmp avoids an indexing error when iterMethod is shorter
            % than three characters.
            cutPercentage = str2double(iterMethod(4:end));
            cutSize = max(floor(length(featureVector)*cutPercentage/100),1);
            if(length(featureVector) - cutSize < numFeatures)
                cutSize = length(featureVector) - numFeatures;
            end
            featureVector = featureVector(1:end-cutSize);
            featureVector = rfFeatureSelection(data, labels, numFeatures, iterMethod, numTrees, featureVector);
        elseif(strcmp(iterMethod,'featureDeltaErr'))
            % Cut at least one feature per pass; with no negative delta
            % errors the original cut nothing and recursed on an unchanged
            % feature set.
            cutSize = max(sum(FI<0),1);
            if(length(featureVector) - cutSize < numFeatures)
                cutSize = length(featureVector) - numFeatures;
            end
            featureVector = featureVector(1:end-cutSize);
            featureVector = rfFeatureSelection(data, labels, numFeatures, iterMethod, numTrees, featureVector);
        else
            error('unknown iterMethod: %s', iterMethod);
        end
    end
end
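
% Example usage (a hedged sketch, not part of the original file). It assumes
% the Statistics and Machine Learning Toolbox is available for TreeBagger;
% the synthetic data and the variable names below are made up for
% illustration only.
%
%   rng(1);                                        % reproducible random data
%   data   = randn(300, 20);                       % 300 samples, 20 features
%   labels = double(data(:,3) + data(:,7) > 0);    % only columns 3 and 7 carry signal
%
%   top5 = rfFeatureSelection(data, labels);                 % 'onePass', 5 features
%   top4 = rfFeatureSelection(data, labels, 4, 'cut25');     % drop 25% per pass
%   kept = rfFeatureSelection(data, labels, 4, 'featureDeltaErr', 100);
%
% Columns 3 and 7 would be expected to rank near the top, but random forest
% importances vary between runs, so no particular output is guaranteed.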