function [linkList, featureList] = treeLinkFeatures(data, depthThresh, featureNames)
%% [linkList, featureList] = treeLinkFeatures(data, depthThresh, featureNames)
% Given a dataset, a hierarchical clustering of the data is produced, and
% the tree is then traversed such that, for each split in the data, a set
% of features is produced: the ranked features that can be used to
% separate the dataset at that point.
% data is the n-by-m matrix of content, where n is the number of samples
% and m is the number of features.
% depthThresh is the range of tree depths to traverse in the agglomerative
% clustering tree. A single value of depthThresh is treated as
% 1:depthThresh. To analyse a single layer of the tree, pass a list of
% two values, both of which are the layer to be analysed.
% featureNames is the list of feature names, so that grown trees have
% suitable names. If omitted, the feature numbers are used instead.
% featureList corresponds to the rows in linkList, with the form: column 1
% is the 5 most relevant features, column 2 is the depth and column 3 is a
% classification decision tree for the split - perhaps this should be in
% the form of a struct instead?
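%
% Example (a minimal sketch, not taken from the original code: the random
% data and the names f1..f6 are made up for illustration, and it assumes
% the helpers aglomCluster, depthCheck, traceLinkageToBinary and
% rfFeatureSelection are on the MATLAB path; note that the function also
% writes partialResults.mat to the current folder):
%   data = rand(20, 6);                        % 20 samples, 6 features
%   names = {'f1','f2','f3','f4','f5','f6'};   % hypothetical feature names
%   [linkList, featureList] = treeLinkFeatures(data, 3, names);
%   rootRow = size(data,1) - 1;                % traversal starts at this row
%   if ~isempty(featureList{rootRow,3})        % set only for analysed depths
%       view(featureList{rootRow,3});          % print the fitted tree
%   end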
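
if (nargin < 3)
    % Default to the feature numbers as names. fitctree expects
    % 'PredictorNames' as text (a string array or a cell array of
    % character vectors), so the numbers are converted first.
    featureNames = arrayfun(@num2str, 1:size(data,2), 'UniformOutput', false);
end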
if (nargin < 2)
    depthThresh = 999;
end

if (length(depthThresh) == 1)
    depthThresh = 1:depthThresh;
end

linkList = aglomCluster(data);
linkList = depthCheck(linkList);
listSize = size(data,1);

% linkList(:,4) = 0;
featureList = cell(listSize-1,3);
currentRow = [2*listSize-1];

%% Traverse the linkage tree breadth-first, starting from the root
while (~isempty(currentRow))
    if (currentRow(1) > listSize)
        % Indices above listSize map to rows of linkList (merged clusters).
        row = currentRow(1) - listSize;
        % rD = linkList(row,4);
        if any(linkList(row,4) == depthThresh)
            classList = traceLinkageToBinary(linkList, row);
            featureList{row,1} = rfFeatureSelection(data(classList>0,:), classList(classList>0));
            featureList{row,2} = linkList(row,4);
            featureList{row,3} = fitctree(data(classList>0,featureList{row,1}), classList(classList>0), 'PredictorNames', featureNames(featureList{row,1}));
        end
        % Queue both children of this node for later processing.
        currentRow = [currentRow; linkList(row,1); linkList(row,2)];
    end
    currentRow = currentRow(2:end);
    save('partialResults.mat');  % checkpoint the workspace after every node
end

end