camir-aes2014: toolboxes/FullBNT-1.0.7/bnt/examples/static/dtree/test_restaurants.m @ 0:e9a9cd732c1e (tip)
"first hg version after svn"
author: wolffd
date:   Tue, 10 Feb 2015 15:05:51 +0000
% Here the training data is adapted from the Russell 95 book. See restaurant.names for a description.
% (1) Using information gain as the split score, we get the same decision tree as the book Russell 95 (page 537),
% and Gain(Patrons) is 0.5409, equal to the result on page 541 of Russell 95 (see the output trace below).
% (Note: the dtree in that book has a small typographical error: the Type node hangs off the YES branch of the Hungry node, not the NO branch.)
% (2) Using gain ratio (Quinlan 93), the splitting disfavors attributes with many values (e.g. the Type attribute here).
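The Gain(Patrons) = 0.5409 figure can be sanity-checked outside the toolbox, directly from the class counts of the 12 training cases. A minimal sketch in Python (not part of the MATLAB script; the per-branch (yes, no) counts are read off the restaurant table in Russell 95):

```python
from math import log2

def entropy(counts):
    """Entropy (bits) of a class distribution given as counts."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c)

# (yes, no) class counts per value of Patrons -- None, Some, Full --
# over the 12 training cases of Russell 95.
branches = [(0, 2), (4, 0), (2, 4)]
n = sum(a + b for a, b in branches)

# Gain = entropy before the split minus the weighted entropy of the branches.
remainder = sum((a + b) / n * entropy([a, b]) for a, b in branches)
gain = entropy([6, 6]) - remainder
print(round(gain, 4))  # 0.5409
```

Only the Full branch is impure (2 yes, 4 no), so the remainder is 6/12 of its entropy and the gain comes out at 0.5409, matching the trace below.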

dtreeCPD = tree_CPD;

% load data
fname = fullfile(BNT_HOME, 'examples', 'static', 'uci_data', 'restaurant', 'restaurant.data');
data = load(fname);
data = data';

% make the data BNT compliant (values for discrete nodes must run from 1 to n, where n is the node size);
% e.g. if the values are [0 1 6], they must be mapped to [1 2 3]
%data = transform_data(data, 'tmp.dat', []); % no continuous nodes here
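The remapping described in the comment above can be sketched as follows (an illustration only, in Python; `to_bnt_values` is a hypothetical helper, not part of BNT):

```python
# Remap each discrete value to its 1-based rank, so that a column whose
# raw values are [0 1 6] becomes [1 2 3], as BNT requires.
def to_bnt_values(column):
    lut = {v: i + 1 for i, v in enumerate(sorted(set(column)))}
    return [lut[v] for v in column]

print(to_bnt_values([0, 1, 6, 1, 0]))  # [1, 2, 3, 2, 1]
```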

% learn decision tree from data
ns = 2*ones(1,11);
ns(5:6) = 3;
ns(9:10) = 4;
dtreeCPD1 = learn_params(dtreeCPD, 1:11, data, ns, []);

% evaluate on data
[score, outputs] = evaluate_tree_performance(dtreeCPD1, 1:11, data, ns, []);
fprintf('Accuracy in training data %6.3f\n', score);

% show decision tree using graphpad


% --------------------------Output trace: using Information-Gain------------------------------
% The splits are Patrons, Hungry, Type, Fri/Sat
% *********************************
% Create node 1 split at 5 gain 0.5409 Th 0. Class 1 Cases 12 Error 6
% Create leaf node(onecla) 2. Class 1 Cases 2 Error 0
% Add subtree node 2 to 1. #nodes 2
% Create leaf node(onecla) 3. Class 2 Cases 4 Error 0
% Add subtree node 3 to 1. #nodes 3
% Create node 4 split at 4 gain 0.2516 Th 0. Class 1 Cases 6 Error 2
% Create leaf node(onecla) 5. Class 1 Cases 2 Error 0
% Add subtree node 5 to 4. #nodes 5
% Create node 6 split at 9 gain 0.5000 Th 0. Class 1 Cases 4 Error 2
% Create leaf node(nullset) 7. Father 6 Class 1
% Create node 8 split at 3 gain 1.0000 Th 0. Class 1 Cases 2 Error 1
% Create leaf node(onecla) 9. Class 1 Cases 1 Error 0
% Add subtree node 9 to 8. #nodes 9
% Create leaf node(onecla) 10. Class 2 Cases 1 Error 0
% Add subtree node 10 to 8. #nodes 10
% Add subtree node 8 to 6. #nodes 10
% Create leaf node(onecla) 11. Class 2 Cases 1 Error 0
% Add subtree node 11 to 6. #nodes 11
% Create leaf node(onecla) 12. Class 1 Cases 1 Error 0
% Add subtree node 12 to 6. #nodes 12
% Add subtree node 6 to 4. #nodes 12
% Add subtree node 4 to 1. #nodes 12
% ********************************
%
% Note:
% *** Create node 4 split at 4 gain 0.2516 Th 0. Class 1 Cases 6 Error 2
% This means we create a new node, number 4, that splits on attribute 4, with an info-gain of 0.2516.
% "Th 0" is the threshold used when splitting a continuous attribute, "Class 1" means the majority class
% at node 4 is 1, "Cases 6" means it has 6 cases attached to it, and "Error 2" means there would be two
% errors if the class label of every case in it were changed to the majority class.
% *** Add subtree node 12 to 6. #nodes 12
% This means we add the child node 12 to node 6.
% *** Create leaf node(onecla) 10. Class 2 Cases 1 Error 0
% Here 'onecla' means all cases in this node belong to one class, so there is no need to split further;
% 'nullset' means no training cases fall into this node, so it takes its parent node's majority class.
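The gain in a trace line can be re-derived from the Cases/Error counts it reports. For node 4 above (6 cases, 2 errors, hence class counts 4 vs 2) splitting into node 5 (2 cases, pure) and node 6 (4 cases, 2 errors), a quick check in Python (illustration only; class counts inferred from the trace):

```python
from math import log2

def entropy(counts):
    """Entropy (bits) of a class distribution given as counts."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c)

parent = [4, 2]        # node 4: Cases 6, Error 2 -> majority 4, minority 2
children = [[2, 0],    # node 5: Cases 2, Error 0 (pure)
            [2, 2]]    # node 6: Cases 4, Error 2
n = sum(parent)

# Info-gain = parent entropy minus case-weighted entropy of the children.
gain = entropy(parent) - sum(sum(c) / n * entropy(c) for c in children)
print(round(gain, 4))  # 0.2516
```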
%
%
%
% ---------------Output trace: using GainRatio-----------------------
% The splits are Patrons, Hungry, Fri/Sat, Price
%
%
% Create node 1 split at 5 gain 0.3707 Th 0. Class 1 Cases 12 Error 6
% Create leaf node(onecla) 2. Class 1 Cases 2 Error 0
% Add subtree node 2 to 1. #nodes 2
% Create leaf node(onecla) 3. Class 2 Cases 4 Error 0
% Add subtree node 3 to 1. #nodes 3
% Create node 4 split at 4 gain 0.2740 Th 0. Class 1 Cases 6 Error 2
% Create leaf node(onecla) 5. Class 1 Cases 2 Error 0
% Add subtree node 5 to 4. #nodes 5
% Create node 6 split at 3 gain 0.3837 Th 0. Class 1 Cases 4 Error 2
% Create leaf node(onecla) 7. Class 1 Cases 1 Error 0
% Add subtree node 7 to 6. #nodes 7
% Create node 8 split at 6 gain 1.0000 Th 0. Class 2 Cases 3 Error 1
% Create leaf node(onecla) 9. Class 2 Cases 2 Error 0
% Add subtree node 9 to 8. #nodes 9
% Create leaf node(nullset) 10. Father 8 Class 2
% Create leaf node(onecla) 11. Class 1 Cases 1 Error 0
% Add subtree node 11 to 8. #nodes 11
% Add subtree node 8 to 6. #nodes 11
% Add subtree node 6 to 4. #nodes 11
% Add subtree node 4 to 1. #nodes 11
%
%
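The 0.3707 at the root of the gain-ratio trace is consistent with Quinlan's definition: information gain divided by the split information of the attribute, which penalizes many-valued splits. A quick check in Python (illustration only; the per-branch (yes, no) counts for Patrons are read off the restaurant table in Russell 95):

```python
from math import log2

def entropy(counts):
    """Entropy (bits) of a class distribution given as counts."""
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c)

# (yes, no) class counts per value of Patrons: None, Some, Full.
branches = [(0, 2), (4, 0), (2, 4)]
n = sum(a + b for a, b in branches)

gain = entropy([6, 6]) - sum((a + b) / n * entropy([a, b]) for a, b in branches)
# Split information = entropy of the branch sizes themselves, H(2/12, 4/12, 6/12).
split_info = entropy([a + b for a, b in branches])
print(round(gain / split_info, 4))  # 0.3707
```

Since split information grows with the number of branch values, an attribute like Type (four values, zero gain on this data) is disfavored relative to Patrons, which is why the two traces choose different splits below the root.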