%SOM_DEMO1 Basic properties and behaviour of the Self-Organizing Map.

% Contributed to SOM Toolbox 2.0, February 11th, 2000 by Juha Vesanto
% http://www.cis.hut.fi/projects/somtoolbox/

% Version 1.0beta juuso 071197
% Version 2.0beta juuso 030200

clf reset;
figure(gcf)
echo on


clc
% ==========================================================
% SOM_DEMO1 - BEHAVIOUR AND PROPERTIES OF SOM
% ==========================================================

% som_make       - Create, initialize and train a SOM.
% som_randinit   - Create and initialize a SOM.
% som_lininit    - Create and initialize a SOM.
% som_seqtrain   - Train a SOM.
% som_batchtrain - Train a SOM.
% som_bmus       - Find best-matching units (BMUs).
% som_quality    - Measure quality of SOM.

% SELF-ORGANIZING MAP (SOM):

% A self-organizing map (SOM) is a "map" of the training data,
% dense where there is a lot of data and sparse where the data
% density is low.

% The map consists of neurons located on a regular map grid.
% The lattice of the grid can be either hexagonal or rectangular.

subplot(1,2,1)
som_cplane('hexa',[10 15],'none')
title('Hexagonal SOM grid')

subplot(1,2,2)
som_cplane('rect',[10 15],'none')
title('Rectangular SOM grid')

% Each neuron (hexagon on the left, rectangle on the right) has an
% associated prototype vector. After training, neighboring neurons
% have similar prototype vectors.

% The SOM can be used for data visualization, clustering (or
% classification), estimation and a variety of other purposes.

pause % Strike any key to continue...

clf
clc
% INITIALIZE AND TRAIN THE SELF-ORGANIZING MAP
% ============================================

% Here are 300 data points sampled from the unit square:

D = rand(300,2);

% The map will be a 2-dimensional grid of size 10 x 10.

msize = [10 10];

% SOM_RANDINIT and SOM_LININIT can be used to initialize the
% prototype vectors in the map. The map size is actually an
% optional argument. If omitted, it is determined automatically
% based on the number of data vectors and the principal
% eigenvectors of the data set. Below, the random initialization
% algorithm is used.

sMap = som_randinit(D, 'msize', msize);

% Actually, each map unit can be thought of as having two sets
% of coordinates:
%  (1) in the input space:  the prototype vectors
%  (2) in the output space: the position on the map
% In the two spaces, the map looks like this:

subplot(1,3,1)
som_grid(sMap)
axis([0 11 0 11]), view(0,-90), title('Map in output space')

subplot(1,3,2)
plot(D(:,1),D(:,2),'+r'), hold on
som_grid(sMap,'Coord',sMap.codebook)
title('Map in input space')

% The black dots show the positions of the map units, and the gray
% lines show connections between neighboring map units. Since the
% map was initialized randomly, the positions in the input space
% are completely disorganized. The red crosses are training data.

pause % Strike any key to train the SOM...

% During training, the map organizes itself and folds to fit the
% training data. Here, the sequential training algorithm is used:

sMap = som_seqtrain(sMap,D,'radius',[5 1],'trainlen',10);

subplot(1,3,3)
som_grid(sMap,'Coord',sMap.codebook)
hold on, plot(D(:,1),D(:,2),'+r')
title('Trained map')

pause % Strike any key to look at the training process more closely...


clf

clc
% TRAINING THE SELF-ORGANIZING MAP
% ================================

% To get a better idea of what happens during training, let's look
% at how the map gradually unfolds and organizes itself. To make
% this clearer, the map is now initialized away from the data.

sMap = som_randinit(D,'msize',msize);
sMap.codebook = sMap.codebook + 1;

subplot(1,2,1)
som_grid(sMap,'Coord',sMap.codebook)
hold on, plot(D(:,1),D(:,2),'+r'), hold off
title('Data and original map')

% The training is based on two principles:
%
% Competitive learning: the prototype vector most similar to a
% data vector is modified so that it is even more similar to it.
% This way the map learns the position of the data cloud.
%
% Cooperative learning: not only the most similar prototype
% vector, but also its neighbors on the map are moved towards the
% data vector. This way the map self-organizes.

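% To make the two principles concrete, here is a sketch of a single
% sequential update step for one data vector. It mirrors the idea
% behind SOM_SEQTRAIN, but is not its implementation; the learning
% rate a0 and radius sigma below are illustrative values only.

x  = D(1,:);                                 % one training data vector
M  = sMap.codebook;                          % current prototype vectors
[dummy,b] = min(sum((M - ones(size(M,1),1)*x).^2,2)); % competition: find the BMU
Uc = som_unit_coords(sMap);                  % unit coordinates on the map grid
d2 = sum((Uc - ones(size(Uc,1),1)*Uc(b,:)).^2,2);     % squared map distances to BMU
a0 = 0.1; sigma = 2;                         % assumed learning rate and radius
h  = a0*exp(-d2/(2*sigma^2));                % cooperation: gaussian neighborhood
M  = M + (h*ones(1,size(M,2))).*(ones(size(M,1),1)*x - M); % move BMU and neighbors

% (M is a local copy; sMap itself is left untouched for the demo below.)
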
pause % Strike any key to train the map...

echo off
subplot(1,2,2)
o = ones(5,1);
r = (1-[1:60]/60);
for i=1:60,
  sMap = som_seqtrain(sMap,D,'tracking',0,...
                      'trainlen',5,'samples',...
                      'alpha',0.1*o,'radius',(4*r(i)+1)*o);
  som_grid(sMap,'Coord',sMap.codebook)
  hold on, plot(D(:,1),D(:,2),'+r'), hold off
  title(sprintf('%d/300 training steps',5*i))
  drawnow
end
title('Sequential training after 300 steps')
echo on

pause % Strike any key to continue with 3D data...

clf

clc
% TRAINING DATA: THE UNIT CUBE
% ============================

% Above, the map dimension was equal to the input space dimension:
% both were 2-dimensional. Typically, the input space dimension is
% much higher than the 2-dimensional map. In this case the map can
% no longer follow the data set perfectly, but must find a balance
% between two goals:

%  - data representation accuracy
%  - data set topology representation accuracy

% Here are 500 data points sampled from the unit cube:

D = rand(500,3);

subplot(1,3,1), plot3(D(:,1),D(:,2),D(:,3),'+r')
view(3), axis on, rotate3d on
title('Data')

% The ROTATE3D command enables you to rotate the picture by
% dragging the pointer over the picture while holding down the
% left mouse button.

pause % Strike any key to train the SOM...



clc
% DEFAULT TRAINING PROCEDURE
% ==========================

% Above, the initialization was done randomly and the training was
% done with the sequential training function (SOM_SEQTRAIN). By
% default, the initialization is linear and the batch training
% algorithm is used. In addition, the training is done in two
% phases: first with a large neighborhood radius, and then
% fine-tuning with a small radius.

% The function SOM_MAKE can be used to both initialize and train
% the map using default parameters:

pause % Strike any key to use SOM_MAKE...

sMap = som_make(D);

% Here, the linear initialization is done again, so that
% the results can be compared.

sMap0 = som_lininit(D);

subplot(1,3,2)
som_grid(sMap0,'Coord',sMap0.codebook,...
         'Markersize',2,'Linecolor','k','Surf',sMap0.codebook(:,3))
axis([0 1 0 1 0 1]), view(-120,-25), title('After initialization')

subplot(1,3,3)
som_grid(sMap,'Coord',sMap.codebook,...
         'Markersize',2,'Linecolor','k','Surf',sMap.codebook(:,3))
axis([0 1 0 1 0 1]), view(3), title('After training'), hold on

% Here you can see that the 2-dimensional map has folded into the
% 3-dimensional space in order to capture the whole data space.

pause % Strike any key to evaluate the quality of the maps...



clc
% BEST-MATCHING UNITS (BMU)
% =========================

% Before turning to quality, an important concept needs to be
% introduced: the Best-Matching Unit (BMU). The BMU of a data
% vector is the unit on the map whose model vector best resembles
% the data vector. In practice, the similarity is measured as the
% minimum distance between the data vector and each model vector on
% the map. The BMUs can be calculated using the function SOM_BMUS,
% which returns the index of the unit.

% Here the BMU is searched for the origin point (from the
% trained map):

bmu = som_bmus(sMap,[0 0 0]);

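% The BMU search is just a nearest-neighbor search in the input
% space. As a sketch (not the SOM_BMUS implementation), the same
% index can be found by hand:

x = [0 0 0];
dist2 = sum((sMap.codebook - ones(size(sMap.codebook,1),1)*x).^2,2);
[dummy,bmu_check] = min(dist2);   % should equal the bmu found above
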
% Here the corresponding unit is shown in the figure. You can
% rotate the figure to see better where the BMU is.

co = sMap.codebook(bmu,:);
text(co(1),co(2),co(3),'BMU','Fontsize',20)
plot3([0 co(1)],[0 co(2)],[0 co(3)],'ro-')

pause % Strike any key to analyze map quality...



clc
% SELF-ORGANIZING MAP QUALITY
% ===========================

% A map has two primary quality properties:
%  - data representation accuracy
%  - data set topology representation accuracy

% The former is usually measured using the average quantization
% error between the data vectors and their BMUs on the map. For the
% latter, several measures have been proposed, e.g. the topographic
% error measure: the percentage of data vectors for which the first
% and second BMUs are not adjacent units.

% Both measures have been implemented in the SOM_QUALITY function.
% Here are the quality measures for the trained map:

[q,t] = som_quality(sMap,D)

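% As a cross-check (a sketch, not the SOM_QUALITY implementation),
% the quantization error is simply the mean distance between the
% data vectors and their BMUs, which SOM_BMUS can return directly
% as its second output:

[dummy,qerrs] = som_bmus(sMap,D);
mean(qerrs)   % average quantization error; should be close to q above
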
% And here for the initial map:

[q0,t0] = som_quality(sMap0,D)

% As can be seen, by folding the SOM has reduced the average
% quantization error, but on the other hand the topology
% representation capability has suffered. By using a larger final
% neighborhood radius in the training, the map becomes stiffer and
% preserves the topology of the data set better.


echo off
