wolffd@0: %DEMGMM1 Demonstrate EM for Gaussian mixtures.
wolffd@0: %
wolffd@0: %	Description
wolffd@0: %	This script demonstrates the use of the EM algorithm to fit a mixture
wolffd@0: %	of Gaussians to a set of data using maximum likelihood. A colour
wolffd@0: %	coding scheme is used to illustrate the evaluation of the posterior
wolffd@0: %	probabilities in the E-step of the EM algorithm.
wolffd@0: %
wolffd@0: %	See also
wolffd@0: %	DEMGMM2, DEMGMM3, DEMGMM4, GMM, GMMEM, GMMPOST
wolffd@0: %
wolffd@0: 
wolffd@0: %	Copyright (c) Ian T Nabney (1996-2001)
wolffd@0: 
wolffd@0: clc;
wolffd@0: disp('This demonstration illustrates the use of the EM (expectation-')
wolffd@0: disp('maximization) algorithm for fitting of a mixture of Gaussians to a')
wolffd@0: disp('data set by maximum likelihood.')
wolffd@0: disp(' ')
wolffd@0: disp('The data set consists of 40 data points in a 2-dimensional')
wolffd@0: disp('space, generated by sampling from a mixture of 2 Gaussian')
wolffd@0: disp('distributions.')
wolffd@0: disp(' ')
wolffd@0: disp('Press any key to see a plot of the data.')
wolffd@0: pause;
wolffd@0: 
wolffd@0: % Generate the data
wolffd@0: randn('state', 0); rand('state', 0);
wolffd@0: gmix = gmm(2, 2, 'spherical');
wolffd@0: ndat1 = 20; ndat2 = 20; ndata = ndat1+ndat2;
wolffd@0: gmix.centres =  [0.3 0.3; 0.7 0.7]; 
wolffd@0: gmix.covars = [0.01 0.01];
wolffd@0: x = gmmsamp(gmix, ndata);
wolffd@0: 
wolffd@0: h = figure;
wolffd@0: hd = plot(x(:, 1), x(:, 2), '.g', 'markersize', 30);
wolffd@0: hold on; axis([0 1 0 1]); axis square; set(gca, 'box', 'on');
wolffd@0: ht = text(0.5, 1.05, 'Data', 'horizontalalignment', 'center');
wolffd@0: disp(' ');
wolffd@0: disp('Press any key to continue.')
wolffd@0: pause; clc;
wolffd@0: 
wolffd@0: disp('We next create and initialize a mixture model consisting of a mixture')
wolffd@0: disp('of 2 Gaussians having ''spherical'' covariance matrices, using the')
wolffd@0: disp('function GMM. The Gaussian components can be displayed on the same')
wolffd@0: disp('plot as the data by drawing a contour of constant probability density')
wolffd@0: disp('for each component having radius equal to the corresponding standard')
wolffd@0: disp('deviation. Component 1 is coloured red and component 2 is coloured')
wolffd@0: disp('blue.')
wolffd@0: disp(' ')
wolffd@0: disp('Note that a particulary poor choice of initial parameters has been')
wolffd@0: disp('made in order to illustrate more effectively the operation of the')
wolffd@0: disp('EM algorithm.')
wolffd@0: disp(' ')
wolffd@0: disp('Press any key to see the initial configuration of the mixture model.')
wolffd@0: pause;
wolffd@0: 
wolffd@0: % Set up mixture model
wolffd@0: ncentres = 2; input_dim = 2;
wolffd@0: mix = gmm(input_dim, ncentres, 'spherical');
wolffd@0: 
wolffd@0: % Initialise the mixture model
wolffd@0: mix.centres = [0.2 0.8; 0.8, 0.2];
wolffd@0: mix.covars = [0.01 0.01];
wolffd@0: 
wolffd@0: % Plot the initial model
wolffd@0: ncirc = 30; theta = linspace(0, 2*pi, ncirc);
wolffd@0: xs = cos(theta); ys = sin(theta);
wolffd@0: xvals = mix.centres(:, 1)*ones(1,ncirc) + sqrt(mix.covars')*xs;
wolffd@0: yvals = mix.centres(:, 2)*ones(1,ncirc) + sqrt(mix.covars')*ys;
wolffd@0: hc(1)=line(xvals(1,:), yvals(1,:), 'color', 'r');
wolffd@0: hc(2)=line(xvals(2,:), yvals(2,:), 'color', 'b');
wolffd@0: set(ht, 'string', 'Initial Configuration');
wolffd@0: figure(h);
wolffd@0: disp(' ')
wolffd@0: disp('Press any key to continue'); 
wolffd@0: pause; clc;
wolffd@0: 
wolffd@0: disp('Now we adapt the parameters of the mixture model iteratively using the')
wolffd@0: disp('EM algorithm. Each cycle of the EM algorithm consists of an E-step')
wolffd@0: disp('followed by an M-step.  We start with the E-step, which involves the')
wolffd@0: disp('evaluation of the posterior probabilities (responsibilities) which the')
wolffd@0: disp('two components have for each of the data points.')
wolffd@0: disp(' ')
wolffd@0: disp('Since we have labelled the two components using the colours red and')
wolffd@0: disp('blue, a convenient way to indicate the value of a posterior')
wolffd@0: disp('probability for a given data point is to colour the point using a')
wolffd@0: disp('scale ranging from pure red (corresponding to a posterior probability')
wolffd@0: disp('of 1.0 for the red component and 0.0 for the blue component) through')
wolffd@0: disp('to pure blue.')
wolffd@0: disp(' ')
wolffd@0: disp('Press any key to see the result of applying the first E-step.')
wolffd@0: pause;
wolffd@0: 
wolffd@0: % Initial E-step.
wolffd@0: set(ht, 'string', 'E-step');
wolffd@0: post = gmmpost(mix, x);
wolffd@0: dcols = [post(:,1), zeros(ndata, 1), post(:,2)];
wolffd@0: delete(hd); 
wolffd@0: for i = 1 : ndata
wolffd@0:   hd(i) = plot(x(i, 1), x(i, 2), 'color', dcols(i,:), ...
wolffd@0:           'marker', '.', 'markersize', 30);
wolffd@0: end
wolffd@0: figure(h);
wolffd@0: 
wolffd@0: disp(' ');
wolffd@0: disp('Press any key to continue')
wolffd@0: pause; clc;
wolffd@0: 
wolffd@0: disp('Next we perform the corresponding M-step. This involves replacing the')
wolffd@0: disp('centres of the component Gaussians by the corresponding weighted means')
wolffd@0: disp('of the data. Thus the centre of the red component is replaced by the')
wolffd@0: disp('mean of the data set, in which each data point is weighted according to')
wolffd@0: disp('the amount of red ink (corresponding to the responsibility of')
wolffd@0: disp('component 1 for explaining that data point). The variances and mixing')
wolffd@0: disp('proportions of the two components are similarly re-estimated.')
wolffd@0: disp(' ')
wolffd@0: disp('Press any key to see the result of applying the first M-step.')
wolffd@0: pause;
wolffd@0: 
wolffd@0: % M-step.
wolffd@0: set(ht, 'string', 'M-step');
wolffd@0: options = foptions; 
wolffd@0: options(14) = 1; % A single iteration
wolffd@0: options(1) = -1; % Switch off all messages, including warning
wolffd@0: mix = gmmem(mix, x, options);
wolffd@0: delete(hc);
wolffd@0: xvals = mix.centres(:, 1)*ones(1,ncirc) + sqrt(mix.covars')*xs;
wolffd@0: yvals = mix.centres(:, 2)*ones(1,ncirc) + sqrt(mix.covars')*ys;
wolffd@0: hc(1)=line(xvals(1,:), yvals(1,:), 'color', 'r');
wolffd@0: hc(2)=line(xvals(2,:), yvals(2,:), 'color', 'b');
wolffd@0: figure(h);
wolffd@0: disp(' ')
wolffd@0: disp('Press any key to continue')
wolffd@0: pause; clc;
wolffd@0: 
wolffd@0: disp('We can continue making alternate E and M steps until the changes in')
wolffd@0: disp('the log likelihood at each cycle become sufficiently small.')
wolffd@0: disp(' ')
wolffd@0: disp('Press any key to see an animation of a further 9 EM cycles.')
wolffd@0: pause;
wolffd@0: figure(h);
wolffd@0: 
wolffd@0: % Loop over EM iterations.
wolffd@0: numiters = 9;
wolffd@0: for n = 1 : numiters
wolffd@0: 
wolffd@0:   set(ht, 'string', 'E-step');
wolffd@0:   post = gmmpost(mix, x);
wolffd@0:   dcols = [post(:,1), zeros(ndata, 1), post(:,2)];
wolffd@0:   delete(hd); 
wolffd@0:   for i = 1 : ndata
wolffd@0:     hd(i) = plot(x(i, 1), x(i, 2), 'color', dcols(i,:), ...
wolffd@0:                  'marker', '.', 'markersize', 30);
wolffd@0:   end
wolffd@0:   pause(1)
wolffd@0: 
wolffd@0:   set(ht, 'string', 'M-step');
wolffd@0:   [mix, options] = gmmem(mix, x, options);
wolffd@0:   fprintf(1, 'Cycle %4d  Error %11.6f\n', n, options(8));
wolffd@0:   delete(hc);
wolffd@0:   xvals = mix.centres(:, 1)*ones(1,ncirc) + sqrt(mix.covars')*xs;
wolffd@0:   yvals = mix.centres(:, 2)*ones(1,ncirc) + sqrt(mix.covars')*ys;
wolffd@0:   hc(1)=line(xvals(1,:), yvals(1,:), 'color', 'r');
wolffd@0:   hc(2)=line(xvals(2,:), yvals(2,:), 'color', 'b');
wolffd@0:   pause(1)
wolffd@0: 
wolffd@0: end
wolffd@0: 
wolffd@0: disp(' ');
wolffd@0: disp('Press any key to end.')
wolffd@0: pause; clc; close(h); clear all
wolffd@0: