wolffd@0: %DEMGMM1 Demonstrate EM for Gaussian mixtures. wolffd@0: % wolffd@0: % Description wolffd@0: % This script demonstrates the use of the EM algorithm to fit a mixture wolffd@0: % of Gaussians to a set of data using maximum likelihood. A colour wolffd@0: % coding scheme is used to illustrate the evaluation of the posterior wolffd@0: % probabilities in the E-step of the EM algorithm. wolffd@0: % wolffd@0: % See also wolffd@0: % DEMGMM2, DEMGMM3, DEMGMM4, GMM, GMMEM, GMMPOST wolffd@0: % wolffd@0: wolffd@0: % Copyright (c) Ian T Nabney (1996-2001) wolffd@0: wolffd@0: clc; wolffd@0: disp('This demonstration illustrates the use of the EM (expectation-') wolffd@0: disp('maximization) algorithm for fitting of a mixture of Gaussians to a') wolffd@0: disp('data set by maximum likelihood.') wolffd@0: disp(' ') wolffd@0: disp('The data set consists of 40 data points in a 2-dimensional') wolffd@0: disp('space, generated by sampling from a mixture of 2 Gaussian') wolffd@0: disp('distributions.') wolffd@0: disp(' ') wolffd@0: disp('Press any key to see a plot of the data.') wolffd@0: pause; wolffd@0: wolffd@0: % Generate the data wolffd@0: randn('state', 0); rand('state', 0); wolffd@0: gmix = gmm(2, 2, 'spherical'); wolffd@0: ndat1 = 20; ndat2 = 20; ndata = ndat1+ndat2; wolffd@0: gmix.centres = [0.3 0.3; 0.7 0.7]; wolffd@0: gmix.covars = [0.01 0.01]; wolffd@0: x = gmmsamp(gmix, ndata); wolffd@0: wolffd@0: h = figure; wolffd@0: hd = plot(x(:, 1), x(:, 2), '.g', 'markersize', 30); wolffd@0: hold on; axis([0 1 0 1]); axis square; set(gca, 'box', 'on'); wolffd@0: ht = text(0.5, 1.05, 'Data', 'horizontalalignment', 'center'); wolffd@0: disp(' '); wolffd@0: disp('Press any key to continue.') wolffd@0: pause; clc; wolffd@0: wolffd@0: disp('We next create and initialize a mixture model consisting of a mixture') wolffd@0: disp('of 2 Gaussians having ''spherical'' covariance matrices, using the') wolffd@0: disp('function GMM. The Gaussian components can be displayed on the same') wolffd@0: disp('plot as the data by drawing a contour of constant probability density') wolffd@0: disp('for each component having radius equal to the corresponding standard') wolffd@0: disp('deviation. Component 1 is coloured red and component 2 is coloured') wolffd@0: disp('blue.') wolffd@0: disp(' ') wolffd@0: disp('Note that a particulary poor choice of initial parameters has been') wolffd@0: disp('made in order to illustrate more effectively the operation of the') wolffd@0: disp('EM algorithm.') wolffd@0: disp(' ') wolffd@0: disp('Press any key to see the initial configuration of the mixture model.') wolffd@0: pause; wolffd@0: wolffd@0: % Set up mixture model wolffd@0: ncentres = 2; input_dim = 2; wolffd@0: mix = gmm(input_dim, ncentres, 'spherical'); wolffd@0: wolffd@0: % Initialise the mixture model wolffd@0: mix.centres = [0.2 0.8; 0.8, 0.2]; wolffd@0: mix.covars = [0.01 0.01]; wolffd@0: wolffd@0: % Plot the initial model wolffd@0: ncirc = 30; theta = linspace(0, 2*pi, ncirc); wolffd@0: xs = cos(theta); ys = sin(theta); wolffd@0: xvals = mix.centres(:, 1)*ones(1,ncirc) + sqrt(mix.covars')*xs; wolffd@0: yvals = mix.centres(:, 2)*ones(1,ncirc) + sqrt(mix.covars')*ys; wolffd@0: hc(1)=line(xvals(1,:), yvals(1,:), 'color', 'r'); wolffd@0: hc(2)=line(xvals(2,:), yvals(2,:), 'color', 'b'); wolffd@0: set(ht, 'string', 'Initial Configuration'); wolffd@0: figure(h); wolffd@0: disp(' ') wolffd@0: disp('Press any key to continue'); wolffd@0: pause; clc; wolffd@0: wolffd@0: disp('Now we adapt the parameters of the mixture model iteratively using the') wolffd@0: disp('EM algorithm. Each cycle of the EM algorithm consists of an E-step') wolffd@0: disp('followed by an M-step. We start with the E-step, which involves the') wolffd@0: disp('evaluation of the posterior probabilities (responsibilities) which the') wolffd@0: disp('two components have for each of the data points.') wolffd@0: disp(' ') wolffd@0: disp('Since we have labelled the two components using the colours red and') wolffd@0: disp('blue, a convenient way to indicate the value of a posterior') wolffd@0: disp('probability for a given data point is to colour the point using a') wolffd@0: disp('scale ranging from pure red (corresponding to a posterior probability') wolffd@0: disp('of 1.0 for the red component and 0.0 for the blue component) through') wolffd@0: disp('to pure blue.') wolffd@0: disp(' ') wolffd@0: disp('Press any key to see the result of applying the first E-step.') wolffd@0: pause; wolffd@0: wolffd@0: % Initial E-step. wolffd@0: set(ht, 'string', 'E-step'); wolffd@0: post = gmmpost(mix, x); wolffd@0: dcols = [post(:,1), zeros(ndata, 1), post(:,2)]; wolffd@0: delete(hd); wolffd@0: for i = 1 : ndata wolffd@0: hd(i) = plot(x(i, 1), x(i, 2), 'color', dcols(i,:), ... wolffd@0: 'marker', '.', 'markersize', 30); wolffd@0: end wolffd@0: figure(h); wolffd@0: wolffd@0: disp(' '); wolffd@0: disp('Press any key to continue') wolffd@0: pause; clc; wolffd@0: wolffd@0: disp('Next we perform the corresponding M-step. This involves replacing the') wolffd@0: disp('centres of the component Gaussians by the corresponding weighted means') wolffd@0: disp('of the data. Thus the centre of the red component is replaced by the') wolffd@0: disp('mean of the data set, in which each data point is weighted according to') wolffd@0: disp('the amount of red ink (corresponding to the responsibility of') wolffd@0: disp('component 1 for explaining that data point). The variances and mixing') wolffd@0: disp('proportions of the two components are similarly re-estimated.') wolffd@0: disp(' ') wolffd@0: disp('Press any key to see the result of applying the first M-step.') wolffd@0: pause; wolffd@0: wolffd@0: % M-step. wolffd@0: set(ht, 'string', 'M-step'); wolffd@0: options = foptions; wolffd@0: options(14) = 1; % A single iteration wolffd@0: options(1) = -1; % Switch off all messages, including warning wolffd@0: mix = gmmem(mix, x, options); wolffd@0: delete(hc); wolffd@0: xvals = mix.centres(:, 1)*ones(1,ncirc) + sqrt(mix.covars')*xs; wolffd@0: yvals = mix.centres(:, 2)*ones(1,ncirc) + sqrt(mix.covars')*ys; wolffd@0: hc(1)=line(xvals(1,:), yvals(1,:), 'color', 'r'); wolffd@0: hc(2)=line(xvals(2,:), yvals(2,:), 'color', 'b'); wolffd@0: figure(h); wolffd@0: disp(' ') wolffd@0: disp('Press any key to continue') wolffd@0: pause; clc; wolffd@0: wolffd@0: disp('We can continue making alternate E and M steps until the changes in') wolffd@0: disp('the log likelihood at each cycle become sufficiently small.') wolffd@0: disp(' ') wolffd@0: disp('Press any key to see an animation of a further 9 EM cycles.') wolffd@0: pause; wolffd@0: figure(h); wolffd@0: wolffd@0: % Loop over EM iterations. wolffd@0: numiters = 9; wolffd@0: for n = 1 : numiters wolffd@0: wolffd@0: set(ht, 'string', 'E-step'); wolffd@0: post = gmmpost(mix, x); wolffd@0: dcols = [post(:,1), zeros(ndata, 1), post(:,2)]; wolffd@0: delete(hd); wolffd@0: for i = 1 : ndata wolffd@0: hd(i) = plot(x(i, 1), x(i, 2), 'color', dcols(i,:), ... wolffd@0: 'marker', '.', 'markersize', 30); wolffd@0: end wolffd@0: pause(1) wolffd@0: wolffd@0: set(ht, 'string', 'M-step'); wolffd@0: [mix, options] = gmmem(mix, x, options); wolffd@0: fprintf(1, 'Cycle %4d Error %11.6f\n', n, options(8)); wolffd@0: delete(hc); wolffd@0: xvals = mix.centres(:, 1)*ones(1,ncirc) + sqrt(mix.covars')*xs; wolffd@0: yvals = mix.centres(:, 2)*ones(1,ncirc) + sqrt(mix.covars')*ys; wolffd@0: hc(1)=line(xvals(1,:), yvals(1,:), 'color', 'r'); wolffd@0: hc(2)=line(xvals(2,:), yvals(2,:), 'color', 'b'); wolffd@0: pause(1) wolffd@0: wolffd@0: end wolffd@0: wolffd@0: disp(' '); wolffd@0: disp('Press any key to end.') wolffd@0: pause; clc; close(h); clear all wolffd@0: