function [net, options, errlog] = gtmem(net, t, options)
%GTMEM  EM algorithm for Generative Topographic Mapping.
%
%   Description
%   [NET, OPTIONS, ERRLOG] = GTMEM(NET, T, OPTIONS) uses the Expectation
%   Maximization algorithm to estimate the parameters of a GTM defined by
%   a data structure NET. The matrix T represents the data whose
%   expectation is maximized, with each row corresponding to a vector.
%   It is assumed that the latent data NET.X has been set following a
%   call to GTMINIT, for example. The optional parameters have the
%   following interpretations.
%
%   OPTIONS(1) is set to 1 to display error values; also logs error
%   values in the return argument ERRLOG. If OPTIONS(1) is set to 0, then
%   only warning messages are displayed. If OPTIONS(1) is -1, then
%   nothing is displayed.
%
%   OPTIONS(3) is a measure of the absolute precision required of the
%   error function at the solution. If the change in log likelihood
%   between two steps of the EM algorithm is less than this value, then
%   the function terminates.
%
%   OPTIONS(14) is the maximum number of iterations; default 100.
%
%   The optional return value OPTIONS contains the final error value
%   (i.e. the negative data log likelihood) in OPTIONS(8).
%
%   See also
%   GTM, GTMINIT
%

%   Copyright (c) Ian T Nabney (1996-2001)

% Check that inputs are consistent
errstring = consist(net, 'gtm', t);
if ~isempty(errstring)
  error(errstring);
end

% Sort out the options
if (options(14))
  niters = options(14);
else
  niters = 100;
end

display = options(1);
store = 0;
if (nargout > 2)
  store = 1;    % Store the error values to return them
  errlog = zeros(1, niters);
end
test = 0;
if options(3) > 0.0
  test = 1;     % Test log likelihood for termination
end

% Calculate various quantities that remain constant during training
[ndata, tdim] = size(t);
ND = ndata*tdim;
[net.gmmnet.centres, Phi] = rbffwd(net.rbfnet, net.X);
Phi = [Phi ones(size(net.X, 1), 1)];
PhiT = Phi';
[K, Mplus1] = size(Phi);

A = zeros(Mplus1, Mplus1);
cholDcmp = zeros(Mplus1, Mplus1);
% Use a sparse representation for the weight regularizing matrix.
% The last diagonal entry is zeroed so that the bias is not regularized.
if (net.rbfnet.alpha > 0)
  Alpha = net.rbfnet.alpha*speye(Mplus1);
  Alpha(Mplus1, Mplus1) = 0;
end

for n = 1:niters
  % Calculate responsibilities
  [R, act] = gtmpost(net, t);
  % Calculate error value if needed
  if (display || store || test)
    prob = act*(net.gmmnet.priors)';
    % Error value is negative log likelihood of data
    e = - sum(log(max(prob,eps)));
    if store
      errlog(n) = e;
    end
    if display > 0
      fprintf(1, 'Cycle %4d  Error %11.6f\n', n, e);
    end
    if test
      if (n > 1 && abs(e - eold) < options(3))
        options(8) = e;
        return;
      else
        eold = e;
      end
    end
  end

  % Calculate matrix to be inverted (Phi'*G*Phi + alpha*I in the papers).
  % Sparse representation of G normally executes faster and saves
  % memory
  if (net.rbfnet.alpha > 0)
    A = full(PhiT*spdiags(sum(R)', 0, K, K)*Phi + ...
      (Alpha.*net.gmmnet.covars(1)));
  else
    A = full(PhiT*spdiags(sum(R)', 0, K, K)*Phi);
  end
  % A is a symmetric matrix likely to be positive definite, so try
  % fast Cholesky decomposition to calculate W, otherwise use the
  % pseudo-inverse (PINV, which is SVD-based).
  % (PhiT*(R'*t)) is computed right-to-left, as R
  % and t are normally (much) larger than PhiT.
  [cholDcmp, singular] = chol(A);
  if (singular)
    % This is a warning, so show it unless OPTIONS(1) is -1
    if (display >= 0)
      fprintf(1, ...
        'gtmem: Warning -- M-Step matrix singular, using pinv.\n');
    end
    W = pinv(A)*(PhiT*(R'*t));
  else
    W = cholDcmp \ (cholDcmp' \ (PhiT*(R'*t)));
  end
  % Put new weights into network to calculate responsibilities
  % net.rbfnet = netunpak(net.rbfnet, W);
  net.rbfnet.w2 = W(1:net.rbfnet.nhidden, :);
  net.rbfnet.b2 = W(net.rbfnet.nhidden+1, :);
  % Calculate new distances
  d = dist2(t, Phi*W);

  % Calculate new value for beta
  net.gmmnet.covars = ones(1, net.gmmnet.ncentres)*(sum(sum(d.*R))/ND);
end

options(8) = -sum(log(gtmprob(net, t)));
if (display >= 0)
  disp(maxitmess);
end
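
% Example (a hedged usage sketch, not part of the original file).
% It assumes NET has already been created and initialised (for example
% with GTM and GTMINIT, as the help text above suggests) and that T holds
% one data vector per row. The options length of 18 and the numeric
% settings are illustrative assumptions, not recommendations.
%
%   options = zeros(1, 18);   % assumed length; only entries 1, 3, 8, 14 matter here
%   options(1) = 1;           % display the error value at each cycle
%   options(3) = 1e-4;        % stop once the error changes by less than this
%   options(14) = 30;         % run at most 30 EM iterations
%   [net, options, errlog] = gtmem(net, t, options);
%   finalerr = options(8);    % final error value returned by GTMEM
%
% Requesting ERRLOG as a third output switches on error logging even when
% OPTIONS(1) is 0 or -1, because the code tests NARGOUT > 2.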