annotate toolboxes/MIRtoolbox1.3.2/MIRToolbox/netlabgmminit.m @ 0:cc4b1211e677 tip

initial commit to HG from Changeset: 646 (e263d8a21543) added further path and more save "camirversion.m"
author Daniel Wolff
date Fri, 19 Aug 2016 13:07:06 +0200
rev   line source
Daniel@0 1 function mix = netlabgmminit(mix, x, options)
Daniel@0 2 %GMMINIT Initialises Gaussian mixture model from data
Daniel@0 3 %(Renamed NETLABGMMINIT in MIRtoolbox to avoid conflict with statistics
Daniel@0 4 %toolbox)
Daniel@0 5 % Description
Daniel@0 6 % MIX = NETLABGMMINIT(MIX, X, OPTIONS) uses a dataset X to initialise the
Daniel@0 7 % parameters of a Gaussian mixture model defined by the data structure
Daniel@0 8 % MIX. The k-means algorithm is used to determine the centres. The
Daniel@0 9 % priors are computed from the proportion of examples belonging to each
Daniel@0 10 % cluster. The covariance matrices are calculated as the sample
Daniel@0 11 % covariance of the points associated with (i.e. closest to) the
Daniel@0 12 % corresponding centres. For a mixture-of-PPCA model, the PPCA
Daniel@0 13 % decomposition is calculated for the points closest to a given centre.
Daniel@0 14 % This initialisation can be used as the starting point for training
Daniel@0 15 % the model using the EM algorithm.
Daniel@0 16 %
Daniel@0 17 % See also
Daniel@0 18 % GMM
Daniel@0 19 %
Daniel@0 20
Daniel@0 21 % Copyright (c) Ian T Nabney (1996-2001)
Daniel@0 22
Daniel@0 23 [ndata, xdim] = size(x);
Daniel@0 24
Daniel@0 25 % Check that inputs are consistent
Daniel@0 26 errstring = consist(mix, 'gmm', x);
Daniel@0 27 if ~isempty(errstring)
Daniel@0 28     error(errstring);
Daniel@0 29 end
Daniel@0 30
Daniel@0 31 % Arbitrary width used if variance collapses to zero: make it 'large' so
Daniel@0 32 % that centre is responsible for a reasonable number of points.
Daniel@0 33 GMM_WIDTH = 1.0;
Daniel@0 34
Daniel@0 35 % Use kmeans algorithm to set centres
Daniel@0 36 options(5) = 1;
Daniel@0 37 [mix.centres, options, post] = netlabkmeans(mix.centres, x, options);
Daniel@0 38
Daniel@0 39 % Set priors depending on number of points in each cluster
Daniel@0 40 cluster_sizes = max(sum(post, 1), 1); % Make sure that no prior is zero
Daniel@0 41 mix.priors = cluster_sizes/sum(cluster_sizes); % Normalise priors
Daniel@0 42
Daniel@0 43 switch mix.covar_type
Daniel@0 44     case 'spherical'
Daniel@0 45         if mix.ncentres > 1
Daniel@0 46             % Determine widths as squared distance to nearest centre
Daniel@0 47             % (or a constant if this is zero)
Daniel@0 48             cdist = dist2(mix.centres, mix.centres);
Daniel@0 49             cdist = cdist + diag(ones(mix.ncentres, 1)*realmax);
Daniel@0 50             mix.covars = min(cdist);
Daniel@0 51             mix.covars = mix.covars + GMM_WIDTH*(mix.covars < eps);
Daniel@0 52         else
Daniel@0 53             % Just use variance of all data points averaged over all
Daniel@0 54             % dimensions
Daniel@0 55             mix.covars = mean(diag(cov(x)));
Daniel@0 56         end
Daniel@0 57     case 'diag'
Daniel@0 58         for j = 1:mix.ncentres
Daniel@0 59             % Pick out data points belonging to this centre
Daniel@0 60             c = x(find(post(:, j)),:);
Daniel@0 61             diffs = c - (ones(size(c, 1), 1) * mix.centres(j, :));
Daniel@0 62             mix.covars(j, :) = sum((diffs.*diffs), 1)/size(c, 1);
Daniel@0 63             % Replace small entries by GMM_WIDTH value
Daniel@0 64             mix.covars(j, :) = mix.covars(j, :) + GMM_WIDTH.*(mix.covars(j, :)<eps);
Daniel@0 65         end
Daniel@0 66     case 'full'
Daniel@0 67         for j = 1:mix.ncentres
Daniel@0 68             % Pick out data points belonging to this centre
Daniel@0 69             c = x(find(post(:, j)),:);
Daniel@0 70             diffs = c - (ones(size(c, 1), 1) * mix.centres(j, :));
Daniel@0 71             mix.covars(:,:,j) = (diffs'*diffs)/(size(c, 1));
Daniel@0 72             % Add GMM_WIDTH*Identity to rank-deficient covariance matrices
Daniel@0 73             if rank(mix.covars(:,:,j)) < mix.nin
Daniel@0 74                 mix.covars(:,:,j) = mix.covars(:,:,j) + GMM_WIDTH.*eye(mix.nin);
Daniel@0 75             end
Daniel@0 76         end
Daniel@0 77     case 'ppca'
Daniel@0 78         for j = 1:mix.ncentres
Daniel@0 79             % Pick out data points belonging to this centre
Daniel@0 80             c = x(find(post(:,j)),:);
Daniel@0 81             diffs = c - (ones(size(c, 1), 1) * mix.centres(j, :));
Daniel@0 82             [tempcovars, tempU, templambda] = ...
Daniel@0 83                 ppca((diffs'*diffs)/size(c, 1), mix.ppca_dim);
Daniel@0 84             if length(templambda) ~= mix.ppca_dim
Daniel@0 85                 error('Unable to extract enough components');
Daniel@0 86             else
Daniel@0 87                 mix.covars(j) = tempcovars;
Daniel@0 88                 mix.U(:, :, j) = tempU;
Daniel@0 89                 mix.lambda(j, :) = templambda;
Daniel@0 90             end
Daniel@0 91         end
Daniel@0 92     otherwise
Daniel@0 93         error(['Unknown covariance type ', mix.covar_type]);
Daniel@0 94 end
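
The prior and spherical-width steps in the listing can be tried in isolation. The sketch below is not part of MIRtoolbox: the toy `post` and `centres` values are hypothetical, and the squared-distance matrix is computed inline as a stand-in for Netlab's `dist2`, which returns squared Euclidean distances. It uses the variable name `widths` where the listing writes to `mix.covars`.

```matlab
% Sketch only: toy values are hypothetical, not taken from MIRtoolbox.
% post is a one-of-N assignment matrix (rows: points, columns: clusters).
post = [1 0 0; 1 0 0; 0 1 0; 1 0 0; 0 1 0];   % cluster 3 has no points

% Priors: cluster proportions, with an empty cluster counted as one point
% so that no prior is exactly zero.
cluster_sizes = max(sum(post, 1), 1);          % [3 2 1]
priors = cluster_sizes / sum(cluster_sizes);

% Spherical widths: squared distance from each centre to its nearest
% neighbouring centre (inline stand-in for dist2).
centres = [0 0; 3 4; 0 1];
n = size(centres, 1);
sq = sum(centres.^2, 2);
cdist = sq*ones(1, n) + ones(n, 1)*sq' - 2*(centres*centres');
cdist = cdist + diag(ones(n, 1)*realmax);      % mask self-distances
GMM_WIDTH = 1.0;
widths = min(cdist);                           % [1 18 1]
widths = widths + GMM_WIDTH*(widths < eps);    % fallback for coincident centres

disp(priors);
disp(widths);
```

Because the widths are squared distances, they act as variances rather than standard deviations. The `GMM_WIDTH` fallback only applies when two centres coincide.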
Daniel@0 95
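
One edge case in the 'diag' branch is worth noting. The listing divides by `size(c, 1)`, which is zero for a cluster with no assigned points. The result is NaN, and since `NaN < eps` is false, the `GMM_WIDTH` replacement does not fire. The sketch below shows one possible guard. The `isempty` check and the choice of `GMM_WIDTH` as the fallback variance are assumptions for illustration, not behaviour of the original function, and the data values are hypothetical.

```matlab
% Sketch only: the empty-cluster guard is an assumed addition, not Netlab code.
x = [0 0; 2 0; 0 2];
post = [1 0; 1 0; 1 0];            % hypothetical: cluster 2 is empty
centres = [mean(x, 1); 5 5];
GMM_WIDTH = 1.0;

% What the unguarded formula yields for an empty cluster: 0/0 in every dimension.
unguarded = sum(zeros(0, 2), 1) / 0;           % [NaN NaN]

covars = zeros(2, 2);
for j = 1:2
    c = x(post(:, j) > 0, :);
    if isempty(c)
        % Assumed fallback: use the default width for a cluster with no points
        covars(j, :) = GMM_WIDTH;
        continue
    end
    diffs = c - ones(size(c, 1), 1) * centres(j, :);
    covars(j, :) = sum(diffs.*diffs, 1) / size(c, 1);
    % Replace small entries by the GMM_WIDTH value, as in the listing
    covars(j, :) = covars(j, :) + GMM_WIDTH.*(covars(j, :) < eps);
end

disp(covars);
```

Whether an empty cluster can occur in practice depends on how `netlabkmeans` handles centres that attract no points. The `max(sum(post, 1), 1)` guard on the priors suggests the original author anticipated the case.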