annotate toolboxes/FullBNT-1.0.7/netlab3.3/rbftrain.m @ 0:cc4b1211e677 tip

initial commit to HG from Changeset: 646 (e263d8a21543) added further path and more save "camirversion.m"
author Daniel Wolff
date Fri, 19 Aug 2016 13:07:06 +0200
rev   line source
Daniel@0 1 function [net, options] = rbftrain(net, options, x, t)
Daniel@0 2 %RBFTRAIN Two stage training of RBF network.
Daniel@0 3 %
Daniel@0 4 % Description
Daniel@0 5 % NET = RBFTRAIN(NET, OPTIONS, X, T) uses a two stage training
Daniel@0 6 % algorithm to set the weights in the RBF model structure NET. Each row
Daniel@0 7 % of X corresponds to one input vector and each row of T contains the
Daniel@0 8 % corresponding target vector. The centres are determined by fitting a
Daniel@0 9 % Gaussian mixture model with circular covariances using the EM
Daniel@0 10 % algorithm through a call to RBFSETBF. (The mixture model is
Daniel@0 11 % initialised using a small number of iterations of the K-means
Daniel@0 12 % algorithm.) If the activation functions are Gaussians, then the basis
Daniel@0 13 % function widths are then set to the maximum inter-centre squared
Daniel@0 14 % distance.
Daniel@0 15 %
Daniel@0 16 % For linear outputs, the hidden to output weights that give rise to
Daniel@0 17 % the least squares solution can then be determined using the pseudo-
Daniel@0 18 % inverse. For neuroscale outputs, the hidden to output weights are
Daniel@0 19 % determined using the iterative shadow targets algorithm. Although
Daniel@0 20 % this two stage procedure may not give solutions with as low an error
Daniel@0 21 % as using general purpose non-linear optimisers, it is much faster.
Daniel@0 22 %
Daniel@0 23 % The options vector may have two rows: if this is the case, then the
Daniel@0 24 % second row is passed to RBFSETBF, which allows the user to specify a
Daniel@0 25 % different number of iterations for RBF and GMM training. The optional
Daniel@0 26 % parameters to RBFTRAIN have the following interpretations.
Daniel@0 27 %
Daniel@0 28 % OPTIONS(1) is set to 1 to display error values during EM training.
Daniel@0 29 %
Daniel@0 30 % OPTIONS(2) is a measure of the precision required for the value of
Daniel@0 31 % the weights W at the solution.
Daniel@0 32 %
Daniel@0 33 % OPTIONS(3) is a measure of the precision required of the objective
Daniel@0 34 % function at the solution. Both this and the previous condition must
Daniel@0 35 % be satisfied for termination.
Daniel@0 36 %
Daniel@0 37 % OPTIONS(5) is set to 1 if the basis functions parameters should
Daniel@0 38 % remain unchanged; default 0.
Daniel@0 39 %
Daniel@0 40 % OPTIONS(6) is set to 1 if the output layer weights should be set
Daniel@0 41 % using PCA. This is only relevant for Neuroscale outputs; default
Daniel@0 42 % 0.
Daniel@0 43 %
Daniel@0 44 % OPTIONS(14) is the maximum number of iterations for the shadow
Daniel@0 45 % targets algorithm; default 100.
Daniel@0 46 %
Daniel@0 47 % See also
Daniel@0 48 % RBF, RBFERR, RBFFWD, RBFGRAD, RBFPAK, RBFUNPAK, RBFSETBF
Daniel@0 49 %
Daniel@0 50
Daniel@0 51 % Copyright (c) Ian T Nabney (1996-2001)
Daniel@0 52
Daniel@0 53 % Check arguments for consistency
Daniel@0 54 switch net.outfn
Daniel@0 55 case 'linear'
Daniel@0 56 errstring = consist(net, 'rbf', x, t);
Daniel@0 57 case 'neuroscale'
Daniel@0 58 errstring = consist(net, 'rbf', x);
Daniel@0 59 otherwise
Daniel@0 60 error(['Unknown output function ', net.outfn]);
Daniel@0 61 end
Daniel@0 62 if ~isempty(errstring)
Daniel@0 63 error(errstring);
Daniel@0 64 end
Daniel@0 65
Daniel@0 66 % Allow options to have two rows: if this is the case, then the second row
Daniel@0 67 % is passed to rbfsetbf
Daniel@0 68 if size(options, 1) == 2
Daniel@0 69 setbfoptions = options(2, :);
Daniel@0 70 options = options(1, :);
Daniel@0 71 else
Daniel@0 72 setbfoptions = options;
Daniel@0 73 end
Daniel@0 74
Daniel@0 75 if(~options(14))
Daniel@0 76 options(14) = 100;
Daniel@0 77 end
Daniel@0 78 % Do we need to test for termination?
Daniel@0 79 test = (options(2) | options(3));
Daniel@0 80
Daniel@0 81 % Set up the basis function parameters to model the input data density
Daniel@0 82 % unless options(5) is set.
Daniel@0 83 if ~(logical(options(5)))
Daniel@0 84 net = rbfsetbf(net, setbfoptions, x);
Daniel@0 85 end
Daniel@0 86
Daniel@0 87 % Compute the design (or activations) matrix
Daniel@0 88 [y, act] = rbffwd(net, x);
Daniel@0 89 ndata = size(x, 1);
Daniel@0 90
Daniel@0 91 if strcmp(net.outfn, 'neuroscale') && options(6)
Daniel@0 92 % Initialise output layer weights by projecting data with PCA
Daniel@0 93 mu = mean(x);
Daniel@0 94 [pcvals, pcvecs] = pca(x, net.nout);
Daniel@0 95 xproj = (x - ones(ndata, 1)*mu)*pcvecs;
Daniel@0 96 % Now use projected data as targets to compute output layer weights
Daniel@0 97 temp = pinv([act ones(ndata, 1)]) * xproj;
Daniel@0 98 net.w2 = temp(1:net.nhidden, :);
Daniel@0 99 net.b2 = temp(net.nhidden+1, :);
Daniel@0 100 % Propagate again to compute revised outputs
Daniel@0 101 [y, act] = rbffwd(net, x);
Daniel@0 102 end
Daniel@0 103
Daniel@0 104 switch net.outfn
Daniel@0 105 case 'linear'
Daniel@0 106 % Sum of squares error function in regression model
Daniel@0 107 % Solve for the weights and biases from the activations matrix
Daniel@0 108 Phi = [act ones(ndata, 1)];
Daniel@0 109 if ~isfield(net, 'alpha')
Daniel@0 110 % No regulariser: solve using the pseudo-inverse of Phi
Daniel@0 111 temp = pinv(Phi)*t;
Daniel@0 112 elseif all(size(net.alpha) == [1 1])
Daniel@0 113 % Scalar alpha: use the regularised normal equations
Daniel@0 114 hessian = Phi'*Phi + net.alpha*eye(net.nhidden+1);
Daniel@0 115 temp = pinv(hessian)*(Phi'*t);
Daniel@0 116 else
Daniel@0 117 error('Only scalar alpha allowed');
Daniel@0 118 end
Daniel@0 119 net.w2 = temp(1:net.nhidden, :);
Daniel@0 120 net.b2 = temp(net.nhidden+1, :);
Daniel@0 121
Daniel@0 122 case 'neuroscale'
Daniel@0 123 % Use the shadow targets training algorithm
Daniel@0 124 if nargin < 4
Daniel@0 125 % If optional input distances not passed in, then use
Daniel@0 126 % Euclidean distance
Daniel@0 127 x_dist = sqrt(dist2(x, x));
Daniel@0 128 else
Daniel@0 129 x_dist = t;
Daniel@0 130 end
Daniel@0 131 Phi = [act, ones(ndata, 1)];
Daniel@0 132 % Compute the pseudo-inverse of Phi
Daniel@0 133 PhiDag = pinv(Phi);
Daniel@0 134 % Compute y_dist, distances between image points
Daniel@0 135 y_dist = sqrt(dist2(y, y));
Daniel@0 136
Daniel@0 137 % Save old weights so that we can check the termination criterion
Daniel@0 138 wold = netpak(net);
Daniel@0 139 % Compute initial error (stress) value
Daniel@0 140 errold = 0.5*(sum(sum((x_dist - y_dist).^2)));
Daniel@0 141
Daniel@0 142 % Initial value for eta
Daniel@0 143 eta = 0.1;
Daniel@0 144 k_up = 1.2;
Daniel@0 145 k_down = 0.1;
Daniel@0 146 success = 1; % Force initial gradient calculation
Daniel@0 147
Daniel@0 148 for j = 1:options(14)
Daniel@0 149 if success
Daniel@0 150 % Compute the negative error gradient with respect to network outputs
Daniel@0 151 D = (x_dist - y_dist)./(y_dist+(y_dist==0));
Daniel@0 152 temp = y';
Daniel@0 153 neg_gradient = -2.*sum(kron(D, ones(1, net.nout)) .* ...
Daniel@0 154 (repmat(y, 1, ndata) - repmat((temp(:))', ndata, 1)), 1);
Daniel@0 155 neg_gradient = (reshape(neg_gradient, net.nout, ndata))';
Daniel@0 156 end
Daniel@0 157 % Compute the shadow targets
Daniel@0 158 t = y + eta*neg_gradient;
Daniel@0 159 % Solve for the weights and biases
Daniel@0 160 temp = PhiDag * t;
Daniel@0 161 net.w2 = temp(1:net.nhidden, :);
Daniel@0 162 net.b2 = temp(net.nhidden+1, :);
Daniel@0 163
Daniel@0 164 % Do housekeeping and test for convergence
Daniel@0 165 ynew = rbffwd(net, x);
Daniel@0 166 y_distnew = sqrt(dist2(ynew, ynew));
Daniel@0 167 err = 0.5.*(sum(sum((x_dist-y_distnew).^2)));
Daniel@0 168 if err > errold
Daniel@0 169 success = 0;
Daniel@0 170 % Restore previous weights
Daniel@0 171 net = netunpak(net, wold);
Daniel@0 172 err = errold;
Daniel@0 173 eta = eta * k_down;
Daniel@0 174 else
Daniel@0 175 success = 1;
Daniel@0 176 eta = eta * k_up;
Daniel@0 177 y = ynew;
Daniel@0 178 y_dist = y_distnew;
Daniel@0 179 if test && j > 1
Daniel@0 180 w = netpak(net);
Daniel@0 181 if (max(abs(w - wold)) < options(2) && abs(err-errold) < options(3))
Daniel@0 182 options(8) = err;
Daniel@0 183 return;
Daniel@0 184 end
Daniel@0 185 end
Daniel@0 186 errold = err; % Updated only after the convergence test has used the old value
Daniel@0 187 wold = netpak(net);
Daniel@0 188 end
Daniel@0 189 if options(1)
Daniel@0 190 fprintf(1, 'Cycle %4d Error %11.6f\n', j, err)
Daniel@0 191 end
Daniel@0 192 if nargout >= 3
Daniel@0 193 errlog(j) = err;
Daniel@0 194 end
Daniel@0 195 end
Daniel@0 196 options(8) = errold;
Daniel@0 197 if (options(1) >= 0)
Daniel@0 198 disp('Warning: Maximum number of iterations has been exceeded');
Daniel@0 199 end
Daniel@0 200 otherwise
Daniel@0 201 error(['Unknown output function ', net.outfn]);
Daniel@0 202
Daniel@0 203 end
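
The second training stage for linear outputs, as described in the help text and implemented in the 'linear' case above, can be illustrated with a minimal standalone sketch. It does not call Netlab. The toy data, the evenly spaced centres, the shared width value and the alpha value are all illustrative assumptions, and the fixed centres only stand in for what RBFSETBF would normally provide.

```matlab
% Toy 1-D regression data (illustrative values, not from the original file)
ndata = 20;
x = linspace(0, 1, ndata)';       % one input vector per row
t = sin(2*pi*x);                  % one target vector per row

% Stage 1 stand-in (assumption): fixed, evenly spaced centres and a shared
% width. rbftrain instead obtains the basis functions from RBFSETBF.
nhidden = 5;
centres = linspace(0, 1, nhidden)';
width = 0.1;

% Gaussian activations: act(n, j) = exp(-(x_n - c_j)^2 / (2*width))
sqdist = (repmat(x, 1, nhidden) - repmat(centres', ndata, 1)).^2;
act = exp(-sqdist ./ (2*width));

% Stage 2: append a bias column and solve the least squares problem
Phi = [act ones(ndata, 1)];
temp = pinv(Phi)*t;               % unregularised pseudo-inverse solution
w2 = temp(1:nhidden, :);          % hidden to output weights
b2 = temp(nhidden+1, :);          % output bias

% Regularised variant with a scalar weight decay coefficient (assumed value)
alpha = 0.01;
hessian = Phi'*Phi + alpha*eye(nhidden+1);
tempreg = pinv(hessian)*(Phi'*t);

% Network outputs on the training inputs
y = act*w2 + ones(ndata, 1)*b2;
```

Because the basis functions are held fixed, the output layer is linear in its parameters, so a single pseudo-inverse replaces iterative optimisation. This is why the two stage procedure is fast.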