function net = mlp(nin, nhidden, nout, outfunc, prior, beta)
%MLP    Create a 2-layer feedforward network.
%
%   Description
%   NET = MLP(NIN, NHIDDEN, NOUT, FUNC) takes the number of inputs,
%   hidden units and output units for a 2-layer feed-forward network,
%   together with a string FUNC which specifies the output unit
%   activation function, and returns a data structure NET. The weights
%   are drawn from a zero-mean, unit-variance isotropic Gaussian, with
%   the variance then scaled by 1/(fan-in + 1) for the hidden or output
%   units as appropriate. This makes use of the Matlab function RANDN
%   and so the seed for the random weight initialization can be set
%   using RANDN('STATE', S) where S is the seed value. The hidden units
%   use the TANH activation function.
%
%   The fields in NET are
%     type = 'mlp'
%     nin = number of inputs
%     nhidden = number of hidden units
%     nout = number of outputs
%     nwts = total number of weights and biases
%     outfn = string describing the output unit activation function:
%         'linear'
%         'logistic'
%         'softmax'
%     w1 = first-layer weight matrix
%     b1 = first-layer bias vector
%     w2 = second-layer weight matrix
%     b2 = second-layer bias vector
%   Here W1 has dimensions NIN times NHIDDEN, B1 has dimensions 1 times
%   NHIDDEN, W2 has dimensions NHIDDEN times NOUT, and B2 has dimensions
%   1 times NOUT.
%
%   NET = MLP(NIN, NHIDDEN, NOUT, FUNC, PRIOR), in which PRIOR is a
%   scalar, allows the field NET.ALPHA in the data structure NET to be
%   set, corresponding to a zero-mean isotropic Gaussian prior whose
%   inverse variance is PRIOR.
%   Alternatively, PRIOR can consist of a data structure with fields
%   ALPHA and INDEX, allowing individual Gaussian priors to be set over
%   groups of weights in the network. Here ALPHA is a column vector in
%   which each element corresponds to a separate group of weights, which
%   need not be mutually exclusive. The membership of the groups is
%   defined by the matrix INDEX in which the columns correspond to the
%   elements of ALPHA. Each column has one element for each weight in
%   the network, in the order defined by the function MLPPAK, and each
%   element is 1 or 0 according to whether the weight is a member of the
%   corresponding group or not. A utility function MLPPRIOR is provided
%   to help in setting up the PRIOR data structure.
%
%   NET = MLP(NIN, NHIDDEN, NOUT, FUNC, PRIOR, BETA) also sets the
%   additional field NET.BETA in the data structure NET, where BETA
%   corresponds to the inverse noise variance.
%
%   See also
%   MLPPRIOR, MLPPAK, MLPUNPAK, MLPFWD, MLPERR, MLPBKP, MLPGRAD
%

%   Copyright (c) Ian T Nabney (1996-2001)

net.type = 'mlp';
net.nin = nin;
net.nhidden = nhidden;
net.nout = nout;
% Each layer has one bias per unit in addition to its weights.
net.nwts = (nin + 1)*nhidden + (nhidden + 1)*nout;

outfns = {'linear', 'logistic', 'softmax'};

% Reject any output activation function that is not in the list above.
if ~any(strcmp(outfunc, outfns))
  error('Undefined output function. Exiting.');
else
  net.outfn = outfunc;
end

% Optional prior: a scalar ALPHA, or a structure with ALPHA and INDEX.
if nargin > 4
  if isstruct(prior)
    net.alpha = prior.alpha;
    net.index = prior.index;
  elseif isscalar(prior)
    net.alpha = prior;
  else
    error('prior must be a scalar or a structure');
  end
end

% Gaussian initialization, scaled by 1/sqrt(fan-in + 1) for each layer.
net.w1 = randn(nin, nhidden)/sqrt(nin + 1);
net.b1 = randn(1, nhidden)/sqrt(nin + 1);
net.w2 = randn(nhidden, nout)/sqrt(nhidden + 1);
net.b2 = randn(1, nout)/sqrt(nhidden + 1);

% Optional inverse noise variance.
if nargin == 6
  net.beta = beta;
end
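
%   Example
%   A minimal usage sketch, not part of the original source. The sizes,
%   seed, prior and noise precision below are illustrative assumptions
%   chosen only to show how the arguments map onto the fields of NET.
%
%     randn('state', 42);                      % fix the seed, as described above
%     net = mlp(4, 3, 2, 'linear', 0.01, 50);  % scalar prior, then BETA
%     size(net.w1)                             % NIN x NHIDDEN, here 4 x 3
%     size(net.b2)                             % 1 x NOUT, here 1 x 2
%     net.nwts                                 % (4 + 1)*3 + (3 + 1)*2 = 23
%     net.alpha                                % 0.01, taken from PRIOR
%     net.beta                                 % 50, taken from BETA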