%DEMEV3 Demonstrate Bayesian regression for the RBF.
%
% Description
% The problem consists of an input variable X sampled from a
% Gaussian distribution, and a target variable T generated by computing
% SIN(2*PI*X) and adding Gaussian noise. An RBF network with linear
% outputs is trained by minimizing a sum-of-squares error function with
% an isotropic Gaussian regularizer, using the scaled conjugate gradient
% optimizer. The hyperparameters ALPHA and BETA are re-estimated using
% the function EVIDENCE. A graph is plotted of the original function,
% the training data, the trained network function, and the error bars.
%
% See also
% DEMEV1, EVIDENCE, RBF, SCG, NETEVFWD
%

% Copyright (c) Ian T Nabney (1996-2001)

clc;
disp('This demonstration illustrates the application of Bayesian')
disp('re-estimation to determine the hyperparameters in a simple regression')
disp('problem using an RBF network. It is based on the fact that the')
disp('posterior distribution for the output weights of an RBF is Gaussian')
disp('and uses the evidence maximization framework of MacKay.')
disp(' ')
disp('First, we generate a synthetic data set consisting of a single input')
disp('variable x sampled from a Gaussian distribution, and a target variable')
disp('t obtained by evaluating sin(2*pi*x) and adding Gaussian noise.')
disp(' ')
disp('Press any key to see a plot of the data together with the sine function.')
pause;

% Generate the matrix of inputs x and targets t.

ndata = 16;			% Number of data points.
noise = 0.1;			% Standard deviation of noise distribution.
randn('state', 0);
rand('state', 0);
x = 0.25 + 0.07*randn(ndata, 1);
t = sin(2*pi*x) + noise*randn(size(x));
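% Note that the inputs are concentrated around 0.25 (mean 0.25, standard
% deviation 0.07), so the network sees data in only a small part of the
% plotted range [0, 1]; the error bars added later widen away from this
% region.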

% Plot the data and the original sine function.
h = figure;
nplot = 200;
plotvals = linspace(0, 1, nplot)';
plot(x, t, 'ok')
xlabel('Input')
ylabel('Target')
hold on
axis([0 1 -1.5 1.5])
fplot('sin(2*pi*x)', [0 1], '-g')
legend('data', 'function');

disp(' ')
disp('Press any key to continue')
pause; clc;

disp('Next we create an RBF network having 3 hidden units and one')
disp('linear output. The model assumes Gaussian target noise governed by an')
disp('inverse variance hyperparameter beta, and uses a simple Gaussian prior')
disp('distribution governed by an inverse variance hyperparameter alpha.')
disp(' ');
disp('The network weights and the hyperparameters are initialised and then')
disp('the output layer weights are optimized with the scaled conjugate gradient')
disp('algorithm using the SCG function, with the hyperparameters kept')
disp('fixed. After a maximum of 50 iterations, the hyperparameters are')
disp('re-estimated using the EVIDENCE function. The process of optimizing')
disp('the weights with fixed hyperparameters and then re-estimating the')
disp('hyperparameters is repeated for a total of 5 cycles.')
disp(' ')
disp('Press any key to train the network and determine the hyperparameters.')
pause;

% Set up network parameters.
nin = 1;			% Number of inputs.
nhidden = 3;			% Number of hidden units.
nout = 1;			% Number of outputs.
alpha = 0.01;			% Initial prior hyperparameter.
beta_init = 50.0;		% Initial noise hyperparameter.

% Create and initialize network weight vector.
net = rbf(nin, nhidden, nout, 'tps', 'linear', alpha, beta_init);
[net.mask, prior] = rbfprior('tps', nin, nhidden, nout, alpha, alpha);
net = netinit(net, prior);
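% RBFPRIOR returns a mask that restricts the Bayesian treatment to the
% output-layer weights and biases; the basis function parameters are left
% fixed during training. Since the network output is linear in these
% weights, this restriction is what makes the posterior over the
% adjustable weights exactly Gaussian.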

options = foptions;
options(14) = 5;		% At most 5 EM iterations for basis functions
options(1) = -1;		% Turn off all messages
net = rbfsetbf(net, options, x);	% Initialise the basis functions
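% RBFSETBF positions the basis function centres by fitting a Gaussian
% mixture model to the input data with the EM algorithm. With the thin
% plate spline ('tps') basis function no width parameters are used, so
% only the fitted centres matter here.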

% Now train the network
nouter = 5;
ninner = 2;
options = foptions;
options(1) = 1;
options(2) = 1.0e-5;		% Absolute precision for weights.
options(3) = 1.0e-5;		% Precision for objective function.
options(14) = 50;		% Number of training cycles in inner loop.

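% Each call to EVIDENCE applies MacKay's standard re-estimation formulae
% (see, e.g., Bishop, 'Neural Networks for Pattern Recognition', ch. 10):
%   gamma = sum_i lambda_i/(lambda_i + alpha)
%   alpha = gamma/(2*E_W),   beta = (ndata - gamma)/(2*E_D)
% where the lambda_i are eigenvalues of the data-term Hessian beta*H,
% E_W is the sum-of-squares weight penalty and E_D is the sum-of-squares
% data error. gamma measures the number of well-determined parameters.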
% Train using scaled conjugate gradients, re-estimating alpha and beta.
for k = 1:nouter
  net = netopt(net, options, x, t, 'scg');
  [net, gamma] = evidence(net, x, t, ninner);
  fprintf(1, '\nRe-estimation cycle %d:\n', k);
  fprintf(1, '  alpha = %8.5f\n', net.alpha);
  fprintf(1, '  beta  = %8.5f\n', net.beta);
  fprintf(1, '  gamma = %8.5f\n\n', gamma);
  disp(' ')
  disp('Press any key to continue.')
  pause;
end

fprintf(1, 'true beta: %f\n', 1/(noise*noise));

disp(' ')
disp('Network training and hyperparameter re-estimation are now complete.')
disp('Compare the final value for the hyperparameter beta with the true')
disp('value.')
disp(' ')
disp('Notice that the final error value is close to the number of data')
disp(['points (', num2str(ndata),') divided by two.'])
disp(' ')
disp('Press any key to continue.')
pause; clc;
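% Why approximately ndata/2: at a joint maximum of the evidence the
% re-estimation formulae give 2*beta*E_D = ndata - gamma and
% 2*alpha*E_W = gamma, so the total regularized error
% beta*E_D + alpha*E_W is approximately ndata/2. As a quick check (a
% sketch using the standard Netlab calls NETPAK and NETERR to evaluate
% the regularized error at the trained weights):
fprintf(1, 'Final error: %f (ndata/2 = %f)\n', ...
  neterr(netpak(net), net, x, t), ndata/2);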
disp('We can now plot the function represented by the trained network. This')
disp('corresponds to the mean of the predictive distribution. We can also')
disp('plot ''error bars'' representing one standard deviation of the')
disp('predictive distribution around the mean.')
disp(' ')
disp('Press any key to add the network function and error bars to the plot.')
pause;

% Evaluate error bars.
[y, sig2] = netevfwd(netpak(net), net, x, t, plotvals);
sig = sqrt(sig2);
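% NETEVFWD returns the predictive variance of the evidence framework,
%   sig2 = 1/beta + g'*inv(A)*g,
% where g is the derivative of the network output with respect to the
% adjustable weights and A is the Hessian of the regularized error. For
% an RBF with linear outputs, g is just the vector of basis function
% activations, so the Gaussian predictive distribution is exact rather
% than a local approximation.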

% Plot the data, the original function, and the trained network function.
[y, z] = rbffwd(net, plotvals);
figure(h); hold on;
plot(plotvals, y, '-r')
xlabel('Input')
ylabel('Target')
plot(plotvals, y + sig, '-b');
plot(plotvals, y - sig, '-b');
legend('data', 'function', 'network', 'error bars');

disp(' ')
disp('Notice how the confidence interval spanned by the ''error bars'' is')
disp('smaller in the region of input space where the data density is high,')
disp('and becomes larger in regions away from the data.')
disp(' ')
disp('Press any key to end.')
pause; clc; close(h);
