Netlab Reference Manual

olgd


Purpose

On-line gradient descent optimization.

Description

[net, options, errlog, pointlog] = olgd(net, options, x, t) uses on-line gradient descent to find a local minimum of the error function for the network net computed on the input data x and target values t. A log of the error values after each cycle is (optionally) returned in errlog, and a log of the points visited is (optionally) returned in pointlog. Because the gradient is computed on-line (i.e. after each pattern), this can be quite inefficient in Matlab.

The error function value at the final weight vector is returned in options(8).
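For example, a minimal sketch of the full calling form (assuming net and options have already been set up, and that x and t are data matrices with one pattern per row):

[net, options, errlog, pointlog] = olgd(net, options, x, t);
finalerr = options(8);   % error value at the final weight vector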

The optional parameters have the following interpretations.

If options(1) is set to 1, error values are displayed; the error values are also logged in the return argument errlog, and the points visited in the return argument pointlog. If options(1) is set to 0, then only warning messages are displayed. If options(1) is -1, then nothing is displayed.

options(2) is the precision required for the weight values at the solution. If the absolute difference between the weight values on two successive steps is less than options(2), then this condition is satisfied.

options(3) is the precision required of the objective function at the solution. If the absolute difference between the error function values on two successive steps is less than options(3), then this condition is satisfied. Both this and the previous condition must be satisfied for termination. Note that testing the function value at each iteration roughly halves the speed of the algorithm.
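As an illustration, both termination tolerances might be set before calling olgd (the values here are illustrative, not recommendations):

options(2) = 1e-6;   % tolerance on the change in the weights
options(3) = 1e-6;   % tolerance on the change in the error function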

options(5) determines whether the patterns are sampled randomly with replacement. If it is 1, patterns are sampled randomly with replacement; if it is 0 (the default), then patterns are sampled in order.

options(6) determines whether the learning rate decays. If it is 1, then the learning rate decays at a rate of 1/t. If it is 0 (the default), then the learning rate is constant.
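For example, random sampling with a decaying learning rate would be requested by:

options(5) = 1;   % sample patterns randomly with replacement
options(6) = 1;   % decay the learning rate as 1/t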

options(9) should be set to 1 to check the user-defined gradient function.

options(10) returns the total number of function evaluations (including those in any line searches).

options(11) returns the total number of gradient evaluations.
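After olgd returns, these counts can be read back from the options vector, for example:

fprintf('Function evaluations: %d\n', options(10));
fprintf('Gradient evaluations: %d\n', options(11));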

options(14) is the maximum number of iterations (passes through the complete pattern set); default 100.

options(17) is the momentum; default 0.5.

options(18) is the learning rate; default 0.01.
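A sketch setting these three training parameters together (the values are illustrative):

options(14) = 200;    % at most 200 passes through the pattern set
options(17) = 0.9;    % momentum
options(18) = 0.001;  % learning rate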

Examples

The following example performs on-line gradient descent on an MLP with random sampling from the pattern set.
net = mlp(5, 3, 1, 'linear');    % MLP with 5 inputs, 3 hidden units, 1 linear output
options = foptions;              % default optimisation options vector
options(18) = 0.01;              % learning rate
options(5) = 1;                  % sample patterns randomly with replacement
net = olgd(net, options, x, t);  % x and t are the training inputs and targets
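If a record of the training run is wanted as well, the call could enable logging and capture errlog; a sketch, relying on options(1) as described above:

options(1) = 1;                                    % display and log error values
[net, options, errlog] = olgd(net, options, x, t);
plot(errlog);                                      % error after each cycle
xlabel('Cycle'); ylabel('Error');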

See Also

graddesc

Copyright (c) Ian T Nabney (1996-9)