[net, options, errlog, pointlog] = olgd(net, options, x, t) uses on-line gradient descent to find a local minimum of the error function for the network net computed on the input data x and target values t. A log of the error values after each cycle is (optionally) returned in errlog, and a log of the points visited is (optionally) returned in pointlog. Because the gradient is computed on-line (i.e. after each pattern), this can be quite inefficient in Matlab.

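As a rough illustration of what "on-line" means here, the following self-contained sketch applies one weight update per pattern to a toy linear model. It does not call Netlab: the data, the variable names (w, eta, mu, dwold) and the squared-error gradient are assumptions made for the example, and the momentum update shown is one common form rather than a transcription of olgd.

```matlab
% Toy data: targets generated by a known linear map (hypothetical example).
x = [0 1; 1 0; 1 1; 2 1];          % 4 patterns, 2 inputs
wtrue = [0.5; -1];
t = x * wtrue;                     % 4 targets, one per pattern

w     = zeros(2, 1);               % initial weights
eta   = 0.05;                      % learning rate (assumed value)
mu    = 0.5;                       % momentum (assumed value)
dwold = zeros(size(w));            % previous weight change

for cycle = 1:200                  % passes through the complete pattern set
  for k = 1:size(x, 1)             % one weight update per pattern
    xk   = x(k, :);
    grad = xk' * (xk * w - t(k));  % gradient of 0.5*(xk*w - t(k))^2
    dw   = mu * dwold - eta * grad;  % momentum term plus gradient step
    w    = w + dw;
    dwold = dw;
  end
end

disp(w');                          % should be close to wtrue'
```

The inner loop over single patterns is what makes the method slow in an interpreted language: the gradient cannot be computed for the whole data matrix in one vectorised call.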
The error function value at the final weight vector is returned in options(8).

The optional parameters have the following interpretations.

options(1) is set to 1 to display error values; this also logs error values in the return argument errlog, and the points visited in the return argument pointlog. If options(1) is set to 0, then only warning messages are displayed. If options(1) is -1, then nothing is displayed.

options(2) is the precision required for the value of x at the solution, where x here denotes the vector being optimised (the network weights), not the input data. If the absolute difference between the values of x at two successive steps is less than options(2), then this condition is satisfied.

options(3) is the precision required of the objective function at the solution. If the absolute difference between the error function values at two successive steps is less than options(3), then this condition is satisfied. Both this and the previous condition must be satisfied for termination. Note that testing the function value at each iteration roughly halves the speed of the algorithm.

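The combined termination test described for options(2) and options(3) can be sketched as follows. The numbers are made up, the names wold, w, fold and fnew are hypothetical, and taking the largest componentwise change in the weights is an assumption about how "absolute difference" is measured.

```matlab
% Hypothetical values at two successive steps.
wold = [0.5 -1.0];
w    = [0.5000004 -1.0000002];
fold = 0.1234567;
fnew = 0.1234561;

options = zeros(1, 18);
options(2) = 1e-6;                 % precision required on the weights
options(3) = 1e-6;                 % precision required on the error value

weights_ok = max(abs(w - wold)) < options(2);
error_ok   = abs(fnew - fold)   < options(3);

% Both conditions must hold before the optimisation stops.
if weights_ok && error_ok
  disp('Both conditions satisfied: terminate');
else
  disp('Keep iterating');
end
```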
options(5) determines whether the patterns are sampled randomly with replacement. If it is 0 (the default), then patterns are sampled in order.

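The two presentation orders selected by options(5) might look like this in practice. This is a minimal sketch: the variable names and the use of ceil(rand(...)) to draw indices are assumptions for illustration.

```matlab
ndata = 6;                                    % number of patterns (made up)

% options(5) = 0: patterns presented in order.
in_order = 1:ndata;

% options(5) = 1: patterns drawn at random with replacement, so some
% indices may repeat and others may not appear at all in a given pass.
with_replacement = ceil(rand(ndata, 1) * ndata);

disp(in_order);
disp(with_replacement');
```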
options(6) determines if the learning rate decays. If it is 1 then the learning rate decays at a rate of 1/t. If it is 0 (the default) then the learning rate is constant.

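A 1/t decay schedule can be sketched as below. The text does not say what t counts; this example assumes it is the number of pattern presentations so far, and the names eta, lr and tcount are hypothetical.

```matlab
eta   = 0.01;                      % initial learning rate
ndata = 4;                         % patterns per cycle (made up)

for cycle = 1:2
  for k = 1:ndata
    tcount = (cycle - 1) * ndata + k;  % assumed: updates performed so far
    lr = eta / tcount;                 % learning rate used for this update
    fprintf('update %d: lr = %g\n', tcount, lr);
  end
end
```

Under this assumption the step size shrinks quickly, so the initial rate in options(18) matters more than it does with a constant schedule.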
options(9) should be set to 1 to check the user defined gradient function.

options(10) returns the total number of function evaluations (including those in any line searches).

options(11) returns the total number of gradient evaluations.

options(14) is the maximum number of iterations (passes through the complete pattern set); default 100.

options(17) is the momentum; default 0.5.

options(18) is the learning rate; default 0.01.

  net = mlp(5, 3, 1, 'linear');
  options = foptions;
  options(18) = 0.01;
  options(5) = 1;
  net = olgd(net, options, x, t);

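The example assumes that x and t already exist. A slightly fuller sketch, with made-up data and the error log requested, could look like the following. It needs Netlab on the path; the data sizes, the noise level and the choice of 50 cycles are assumptions for illustration only.

```matlab
% Hypothetical data: 20 patterns, 5 inputs, 1 target per pattern.
x = randn(20, 5);
t = sum(x, 2) + 0.1 * randn(20, 1);

net = mlp(5, 3, 1, 'linear');      % 5 inputs, 3 hidden units, 1 linear output
options = foptions;
options(1)  = 1;                   % display and log error values
options(5)  = 1;                   % sample patterns randomly with replacement
options(14) = 50;                  % at most 50 passes through the pattern set
options(18) = 0.01;                % learning rate

[net, options, errlog] = olgd(net, options, x, t);

fprintf('Final error: %g\n', options(8));
plot(errlog);
xlabel('cycle');
ylabel('error');
```

Plotting errlog gives a quick check on whether the chosen learning rate is reducing the error from cycle to cycle.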
See also: graddesc

Copyright (c) Ian T Nabney (1996-9)