<html>
<head>
<title>
Netlab Reference Manual graddesc
</title>
</head>
<body>
<H1> graddesc
</H1>
<h2>
Purpose
</h2>
Gradient descent optimization.

<p><h2>
Description
</h2>
<CODE>[x, options, flog, pointlog] = graddesc(f, x, options, gradf)</CODE> uses
batch gradient descent to find a local minimum of the function
<CODE>f(x)</CODE> whose gradient is given by <CODE>gradf(x)</CODE>. A log of the function values
after each cycle is (optionally) returned in <CODE>flog</CODE>, and a log
of the points visited is (optionally) returned in <CODE>pointlog</CODE>.

<p>Note that <CODE>x</CODE> is a row vector
and <CODE>f</CODE> returns a scalar value.
The point at which <CODE>f</CODE> has a local minimum
is returned as <CODE>x</CODE>. The function value at that point is returned
in <CODE>options(8)</CODE>.

<p><CODE>graddesc(f, x, options, gradf, p1, p2, ...)</CODE> allows
additional arguments to be passed to <CODE>f()</CODE> and <CODE>gradf()</CODE>.
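<p>For instance, a direct call might look like the following. This is only a
minimal sketch, not taken from the toolbox: it assumes a user-supplied
objective <CODE>quadf.m</CODE> and gradient <CODE>quadgrad.m</CODE> (hypothetical names), each
taking a row vector and one extra argument <CODE>A</CODE> that is passed through by
<CODE>graddesc</CODE>.
<PRE>

% quadf.m:    function y = quadf(x, A)     y = x*A*x';
% quadgrad.m: function g = quadgrad(x, A)  g = 2*x*A;
A = [2 0; 0 1];          % extra argument forwarded to quadf and quadgrad
x = [1 1];               % starting point (row vector)
options = zeros(1, 18);
options(1) = 1;          % display error values
options(14) = 50;        % maximum number of iterations
[x, options] = graddesc('quadf', x, options, 'quadgrad', A);
fmin = options(8);       % function value at the minimum found
</PRE>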

<p>The optional parameters have the following interpretations.

<p><CODE>options(1)</CODE> is set to 1 to display error values; also logs error
values in the return argument <CODE>flog</CODE>, and the points visited
in the return argument <CODE>pointlog</CODE>. If <CODE>options(1)</CODE> is set to 0,
then only warning messages are displayed. If <CODE>options(1)</CODE> is -1,
then nothing is displayed.

<p><CODE>options(2)</CODE> is the absolute precision required for the value
of <CODE>x</CODE> at the solution. If the absolute difference between
the values of <CODE>x</CODE> on two successive steps is less than
<CODE>options(2)</CODE>, then this condition is satisfied.

<p><CODE>options(3)</CODE> is a measure of the precision required of the objective
function at the solution. If the absolute difference between the
objective function values on two successive steps is less than
<CODE>options(3)</CODE>, then this condition is satisfied.
Both this and the previous condition must be
satisfied for termination.
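<p>As an illustration (not taken from the toolbox), the two tolerances and the
iteration limit might be set as follows before calling <CODE>graddesc</CODE>:
<PRE>

options = zeros(1, 18);
options(2) = 1e-4;     % tolerance on the change in x between successive steps
options(3) = 1e-4;     % tolerance on the change in the objective value
options(14) = 200;     % maximum number of iterations
</PRE>
The run then terminates when both tolerances are satisfied, or when the
iteration limit is reached.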

<p><CODE>options(7)</CODE> determines the line minimisation method used. If it
is set to 1 then a line minimiser is used (in the direction of the negative
gradient). If it is 0 (the default), then each parameter update
is a fixed multiple (the learning rate)
of the negative gradient added to a fixed multiple (the momentum) of
the previous parameter update.
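<p>With <CODE>options(7)</CODE> set to 0, each cycle therefore applies (in outline) the
update below, where <CODE>gradf</CODE> holds the name of the gradient function. This is
a sketch of the rule just described, not the toolbox source:
<PRE>

lr = options(18);                % learning rate
mom = options(17);               % momentum
dxprev = zeros(size(x));         % previous parameter update
for n = 1:options(14)
  grad = feval(gradf, x);        % gradient at the current point
  dx = mom*dxprev - lr*grad;     % momentum term plus scaled negative gradient
  x = x + dx;                    % take the step
  dxprev = dx;
end
</PRE>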

<p><CODE>options(9)</CODE> should be set to 1 to check the user defined gradient
function <CODE>gradf</CODE> with <CODE>gradchek</CODE>. This is carried out at
the initial parameter vector <CODE>x</CODE>.

<p><CODE>options(10)</CODE> returns the total number of function evaluations (including
those in any line searches).

<p><CODE>options(11)</CODE> returns the total number of gradient evaluations.

<p><CODE>options(14)</CODE> is the maximum number of iterations; default 100.

<p><CODE>options(15)</CODE> is the precision in parameter space of the line search;
default <CODE>foptions(2)</CODE>.

<p><CODE>options(17)</CODE> is the momentum; default 0.5. It should be scaled by the
inverse of the number of data points.

<p><CODE>options(18)</CODE> is the learning rate; default 0.01. It should be
scaled by the inverse of the number of data points.

<p><h2>
Examples
</h2>
An example of how this function can be used to train a neural network is:
<PRE>

options = zeros(1, 18);
options(18) = 0.1/size(x, 1);
net = netopt(net, options, x, t, 'graddesc');
</PRE>

Note how the learning rate is scaled by the inverse of the number of data points.
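<p>If a non-default momentum is also wanted, it can be set and scaled in the same
way; the following is an illustrative extension of the example above, not taken
from the toolbox:
<PRE>

options = zeros(1, 18);
options(17) = 0.5/size(x, 1);    % momentum, scaled by the inverse of the number of data points
options(18) = 0.1/size(x, 1);    % learning rate, scaled in the same way
net = netopt(net, options, x, t, 'graddesc');
</PRE>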

<p><h2>
See Also
</h2>
<CODE><a href="conjgrad.htm">conjgrad</a></CODE>, <CODE><a href="linemin.htm">linemin</a></CODE>, <CODE><a href="olgd.htm">olgd</a></CODE>, <CODE><a href="minbrack.htm">minbrack</a></CODE>, <CODE><a href="quasinew.htm">quasinew</a></CODE>, <CODE><a href="scg.htm">scg</a></CODE><hr>
<b>Pages:</b>
<a href="index.htm">Index</a>
<hr>
<p>Copyright (c) Ian T Nabney (1996-9)


</body>
</html>