<html>
<head>
<title>
Netlab Reference Manual olgd
</title>
</head>
<body>
<H1> olgd
</H1>
<h2>
Purpose
</h2>
On-line gradient descent optimization.

<p><h2>
Description
</h2>
<CODE>[net, options, errlog, pointlog] = olgd(net, options, x, t)</CODE> uses
on-line gradient descent to find a local minimum of the error function for the
network <CODE>net</CODE>, computed on the input data <CODE>x</CODE> and target
values <CODE>t</CODE>. A log of the error values after each cycle is
(optionally) returned in <CODE>errlog</CODE>, and a log of the points visited
is (optionally) returned in <CODE>pointlog</CODE>. Because the gradient is
computed on-line (i.e. after each pattern), this can be quite inefficient
in Matlab.

<p>The error function value at the final weight vector is returned
in <CODE>options(8)</CODE>.

<p>The optional parameters have the following interpretations.

<p><CODE>options(1)</CODE> is set to 1 to display error values; this also logs
the error values in the return argument <CODE>errlog</CODE>, and the points
visited in the return argument <CODE>pointlog</CODE>. If <CODE>options(1)</CODE>
is set to 0, then only warning messages are displayed. If <CODE>options(1)</CODE>
is -1, then nothing is displayed.

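<p>For instance, a minimal sketch (assuming <CODE>net</CODE>, <CODE>x</CODE>
and <CODE>t</CODE> have already been created) that requests full logging and
then reads off the final error from <CODE>options(8)</CODE>:
<PRE>

options = foptions;       % default options vector
options(1) = 1;           % display errors; fill errlog and pointlog
[net, options, errlog, pointlog] = olgd(net, options, x, t);
fprintf('Final error: %g\n', options(8));
</PRE>
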
<p><CODE>options(2)</CODE> is the precision required for the value
of <CODE>x</CODE> at the solution. If the absolute difference between
the values of <CODE>x</CODE> at two successive steps is less than
<CODE>options(2)</CODE>, then this condition is satisfied.

<p><CODE>options(3)</CODE> is the precision required of the objective
function at the solution. If the absolute difference between the
error function values at two successive steps is less than
<CODE>options(3)</CODE>, then this condition is satisfied.
Both this and the previous condition must be
satisfied for termination. Note that testing the function value at each
iteration roughly halves the speed of the algorithm.

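<p>As an illustrative sketch (the tolerance and iteration values here are
arbitrary), both stopping criteria can be set together:
<PRE>

options = foptions;
options(2) = 1e-4;     % precision required of the weight values
options(3) = 1e-4;     % precision required of the error (halves the speed)
options(14) = 200;     % iteration cap in case the tolerances are never met
</PRE>
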
<p><CODE>options(5)</CODE> determines whether the patterns are sampled randomly
with replacement. If it is 0 (the default), then patterns are sampled
in order.

<p><CODE>options(6)</CODE> determines whether the learning rate decays. If it
is 1, then the learning rate decays at a rate of <CODE>1/t</CODE>. If it is 0
(the default), then the learning rate is constant.

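<p>For example, a sketch combining random sampling with a decaying learning
rate (both switches are described above):
<PRE>

options = foptions;
options(5) = 1;        % sample patterns randomly with replacement
options(6) = 1;        % decay the learning rate as 1/t
</PRE>
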
<p><CODE>options(9)</CODE> should be set to 1 to check the user-defined gradient
function.

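<p>Setting the flag before the call is enough; a sketch (assuming
<CODE>options</CODE>, <CODE>net</CODE>, <CODE>x</CODE> and <CODE>t</CODE>
already exist):
<PRE>

options(9) = 1;        % check the user-defined gradient function
[net, options] = olgd(net, options, x, t);
</PRE>
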
<p><CODE>options(10)</CODE> returns the total number of function evaluations
(including those in any line searches).

<p><CODE>options(11)</CODE> returns the total number of gradient evaluations.

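<p>Both counts can be read from the returned options vector after a run; a
minimal sketch:
<PRE>

[net, options] = olgd(net, options, x, t);
fprintf('%d function evaluations, %d gradient evaluations\n', ...
    options(10), options(11));
</PRE>
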
<p><CODE>options(14)</CODE> is the maximum number of iterations (passes through
the complete pattern set); default 100.

<p><CODE>options(17)</CODE> is the momentum; default 0.5.

<p><CODE>options(18)</CODE> is the learning rate; default 0.01.

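<p>These three values together define the update schedule; for instance
(the values below are illustrative only):
<PRE>

options = foptions;
options(14) = 50;      % at most 50 passes through the pattern set
options(17) = 0.9;     % momentum
options(18) = 0.001;   % learning rate
</PRE>
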
<p><h2>
Examples
</h2>
The following example performs on-line gradient descent on an MLP with
random sampling from the pattern set.
<PRE>

net = mlp(5, 3, 1, 'linear');   % 5 inputs, 3 hidden units, 1 linear output
options = foptions;             % default options vector
options(18) = 0.01;             % learning rate
options(5) = 1;                 % sample patterns randomly
net = olgd(net, options, x, t); % x and t hold the training data
</PRE>

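<p>A more complete sketch, end to end, using synthetic data (invented here
purely for illustration) and plotting the logged error to inspect convergence:
<PRE>

x = randn(100, 5);                       % 100 random input patterns
t = x * ones(5, 1) + 0.1*randn(100, 1);  % noisy linear targets
net = mlp(5, 3, 1, 'linear');
options = foptions;
options(1) = 1;                          % log the error after each cycle
options(14) = 20;                        % 20 passes through the data
[net, options, errlog] = olgd(net, options, x, t);
plot(errlog)                             % error after each cycle
</PRE>
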

<p><h2>
See Also
</h2>
<CODE><a href="graddesc.htm">graddesc</a></CODE><hr>
<b>Pages:</b>
<a href="index.htm">Index</a>
<hr>
<p>Copyright (c) Ian T Nabney (1996-9)

</body>
</html>