Daniel@0: function [x, options, flog, pointlog, scalelog] = scg(f, x, options, gradf, varargin)
Daniel@0: %SCG	Scaled conjugate gradient optimization.
Daniel@0: %
Daniel@0: %	Description
Daniel@0: %	[X, OPTIONS] = SCG(F, X, OPTIONS, GRADF) uses a scaled conjugate
Daniel@0: %	gradients algorithm to find a local minimum of the function F(X)
Daniel@0: %	whose gradient is given by GRADF(X).  Here X is a row vector and F
Daniel@0: %	returns a scalar value. The point at which F has a local minimum is
Daniel@0: %	returned as X.  The function value at that point is returned in
Daniel@0: %	OPTIONS(8).
Daniel@0: %
Daniel@0: %	[X, OPTIONS, FLOG, POINTLOG, SCALELOG] = SCG(F, X, OPTIONS, GRADF)
Daniel@0: %	also returns (optionally) a log of the function values after each
Daniel@0: %	cycle in FLOG, a log of the points visited in POINTLOG, and a log of
Daniel@0: %	the scale values in the algorithm in SCALELOG.
Daniel@0: %
Daniel@0: %	SCG(F, X, OPTIONS, GRADF, P1, P2, ...) allows additional arguments to
Daniel@0: %	be passed to F() and GRADF().
Daniel@0: %
Daniel@0: %	The elements of the OPTIONS vector have the following
Daniel@0: %	interpretations.
Daniel@0: %
Daniel@0: %	OPTIONS(1) is set to 1 to display the error value and scale at each
Daniel@0: %	cycle.  If OPTIONS(1) is set to 0, then only warning messages are
Daniel@0: %	displayed.  If OPTIONS(1) is -1, then nothing is displayed.  The logs
Daniel@0: %	FLOG, POINTLOG and SCALELOG are filled in whenever the corresponding
Daniel@0: %	output arguments are requested, whatever the value of OPTIONS(1).
Daniel@0: %
Daniel@0: %	OPTIONS(2) is a measure of the absolute precision required for the
Daniel@0: %	value of X at the solution.  If the largest absolute change in any
Daniel@0: %	component of X between two successive steps is less than OPTIONS(2),
Daniel@0: %	then this condition is satisfied.
Daniel@0: %
Daniel@0: %	OPTIONS(3) is a measure of the precision required of the objective
Daniel@0: %	function at the solution.  If the absolute difference between the
Daniel@0: %	objective function values between two successive steps is less than
Daniel@0: %	OPTIONS(3), then this condition is satisfied. Both this and the
Daniel@0: %	previous condition must be satisfied for termination.
Daniel@0: %
Daniel@0: %	OPTIONS(9) is set to 1 to check the user defined gradient function.
Daniel@0: %
Daniel@0: %	OPTIONS(10) returns the total number of function evaluations, added
Daniel@0: %	to any count already held in OPTIONS(10) on entry.
Daniel@0: %
Daniel@0: %	OPTIONS(11) returns the total number of gradient evaluations.
Daniel@0: %
Daniel@0: %	OPTIONS(14) is the maximum number of iterations; default 100.
Daniel@0: %
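Daniel@0: %	Example (an illustrative sketch, not part of the original Netlab
Daniel@0: %	help).  It assumes that F and GRADF may be given as function
Daniel@0: %	handles, which requires a MATLAB version whose FCNCHK accepts them;
Daniel@0: %	the quadratic objective and the tolerance values are arbitrary
Daniel@0: %	choices made for illustration.
Daniel@0: %
Daniel@0: %	  options = zeros(1, 18);            % SCG needs at least 18 entries
Daniel@0: %	  options(1) = -1;                   % display nothing
Daniel@0: %	  options(2) = 1e-6;                 % tolerance on the step in X
Daniel@0: %	  options(3) = 1e-6;                 % tolerance on the change in F
Daniel@0: %	  options(14) = 50;                  % maximum number of iterations
Daniel@0: %	  f = @(x) sum((x - 1).^2);          % hypothetical objective
Daniel@0: %	  gradf = @(x) 2*(x - 1);            % its gradient, a row vector
Daniel@0: %	  [x, options, flog] = scg(f, [0 0], options, gradf);
Daniel@0: %	  fmin = options(8);                 % function value at returned X
Daniel@0: %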
Daniel@0: %	See also
Daniel@0: %	CONJGRAD, QUASINEW
Daniel@0: %
Daniel@0: 
Daniel@0: %	Copyright (c) Ian T Nabney (1996-2001)
Daniel@0: 
Daniel@0: %  Set up the options.
Daniel@0: if length(options) < 18
Daniel@0:   error('Options vector too short')
Daniel@0: end
Daniel@0: 
Daniel@0: if(options(14))
Daniel@0:   niters = options(14);
Daniel@0: else
Daniel@0:   niters = 100;
Daniel@0: end
Daniel@0: 
Daniel@0: display = options(1);
Daniel@0: gradcheck = options(9);
Daniel@0: 
Daniel@0: % Set up strings for evaluating function and gradient
Daniel@0: f = fcnchk(f, length(varargin));
Daniel@0: gradf = fcnchk(gradf, length(varargin));
Daniel@0: 
Daniel@0: nparams = length(x);
Daniel@0: 
Daniel@0: %  Check gradients
Daniel@0: if (gradcheck)
Daniel@0:   feval('gradchek', x, f, gradf, varargin{:});
Daniel@0: end
Daniel@0: 
Daniel@0: sigma0 = 1.0e-4;
Daniel@0: fold = feval(f, x, varargin{:});	% Initial function value.
Daniel@0: fnow = fold;
Daniel@0: options(10) = options(10) + 1;		% Increment function evaluation counter.
Daniel@0: gradnew = feval(gradf, x, varargin{:});	% Initial gradient.
Daniel@0: gradold = gradnew;
Daniel@0: options(11) = options(11) + 1;		% Increment gradient evaluation counter.
Daniel@0: d = -gradnew;				% Initial search direction.
Daniel@0: success = 1;				% Force calculation of directional derivs.
Daniel@0: nsuccess = 0;				% nsuccess counts number of successes.
Daniel@0: beta = 1.0;				% Initial scale parameter.
Daniel@0: betamin = 1.0e-15; 			% Lower bound on scale.
Daniel@0: betamax = 1.0e100;			% Upper bound on scale.
Daniel@0: j = 1;					% j counts number of iterations.
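Daniel@0: if nargout >= 3
Daniel@0:   flog(j, :) = fold;
Daniel@0:   if nargout >= 4
Daniel@0:     pointlog(j, :) = x;
Daniel@0:     if nargout >= 5
Daniel@0:       scalelog(j) = beta;	% Defined even if we return before logging.
Daniel@0:     end
Daniel@0:   end
Daniel@0: end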
Daniel@0: 
Daniel@0: % Main optimization loop.
Daniel@0: while (j <= niters)
Daniel@0: 
Daniel@0:   % Calculate first and second directional derivatives.
Daniel@0:   if (success == 1)
Daniel@0:     mu = d*gradnew';
Daniel@0:     if (mu >= 0)
Daniel@0:       d = - gradnew;
Daniel@0:       mu = d*gradnew';
Daniel@0:     end
Daniel@0:     kappa = d*d';
Daniel@0:     if kappa < eps
Daniel@0:       options(8) = fnow;
Daniel@0:       return
Daniel@0:     end
Daniel@0:     sigma = sigma0/sqrt(kappa);
Daniel@0:     xplus = x + sigma*d;
Daniel@0:     gplus = feval(gradf, xplus, varargin{:});
Daniel@0:     options(11) = options(11) + 1; 
Daniel@0:     theta = (d*(gplus' - gradnew'))/sigma;
Daniel@0:   end
Daniel@0: 
Daniel@0:   % Increase effective curvature and evaluate step size alpha.
Daniel@0:   delta = theta + beta*kappa;
Daniel@0:   if (delta <= 0) 
Daniel@0:     delta = beta*kappa;
Daniel@0:     beta = beta - theta/kappa;
Daniel@0:   end
Daniel@0:   alpha = - mu/delta;
Daniel@0:   
Daniel@0:   % Calculate the comparison ratio.
Daniel@0:   xnew = x + alpha*d;
Daniel@0:   fnew = feval(f, xnew, varargin{:});
Daniel@0:   options(10) = options(10) + 1;
Daniel@0:   Delta = 2*(fnew - fold)/(alpha*mu);
Daniel@0:   if (Delta  >= 0)
Daniel@0:     success = 1;
Daniel@0:     nsuccess = nsuccess + 1;
Daniel@0:     x = xnew;
Daniel@0:     fnow = fnew;
Daniel@0:   else
Daniel@0:     success = 0;
Daniel@0:     fnow = fold;
Daniel@0:   end
Daniel@0: 
Daniel@0:   if nargout >= 3
Daniel@0:     % Store relevant variables
Daniel@0:     flog(j) = fnow;		% Current function value
Daniel@0:     if nargout >= 4
Daniel@0:       pointlog(j,:) = x;	% Current position
Daniel@0:       if nargout >= 5
Daniel@0:         scalelog(j) = beta;	% Current scale parameter
Daniel@0:       end
Daniel@0:     end
Daniel@0:   end    
Daniel@0:   if display > 0
Daniel@0:     fprintf(1, 'Cycle %4d  Error %11.6f  Scale %e\n', j, fnow, beta);
Daniel@0:   end
Daniel@0: 
Daniel@0:   if (success == 1)
Daniel@0:     % Test for termination
Daniel@0: 
Daniel@0:     if (max(abs(alpha*d)) < options(2) && abs(fnew-fold) < options(3))
Daniel@0:       options(8) = fnew;
Daniel@0:       return;
Daniel@0: 
Daniel@0:     else
Daniel@0:       % Update variables for new position
Daniel@0:       fold = fnew;
Daniel@0:       gradold = gradnew;
Daniel@0:       gradnew = feval(gradf, x, varargin{:});
Daniel@0:       options(11) = options(11) + 1;
Daniel@0:       % If the gradient is zero then we are done.
Daniel@0:       if (gradnew*gradnew' == 0)
Daniel@0:         options(8) = fnew;
Daniel@0:         return;
Daniel@0:       end
Daniel@0:     end
Daniel@0:   end
Daniel@0: 
Daniel@0:   % Adjust beta according to comparison ratio.
Daniel@0:   if (Delta < 0.25)
Daniel@0:     beta = min(4.0*beta, betamax);
Daniel@0:   end
Daniel@0:   if (Delta > 0.75)
Daniel@0:     beta = max(0.5*beta, betamin);
Daniel@0:   end
Daniel@0: 
Daniel@0:   % Update search direction using Polak-Ribiere formula, or re-start 
Daniel@0:   % in direction of negative gradient after nparams steps.
Daniel@0:   if (nsuccess == nparams)
Daniel@0:     d = -gradnew;
Daniel@0:     nsuccess = 0;
Daniel@0:   else
Daniel@0:     if (success == 1)
Daniel@0:       gamma = (gradold - gradnew)*gradnew'/(mu);
Daniel@0:       d = gamma*d - gradnew;
Daniel@0:     end
Daniel@0:   end
Daniel@0:   j = j + 1;
Daniel@0: end
Daniel@0: 
Daniel@0: % If we get here, then we haven't terminated in the given number of 
Daniel@0: % iterations.
Daniel@0: 
Daniel@0: options(8) = fold;
Daniel@0: if (options(1) >= 0)
Daniel@0:   disp(maxitmess);
Daniel@0: end
Daniel@0: