
Trust region

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Term in mathematical optimization

In mathematical optimization, a trust region is a subset of the objective function's domain over which the objective is approximated by a model function (often a quadratic). If an adequate model of the objective function is found within the trust region, then the region is expanded; conversely, if the approximation is poor, then the region is contracted.

The fit is evaluated by comparing the improvement predicted by the model approximation with the improvement actually observed in the objective function. Simple thresholding of this ratio is used as the criterion for expansion and contraction: a model function is "trusted" only in the region where it provides a reasonable approximation.
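As a concrete illustration of this expand/contract logic, here is a minimal sketch of one step of a radius-based trust-region method in Python. The helper solve_subproblem (an approximate minimizer of the model within the current radius, e.g. a dogleg or Cauchy-point step), the thresholds, and the scaling factors are illustrative assumptions, not from the article; this sketch also uses the common actual-over-predicted form of the ratio, the reciprocal of the one used in the example below.

```python
def trust_region_step(f, model, solve_subproblem, x, radius, max_radius=10.0):
    """One step of a generic radius-based trust-region method (illustrative sketch).

    f(x)                        : objective function
    model(x, p)                 : model's prediction of f at x + p (often quadratic)
    solve_subproblem(x, radius) : approximate model minimizer p with ||p|| <= radius
    """
    p = solve_subproblem(x, radius)
    predicted = f(x) - model(x, p)      # improvement the model promises
    actual = f(x) - f(x + p)            # improvement actually observed
    rho = actual / predicted if predicted > 0 else 0.0

    if rho < 0.25:                      # poor agreement: contract the trust region
        radius *= 0.25
    elif rho > 0.75:                    # good agreement: expand the trust region
        radius = min(2.0 * radius, max_radius)

    if rho > 0:                         # accept the step only if f actually decreased
        x = x + p
    return x, radius
```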

Trust-region methods are in some sense dual to line-search methods: trust-region methods first choose a step size (the size of the trust region) and then a step direction, while line-search methods first choose a step direction and then a step size.

The general idea behind trust region methods is known by many names; the earliest use of the term seems to be by Sorensen (1982). A popular textbook by Fletcher (1980) calls these algorithms restricted-step methods. Additionally, in an early foundational work on the method, Goldfeld, Quandt, and Trotter (1966) refer to it as quadratic hill-climbing.

Example

Conceptually, in the Levenberg–Marquardt algorithm, the objective function is iteratively approximated by a quadratic surface, and the estimate is then updated using a linear solver. This alone may not converge nicely if the initial guess is too far from the optimum. For this reason, the algorithm instead restricts each step, preventing it from stepping "too far". It operationalizes "too far" as follows. Rather than solving $A\,\Delta x = b$ for $\Delta x$, it solves $(A + \lambda \operatorname{diag}(A))\,\Delta x = b$, where $\operatorname{diag}(A)$ is the diagonal matrix with the same diagonal as $A$ and $\lambda$ is a parameter that controls the trust-region size. Geometrically, this adds a paraboloid centered at $\Delta x = 0$ to the quadratic form, resulting in a smaller step.
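A minimal sketch of this damped solve in Python/NumPy; the function name damped_step and its arguments are illustrative, and $A$ and $b$ are assumed to come from the quadratic model (for a least-squares problem, typically $A = J^\top J$ and $b = J^\top r$):

```python
import numpy as np

def damped_step(A, b, lam):
    """Solve (A + lam * diag(A)) dx = b for the damped update dx.

    A   : (n, n) matrix from the quadratic model (e.g. J^T J in least squares)
    b   : (n,) right-hand side (e.g. J^T r)
    lam : damping parameter; larger lam means a smaller, more conservative step
    """
    A_damped = A + lam * np.diag(np.diag(A))   # add lam * diag(A) to the model
    return np.linalg.solve(A_damped, b)
```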

The trick is to change the trust-region size ($\lambda$). At each iteration, the damped quadratic fit predicts a certain reduction in the cost function, $\Delta f_{\text{pred}}$, which we would expect to be a smaller reduction than the true reduction. Given $\Delta x$, we can evaluate

$$\Delta f_{\text{actual}} = f(x) - f(x + \Delta x).$$

By looking at the ratio $\Delta f_{\text{pred}} / \Delta f_{\text{actual}}$, we can adjust the trust-region size. In general, we expect $\Delta f_{\text{pred}}$ to be a bit smaller than $\Delta f_{\text{actual}}$, and so the ratio would be between, say, 0.25 and 0.5. If the ratio is more than 0.5, then we are damping the step too much, so expand the trust region (decrease $\lambda$) and iterate. If the ratio is smaller than 0.25, then the true function is diverging "too much" from the trust-region approximation, so shrink the trust region (increase $\lambda$) and try again.
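A sketch of this adjustment loop, reusing the hypothetical damped_step above. The predicted reduction is computed here from the undamped quadratic model (one common convention), and the factor of 2 used to grow or shrink $\lambda$ is an illustrative choice:

```python
import numpy as np

def lm_minimize(f, grad, hess_approx, x0, lam=1e-2, max_iters=100, tol=1e-10):
    """Illustrative Levenberg-Marquardt-style loop with ratio-controlled damping.

    f           : objective function
    grad        : gradient of f
    hess_approx : function returning the matrix A of the quadratic model at x
    lam         : damping parameter (larger lam means a smaller trust region)
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        A = hess_approx(x)
        dx = damped_step(A, -g, lam)             # step from the damped linear system
        pred = -(g @ dx + 0.5 * dx @ A @ dx)     # reduction predicted by the quadratic model
        actual = f(x) - f(x + dx)                # reduction actually observed
        ratio = pred / actual if actual > 0 else 0.0
        if ratio < 0.25:
            lam *= 2.0                           # poor agreement: shrink trust region, retry
            continue
        x = x + dx                               # acceptable agreement: accept the step
        if ratio > 0.5:
            lam /= 2.0                           # over-damped: expand trust region
    return x
```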

References

  1. Sorensen, D. C. (1982). "Newton's Method with a Model Trust Region Modification". SIAM J. Numer. Anal. 19 (2): 409–426. Bibcode:1982SJNA...19..409S. doi:10.1137/0719026.
  2. Fletcher, Roger (1987) [1980]. "Restricted Step Methods". Practical Methods of Optimization (Second ed.). Wiley. ISBN 0-471-91547-5.
  3. Goldfeld, Stephen M.; Quandt, Richard E.; Trotter, Hale F. (1966). "Maximization by Quadratic Hill-Climbing". Econometrica. 34 (3): 541–551. doi:10.2307/1909768. JSTOR 1909768.
