Misplaced Pages

Local unimodal sampling: Difference between revisions

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

Revision as of 13:56, 13 December 2010

Local unimodal sampling (LUS) is a method for numerical optimization which does not require the gradient of the problem being optimized, and it can hence be used on functions that are not continuous or differentiable. Such optimization methods are also known as direct-search, derivative-free, or black-box methods.

LUS is attributed to Pedersen and works by maintaining a single position in the search-space and moving to a new position whenever this improves the fitness or cost function. New positions are sampled from the neighbourhood of the current position using a uniform distribution. The sampling-range initially covers the full search-space and decreases exponentially during optimization.

LUS is similar to the Luus–Jaakola (LJ) optimizer, the main difference being an improved way of determining the exponential rate of sampling-range decrease; see below. These optimizers have been applied successfully in such diverse fields as optimal control, transformer design, metallurgical processes, chemical engineering, and machine translation, to name a few.


Motivation

Figure: when the current position x is far from the optimum, the probability of finding an improvement through uniform random sampling is 1/2.
Figure: as we approach the optimum, the probability of finding further improvements through uniform sampling decreases towards zero if the sampling-range d is kept fixed.

When a fixed sampling-range is used for randomly sampling the search-space of a unimodal function, the probability of finding improved positions will decrease as we approach the optimum, because a shrinking portion of the sampling-range yields improved fitness (see the figures above for the single-dimensional case). Hence, the sampling-range must be decreased somehow. Pattern search works well for optimizing unimodal functions, and its halving of the sampling-range was translated into a formula that has a similar effect when using a uniform distribution for the sampling.

Algorithm

Let f: ℝⁿ → ℝ be the fitness or cost function which must be minimized. Let x ∈ ℝⁿ designate a position or candidate solution in the search-space. The LUS algorithm can then be described as:

  • Initialize x ~ U(b_lo, b_up) with a random uniform position in the search-space, where b_lo and b_up are the lower and upper boundaries, respectively.
  • Set the initial sampling-range to cover the entire search-space: d = b_up − b_lo
  • Until a termination criterion is met (e.g. number of iterations performed, or adequate fitness reached), repeat the following (a runnable sketch is given below):
    • Pick a random vector a ~ U(−d, d)
    • Add this to the current position x to create the new potential position y = x + a
    • If f(y) < f(x) then move to the new position by setting x = y, otherwise decrease the sampling-range by multiplying with the factor q (see below): d = q·d
  • Now x holds the best-found position.
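
The following Python sketch illustrates the steps listed above. The function name lus, the fixed iteration budget used as the termination criterion, and the clipping of candidate positions to the search-space boundaries are illustrative assumptions not specified by the article; the decrease factor q = 2^(−α/n) is taken from the next section.

```python
import numpy as np

def lus(f, b_lo, b_up, max_iter=10000, alpha=1/3, rng=None):
    """Minimize f over the box [b_lo, b_up] by local unimodal sampling (sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(b_lo)                       # dimensionality of the search-space
    q = 2.0 ** (-alpha / n)             # sampling-range decrease factor (see below)
    x = rng.uniform(b_lo, b_up)         # random uniform initial position
    fx = f(x)
    d = b_up - b_lo                     # initial sampling-range: the full search-space
    for _ in range(max_iter):           # termination: fixed iteration budget (assumption)
        a = rng.uniform(-d, d)          # sample the neighbourhood uniformly
        y = np.clip(x + a, b_lo, b_up)  # candidate position, clipped to bounds (assumption)
        fy = f(y)
        if fy < fx:                     # greedy update: move only on improvement
            x, fx = y, fy
        else:
            d = q * d                   # on failure, shrink the sampling-range
    return x, fx

# Example usage: minimize the sphere function in 5 dimensions.
x_best, f_best = lus(lambda x: np.sum(x**2), b_lo=-5*np.ones(5), b_up=5*np.ones(5))
```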

Sampling-range decrease factor

The factor q for exponentially decreasing the sampling-range is defined by: q = 2^(−α/n), where n is the dimensionality of the search-space and α is a user-adjustable parameter. Setting 0 < α < 1 causes slower decrease of the sampling-range and setting α > 1 causes more rapid decrease. Typically it is set to α = 1/3.

Note that decreasing the sampling-range n times by the factor q results in an overall decrease of qⁿ = 2^(−α), and for α = 1 this would mean a halving of the sampling-range.
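
As a quick numerical check of this relationship, the following snippet (a hypothetical example, with n = 5 chosen only for illustration) confirms that n successive decreases halve the sampling-range when α = 1:

```python
n, alpha = 5, 1.0
q = 2.0 ** (-alpha / n)  # per-failure decrease factor q = 2^(-alpha/n)
print(q ** n)            # 0.5: n decreases by factor q halve the range when alpha = 1
```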

The main differences between LUS and the Luus–Jaakola (LJ) method are that the LUS decrease-factor q depends on the dimensionality n of the search-space, whereas LJ typically just uses a fixed decrease-factor of 0.95, and that LUS starts out by sampling the entire search-space, whereas LJ starts out by sampling only a fraction of it.

Criticism

LUS has been criticized for using a greedy update rule, and it has been suggested to use a stochastic update rule as in the Metropolis–Hastings algorithm or simulated annealing. However, it was noted in section 2.3.3 of Pedersen's thesis that such stochastic update rules are not useful when sampling real-valued search-spaces, as they merely cause the optimizer to move back and forth towards the optimum, with a still low probability of escaping local optima and a now lowered probability of converging to any optimum.


References

  1. Pedersen, M.E.H. (2010). Tuning & Simplifying Heuristical Optimization (PDF) (PhD thesis). University of Southampton, School of Engineering Sciences, Computational Engineering and Design Group.
  2. Luus, R.; Jaakola, T.H.I. (1973). "Optimization by direct search and systematic reduction of the size of search region". American Institute of Chemical Engineers Journal (AIChE). 19 (4): 760–766.
  3. Bojkov, R.; Hansel, B.; Luus, R. (1993). "Application of direct search optimization to optimal control problems". Hungarian Journal of Industrial Chemistry. 21: 177–185.
  4. Spaans, R.; Luus, R. (1992). "Importance of search-domain reduction in random optimization". Journal of Optimization Theory and Applications. 75: 635–638.
  5. Papangelakis, V.G.; Luus, R. (1993). "Reactor optimization in the pressure oxidization process". Proc. Int. Symp. on Modelling, Simulation and Control of Metallurgical Processes. pp. 159–171.
  6. Lee, Y.P.; Rangaiah, G.P.; Luus, R. (1999). "Phase and chemical equilibrium calculations by direct search optimization". Computers & Chemical Engineering. 23 (9): 1183–1191.
  7. Schuth, A.G. (2010). Tuning methods in statistical machine translation (PDF) (MSc thesis). University of Amsterdam.