
Scoring algorithm


Scoring algorithm, also known as Fisher's scoring, is a form of Newton's method used in statistics to solve maximum likelihood equations numerically, named after Ronald Fisher.

Sketch of derivation

Let $Y_1, \ldots, Y_n$ be random variables, independent and identically distributed with twice differentiable p.d.f. $f(y;\theta)$, and we wish to calculate the maximum likelihood estimator (M.L.E.) $\theta^*$ of $\theta$. First, suppose we have a starting point for our algorithm $\theta_0$, and consider a Taylor expansion of the score function, $V(\theta)$, about $\theta_0$:

$$V(\theta) \approx V(\theta_0) - \mathcal{J}(\theta_0)(\theta - \theta_0),$$

where

$$\mathcal{J}(\theta_0) = -\sum_{i=1}^{n} \left.\nabla\nabla^{\top}\right|_{\theta=\theta_0} \log f(Y_i;\theta)$$

is the observed information matrix at $\theta_0$. Now, setting $\theta = \theta^*$, using that $V(\theta^*) = 0$ and rearranging gives us:

$$\theta^* \approx \theta_0 + \mathcal{J}^{-1}(\theta_0)\,V(\theta_0).$$

We therefore use the algorithm

$$\theta_{m+1} = \theta_m + \mathcal{J}^{-1}(\theta_m)\,V(\theta_m),$$

and under certain regularity conditions, it can be shown that $\theta_m \rightarrow \theta^*$.
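
As an illustration of the Newton iteration above (not taken from the source), consider estimating the shape parameter $\theta$ of a $\mathrm{Gamma}(\theta, 1)$ sample with the scale fixed at 1, for which the score is $V(\theta) = \sum_i \log Y_i - n\,\psi(\theta)$ and the observed information is $\mathcal{J}(\theta) = n\,\psi'(\theta)$, where $\psi$ is the digamma function. The following minimal Python sketch applies the update $\theta_{m+1} = \theta_m + \mathcal{J}^{-1}(\theta_m)V(\theta_m)$; the function name newton_scoring and the simulated data are illustrative assumptions, not part of the source.

# Newton iteration for the M.L.E. of the Gamma shape parameter (scale fixed at 1)
import numpy as np
from scipy.special import digamma, polygamma

def newton_scoring(y, theta0, tol=1e-10, max_iter=100):
    n = len(y)
    s = np.sum(np.log(y))                    # sufficient statistic sum(log y_i)
    theta = theta0
    for _ in range(max_iter):
        score = s - n * digamma(theta)       # V(theta_m)
        obs_info = n * polygamma(1, theta)   # J(theta_m), the trigamma function
        step = score / obs_info              # J^{-1}(theta_m) V(theta_m)
        theta += step
        if abs(step) < tol:
            break
    return theta

rng = np.random.default_rng(0)
sample = rng.gamma(shape=3.0, scale=1.0, size=1000)
print(newton_scoring(sample, theta0=1.0))    # should be close to 3.0

Note that in this particular model the observed information does not depend on the data, so it coincides with the Fisher information and the iteration is already an instance of Fisher scoring as described below.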

Fisher scoring

In practice, $\mathcal{J}(\theta)$ is usually replaced by $\mathcal{I}(\theta) = \mathrm{E}[\mathcal{J}(\theta)]$, the Fisher information, thus giving us the Fisher scoring algorithm:

$$\theta_{m+1} = \theta_m + \mathcal{I}^{-1}(\theta_m)\,V(\theta_m).$$

Under some regularity conditions, if $\theta_m$ is a consistent estimator, then $\theta_{m+1}$ (the correction after a single step) is 'optimal' in the sense that its error distribution is asymptotically identical to that of the true maximum likelihood estimate.
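
As a multiparameter illustration (not from the source), Fisher scoring applied to logistic regression yields the familiar iteratively reweighted least-squares update $\beta \leftarrow \beta + (X^{\top}WX)^{-1}X^{\top}(y - p)$ with $W = \mathrm{diag}(p_i(1-p_i))$, where $X^{\top}(y-p)$ is the score and $X^{\top}WX$ the Fisher information. The Python sketch below is a hedged example; the function fisher_scoring_logistic and the simulated data are assumptions introduced only for illustration.

# Fisher scoring for logistic-regression coefficients
import numpy as np

def fisher_scoring_logistic(X, y, n_iter=25, tol=1e-10):
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # fitted probabilities
        score = X.T @ (y - p)                 # V(beta)
        W = p * (1.0 - p)                     # variance weights
        info = X.T @ (X * W[:, None])         # I(beta) = X^T W X
        step = np.linalg.solve(info, score)   # I^{-1}(beta) V(beta)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Toy usage on simulated data
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
true_beta = np.array([-0.5, 1.2])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))
print(fisher_scoring_logistic(X, y))          # roughly recovers true_beta

Because the logistic model uses the canonical link, the expected and observed information coincide here, so this update is simultaneously Newton's method and Fisher scoring; for non-canonical links the two generally differ.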


References

  1. Longford, Nicholas T. (1987). "A fast scoring algorithm for maximum likelihood estimation in unbalanced mixed models with nested random effects". Biometrika. 74 (4): 817–827. doi:10.1093/biomet/74.4.817.
  2. Li, Bing; Babu, G. Jogesh (2019), "Bayesian Inference", Springer Texts in Statistics, New York, NY: Springer New York, Theorem 9.4, doi:10.1007/978-1-4939-9761-9_6, ISBN 978-1-4939-9759-6, S2CID 239322258, retrieved 2023-01-03
