
Minimum-variance unbiased estimator


In statistics, a minimum-variance unbiased estimator (MVUE) or uniformly minimum-variance unbiased estimator (UMVUE) is an unbiased estimator that has lower variance than any other unbiased estimator for all possible values of the parameter.

For practical statistics problems, it is important to determine the MVUE if one exists, since less-than-optimal procedures would naturally be avoided, other things being equal. This has led to substantial development of statistical theory related to the problem of optimal estimation.

Combining the constraint of unbiasedness with the criterion of minimum variance works well in most practical settings, which makes the MVUE a natural starting point for a broad range of analyses. Nevertheless, an estimator tailored to the loss structure of a specific problem may perform better, so the MVUE is not always the best stopping point.

Definition

Consider estimation of $g(\theta)$ based on data $X_1, X_2, \ldots, X_n$ i.i.d. from some member of a family of densities $p_\theta$, $\theta \in \Omega$, where $\Omega$ is the parameter space. An unbiased estimator $\delta(X_1, X_2, \ldots, X_n)$ of $g(\theta)$ is UMVUE if, for all $\theta \in \Omega$,

$$\operatorname{var}\bigl(\delta(X_1, X_2, \ldots, X_n)\bigr) \leq \operatorname{var}\bigl(\tilde{\delta}(X_1, X_2, \ldots, X_n)\bigr)$$

for any other unbiased estimator $\tilde{\delta}$.
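As a quick illustration (a minimal simulation sketch, not part of the article), the following snippet compares two unbiased estimators of a normal mean: the sample mean, which is the UMVUE for i.i.d. $N(\theta, 1)$ data, and the sample median, which is also unbiased by symmetry but has higher variance.

```python
import numpy as np

# Minimal sketch: compare two unbiased estimators of a normal mean theta.
# For i.i.d. N(theta, 1) data the sample mean is the UMVUE; the sample
# median is also unbiased (by symmetry) but has larger variance.
rng = np.random.default_rng(0)
theta, n, reps = 2.0, 25, 100_000
samples = rng.normal(theta, 1.0, size=(reps, n))

means = samples.mean(axis=1)
medians = np.median(samples, axis=1)

print("mean estimator:   bias=%+.4f  var=%.4f" % (means.mean() - theta, means.var()))
print("median estimator: bias=%+.4f  var=%.4f" % (medians.mean() - theta, medians.var()))
# Expect both biases near 0, var(mean) ~ 1/n = 0.04, and
# var(median) ~ pi/(2n) ~ 0.063, so the mean has uniformly smaller variance.
```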

If an unbiased estimator of $g(\theta)$ exists, then one can prove there is an essentially unique MVUE. Using the Rao–Blackwell theorem, one can also prove that determining the MVUE is simply a matter of finding a complete sufficient statistic for the family $p_\theta$, $\theta \in \Omega$, and conditioning any unbiased estimator on it.

Further, by the Lehmann–Scheffé theorem, an unbiased estimator that is a function of a complete sufficient statistic is the UMVUE.

Put formally, suppose $\delta(X_1, X_2, \ldots, X_n)$ is unbiased for $g(\theta)$, and that $T$ is a complete sufficient statistic for the family of densities. Then

$$\eta(X_1, X_2, \ldots, X_n) = \operatorname{E}\bigl(\delta(X_1, X_2, \ldots, X_n) \mid T\bigr)$$

is the MVUE for $g(\theta)$.
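To make the conditioning step concrete, here is a hedged sketch (the Poisson example is chosen here for illustration, not taken from the article) of Rao–Blackwellization: the target is $g(\lambda) = P(X = 0) = e^{-\lambda}$, the crude unbiased estimator is the indicator $\mathbf{1}\{X_1 = 0\}$, and conditioning on the complete sufficient statistic $T = \sum_i X_i$ yields the UMVUE $((n-1)/n)^T$.

```python
import numpy as np

# Sketch of Rao-Blackwellization for Poisson(lam) data: estimate
# g(lam) = P(X = 0) = exp(-lam). The indicator 1{X_1 == 0} is unbiased;
# conditioning on the complete sufficient statistic T = sum(X_i) gives
# E(1{X_1 == 0} | T) = ((n-1)/n)**T, which is the UMVUE.
rng = np.random.default_rng(1)
lam, n, reps = 1.5, 10, 200_000
x = rng.poisson(lam, size=(reps, n))

naive = (x[:, 0] == 0).astype(float)   # unbiased but noisy
T = x.sum(axis=1)
umvue = ((n - 1) / n) ** T             # Rao-Blackwellized estimator

print("target exp(-lam):", np.exp(-lam))
print("naive: mean=%.4f  var=%.5f" % (naive.mean(), naive.var()))
print("umvue: mean=%.4f  var=%.5f" % (umvue.mean(), umvue.var()))
# Both estimators are unbiased; the conditioned one has far smaller variance.
```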

A Bayesian analog is a Bayes estimator, particularly the minimum mean square error (MMSE) estimator.

Estimator selection

An efficient estimator need not exist, but if it does and if it is unbiased, it is the MVUE. Since the mean squared error (MSE) of an estimator $\delta$ is

$$\operatorname{MSE}(\delta) = \operatorname{var}(\delta) + [\operatorname{bias}(\delta)]^2,$$

the MVUE minimizes MSE among unbiased estimators. In some cases biased estimators have lower MSE because they have a smaller variance than does any unbiased estimator; see estimator bias.
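As a numerical illustration of this decomposition (a sketch; the divisor family below is chosen for illustration), the snippet estimates $\sigma^2$ of normal data with divisors $n-1$, $n$, and $n+1$: the $n-1$ version is unbiased, but a biased divisor can achieve lower MSE.

```python
import numpy as np

# Sketch: check MSE = var + bias^2 empirically for the sample variance of
# N(0, sigma^2) data. Divisor n-1 is unbiased; divisors n and n+1 are
# biased but trade bias for variance (n+1 minimizes MSE for normal data).
rng = np.random.default_rng(2)
sigma2, n, reps = 4.0, 10, 200_000
x = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
ss = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

for divisor in (n - 1, n, n + 1):
    est = ss / divisor
    bias = est.mean() - sigma2
    mse = ((est - sigma2) ** 2).mean()
    print("divisor %2d: bias=%+.4f  var=%.4f  var+bias^2=%.4f  MSE=%.4f"
          % (divisor, bias, est.var(), est.var() + bias ** 2, mse))
```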

Example

Consider the data to be a single observation from an absolutely continuous distribution on $\mathbb{R}$ with density

$$p_\theta(x) = \frac{\theta e^{-x}}{(1 + e^{-x})^{\theta + 1}}$$

and we wish to find the UMVU estimator of

$$g(\theta) = \frac{1}{\theta^2}.$$

First we recognize that the density can be written as

$$\frac{e^{-x}}{1 + e^{-x}} \exp\bigl(-\theta \log(1 + e^{-x}) + \log(\theta)\bigr),$$

which is an exponential family with sufficient statistic $T = \log(1 + e^{-x})$. In fact, this is a full-rank exponential family, and therefore $T$ is complete sufficient. See exponential family for a derivation which shows

$$\operatorname{E}(T) = \frac{1}{\theta}, \qquad \operatorname{var}(T) = \frac{1}{\theta^2}.$$
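As a sketch of where these moments come from (a derivation added here, not part of the original text): the distribution function of $X$ is available in closed form, $F_\theta(x) = (1 + e^{-x})^{-\theta}$, so for $y > 0$,

$$P(T > y) = P\bigl(X < -\log(e^y - 1)\bigr) = \bigl(1 + (e^y - 1)\bigr)^{-\theta} = e^{-\theta y},$$

that is, $T$ is exponentially distributed with rate $\theta$, which yields the stated mean and variance.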

Therefore,

$$\operatorname{E}(T^2) = \operatorname{var}(T) + \bigl(\operatorname{E}(T)\bigr)^2 = \frac{2}{\theta^2}.$$

Here we use the Lehmann–Scheffé theorem to get the MVUE.

Clearly $\delta(X) = \frac{T^2}{2}$ is unbiased and $T = \log(1 + e^{-x})$ is complete sufficient, thus the UMVU estimator is

$$\eta(X) = \operatorname{E}\bigl(\delta(X) \mid T\bigr) = \operatorname{E}\!\left(\left.\frac{T^2}{2}\,\right|\,T\right) = \frac{T^2}{2} = \frac{\bigl(\log(1 + e^{-X})\bigr)^2}{2}.$$

This example illustrates that an unbiased function of the complete sufficient statistic is UMVU, as the Lehmann–Scheffé theorem states.
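A Monte Carlo check of this example (a sketch added here; it uses the fact derived above that $T$ is exponential with rate $\theta$, so $T$ can be sampled directly):

```python
import numpy as np

# Sketch: verify by simulation that eta = T^2 / 2 is unbiased for
# g(theta) = 1/theta^2, using T = log(1 + exp(-X)) ~ Exponential(theta).
rng = np.random.default_rng(3)
theta, reps = 2.0, 1_000_000
T = rng.exponential(scale=1.0 / theta, size=reps)  # numpy uses scale = 1/rate

print("E(T)     ~ %.4f  (theory 1/theta   = %.4f)" % (T.mean(), 1 / theta))
print("E(T^2)   ~ %.4f  (theory 2/theta^2 = %.4f)" % ((T ** 2).mean(), 2 / theta ** 2))
print("E(T^2/2) ~ %.4f  (target 1/theta^2 = %.4f)" % ((T ** 2 / 2).mean(), 1 / theta ** 2))
```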

Other examples

For a discrete uniform distribution on $\{1, 2, \ldots, N\}$ with unknown maximum $N$, sampled $k$ times without replacement, the UMVU estimator of $N$ is

$$\frac{k+1}{k} m - 1,$$

where $m$ is the sample maximum. This is a scaled and shifted (so unbiased) transform of the sample maximum, which is a sufficient and complete statistic. See the German tank problem for details; a simulation sketch follows.
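Here is that simulation sketch (added for illustration; the parameter values are arbitrary):

```python
import numpy as np

# Sketch: German tank problem. Draw k serial numbers without replacement
# from {1, ..., N}; the UMVU estimator of N is ((k+1)/k) * m - 1, where
# m is the sample maximum.
rng = np.random.default_rng(4)
N, k, reps = 1000, 15, 20_000

maxima = np.array([rng.choice(N, size=k, replace=False).max() + 1
                   for _ in range(reps)])      # +1 shifts {0..N-1} to {1..N}
est = (k + 1) / k * maxima - 1

print("mean of UMVU estimates: %.2f  (true N = %d)" % (est.mean(), N))
print("mean of raw maxima    : %.2f  (biased low)" % maxima.mean())
```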

See also

Bayesian analogs
