Chapman–Robbins bound

Article snapshot taken from Wikipedia, under the Creative Commons Attribution-ShareAlike license.

In statistics, the Chapman–Robbins bound or Hammersley–Chapman–Robbins bound is a lower bound on the variance of estimators of a deterministic parameter. It is a generalization of the Cramér–Rao bound; compared to the Cramér–Rao bound, it is both tighter and applicable to a wider range of problems. However, it is usually more difficult to compute.

The bound was independently discovered by John Hammersley in 1950, and by Douglas Chapman and Herbert Robbins in 1951.

Statement

Let $\Theta$ be the set of parameters for a family of probability distributions $\{\mu_\theta : \theta \in \Theta\}$ on $\Omega$.

For any two $\theta, \theta' \in \Theta$, let $\chi^2(\mu_{\theta'}; \mu_\theta)$ be the $\chi^2$-divergence from $\mu_\theta$ to $\mu_{\theta'}$. Then:

Theorem — Given any scalar random variable $\hat{g} : \Omega \to \mathbb{R}$, and any two $\theta, \theta' \in \Theta$, we have
$$\operatorname{Var}_\theta[\hat{g}] \;\geq\; \sup_{\theta' \neq \theta \in \Theta} \frac{\bigl(E_{\theta'}[\hat{g}] - E_\theta[\hat{g}]\bigr)^2}{\chi^2(\mu_{\theta'}; \mu_\theta)}.$$
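
The following is a minimal numeric sketch of the scalar bound, assuming a Bernoulli($\theta$) family with a single observation and the estimator $\hat{g}(x) = x$ (the family, grid, and function names are illustrative choices, not from the article). For this family the expression inside the supremum equals $\theta(1-\theta)$ for every $\theta'$, so the bound is attained by $\hat{g}$.

```python
# Chapman-Robbins bound for Bernoulli(theta), n = 1, estimator g_hat(x) = x.
# Illustrative sketch; grid resolution and names are arbitrary choices.
import numpy as np

def chi2_bernoulli(tp, t):
    """chi^2-divergence from Bernoulli(t) to Bernoulli(tp)."""
    return (tp - t) ** 2 / t + (tp - t) ** 2 / (1 - t)   # = (tp - t)^2 / (t * (1 - t))

def chapman_robbins_bound(t, grid):
    """sup over theta' != theta of (E_theta'[g_hat] - E_theta[g_hat])^2 / chi^2."""
    vals = [(tp - t) ** 2 / chi2_bernoulli(tp, t) for tp in grid if abs(tp - t) > 1e-9]
    return max(vals)

theta = 0.3
grid = np.linspace(0.01, 0.99, 99)
print("Chapman-Robbins bound:", chapman_robbins_bound(theta, grid))   # ~ 0.21
print("Var_theta[g_hat] = theta*(1-theta):", theta * (1 - theta))     # 0.21
```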

A generalization to the multivariable case is:

Theorem — Given any multivariate random variable $\hat{g} : \Omega \to \mathbb{R}^m$, and any $\theta, \theta' \in \Theta$,
$$\chi^2(\mu_{\theta'}; \mu_\theta) \;\geq\; \bigl(E_{\theta'}[\hat{g}] - E_\theta[\hat{g}]\bigr)^T \operatorname{Cov}_\theta[\hat{g}]^{-1} \bigl(E_{\theta'}[\hat{g}] - E_\theta[\hat{g}]\bigr).$$

Proof

By the variational representation of the chi-squared divergence,
$$\chi^2(P; Q) = \sup_g \frac{\bigl(E_P[g] - E_Q[g]\bigr)^2}{\operatorname{Var}_Q[g]}.$$
Plug in $g = \hat{g}$, $P = \mu_{\theta'}$, $Q = \mu_\theta$, to obtain
$$\chi^2(\mu_{\theta'}; \mu_\theta) \;\geq\; \frac{\bigl(E_{\theta'}[\hat{g}] - E_\theta[\hat{g}]\bigr)^2}{\operatorname{Var}_\theta[\hat{g}]}.$$
Switching $\operatorname{Var}_\theta[\hat{g}]$ with the left-hand side and taking the supremum over $\theta'$ gives the single-variate case.

For the multivariate case, define $h = \sum_{i=1}^m v_i \hat{g}_i$ for any $v \neq 0 \in \mathbb{R}^m$. Plugging $g = h$ into the variational representation gives
$$\chi^2(\mu_{\theta'}; \mu_\theta) \;\geq\; \frac{\bigl(E_{\theta'}[h] - E_\theta[h]\bigr)^2}{\operatorname{Var}_\theta[h]} = \frac{\langle v, E_{\theta'}[\hat{g}] - E_\theta[\hat{g}]\rangle^2}{v^T \operatorname{Cov}_\theta[\hat{g}]\, v}.$$
Taking the supremum over $v \neq 0 \in \mathbb{R}^m$, and using the linear-algebra fact
$$\sup_{v \neq 0} \frac{v^T w w^T v}{v^T M v} = w^T M^{-1} w,$$
we obtain the multivariate case.
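
Below is a quick numerical check, not part of the article's proof, of the linear-algebra fact used in the last step: for a symmetric positive-definite $M$, the ratio $(v^T w)^2 / (v^T M v)$ never exceeds $w^T M^{-1} w$ and attains it at $v = M^{-1} w$ (the matrix, vector, and sample counts below are arbitrary choices).

```python
# Numerical spot-check of: sup_{v != 0} (v^T w)^2 / (v^T M v) = w^T M^{-1} w.
import numpy as np

rng = np.random.default_rng(0)
m = 4
A = rng.standard_normal((m, m))
M = A @ A.T + m * np.eye(m)          # a random symmetric positive-definite matrix
w = rng.standard_normal(m)

rhs = w @ np.linalg.solve(M, w)      # w^T M^{-1} w

ratios = [(v @ w) ** 2 / (v @ M @ v)
          for v in rng.standard_normal((10000, m))]
v_star = np.linalg.solve(M, w)       # the maximizing direction v = M^{-1} w
ratios.append((v_star @ w) ** 2 / (v_star @ M @ v_star))

print("max ratio found:", max(ratios))
print("w^T M^-1 w:     ", rhs)       # equal up to floating-point error
```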

Relation to Cramér–Rao bound

Usually, $\Omega = \mathcal{X}^n$ is the sample space of $n$ independent draws of an $\mathcal{X}$-valued random variable $X$ with distribution $\lambda_\theta$ from a family of probability distributions parameterized by $\theta \in \Theta \subseteq \mathbb{R}^m$, $\mu_\theta = \lambda_\theta^{\otimes n}$ is its $n$-fold product measure, and $\hat{g} : \mathcal{X}^n \to \Theta$ is an estimator of $\theta$. Then, for $m = 1$, the expression inside the supremum in the Chapman–Robbins bound converges to the Cramér–Rao bound of $\hat{g}$ as $\theta' \to \theta$, assuming the regularity conditions of the Cramér–Rao bound hold. This implies that, when both bounds exist, the Chapman–Robbins bound is always at least as tight as the Cramér–Rao bound; in many cases it is substantially tighter.
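
This convergence can be seen numerically in the sketch below, assuming $n$ i.i.d. draws from $N(\theta, \sigma^2)$ with the sample mean as estimator (these choices, and the closed form used for the $\chi^2$-divergence of Gaussian product measures, are illustrative additions, not taken from the article). The expression inside the supremum rises toward the Cramér–Rao bound $\sigma^2/n$ as $\theta' \to \theta$.

```python
# Expression inside the Chapman-Robbins supremum for X_i ~ N(theta, sigma^2) i.i.d.,
# estimator = sample mean. Illustrative sketch.
import numpy as np

sigma, n, theta = 2.0, 10, 1.0
cr_bound = sigma ** 2 / n            # Cramer-Rao bound for the mean of n Gaussian draws

def supremand(theta_prime):
    delta = theta_prime - theta
    # chi^2-divergence between the n-fold products of N(theta', sigma^2) and N(theta, sigma^2)
    chi2 = np.expm1(n * delta ** 2 / sigma ** 2)    # exp(n*delta^2/sigma^2) - 1
    return delta ** 2 / chi2

for delta in [1.0, 0.5, 0.1, 0.01, 0.001]:
    print(f"theta' - theta = {delta:>6}: {supremand(theta + delta):.6f}")
print(f"Cramer-Rao bound sigma^2/n = {cr_bound:.6f}")
# The values increase toward sigma^2/n; for this family and estimator the two bounds coincide.
```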

The Chapman–Robbins bound also holds under much weaker regularity conditions. For example, no assumption is made regarding differentiability of the probability density function $p(x; \theta)$ of $\lambda_\theta$. When $p(x; \theta)$ is non-differentiable, the Fisher information is not defined, and hence the Cramér–Rao bound does not exist.
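
As a concrete illustration (a minimal sketch; the $\mathrm{Uniform}(0, \theta)$ family, the unbiasedness assumption, and the grid below are choices made here, not taken from the article), consider estimating $\theta$ from $n$ i.i.d. draws of $\mathrm{Uniform}(0, \theta)$, whose density is not differentiable in $\theta$. The Chapman–Robbins bound can still be evaluated, using $\chi^2(\mu_{\theta'}; \mu_\theta) = (\theta/\theta')^n - 1$ for $\theta' < \theta$; for $\theta' > \theta$ the divergence is infinite, so those $\theta'$ contribute nothing to the supremum. For an unbiased estimator the numerator is simply $(\theta' - \theta)^2$.

```python
# Chapman-Robbins bound for an unbiased estimator of theta from n i.i.d. draws of
# Uniform(0, theta). Illustrative sketch; the Cramer-Rao bound is unavailable here.
import numpy as np

def chi2_uniform_product(theta_prime, theta, n):
    """chi^2-divergence between n-fold products of U(0, theta') and U(0, theta), theta' < theta."""
    return (theta / theta_prime) ** n - 1.0

def chapman_robbins_bound(theta, n, num_grid=10000):
    """Lower bound on Var_theta[g_hat] for any unbiased g_hat, via a grid over theta' < theta."""
    grid = np.linspace(1e-3 * theta, theta * (1 - 1e-6), num_grid)
    vals = (grid - theta) ** 2 / chi2_uniform_product(grid, theta, n)
    return vals.max()

theta, n = 1.0, 5
print(f"Chapman-Robbins bound for theta={theta}, n={n}: {chapman_robbins_bound(theta, n):.5f}")
```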

References

  1. Hammersley, J. M. (1950), "On estimating restricted parameters", Journal of the Royal Statistical Society, Series B, 12 (2): 192–240, doi:10.1111/j.2517-6161.1950.tb00056.x, JSTOR 2983981, MR 0040631
  2. Chapman, D. G.; Robbins, H. (1951), "Minimum variance estimation without regularity assumptions", Annals of Mathematical Statistics, 22 (4): 581–586, doi:10.1214/aoms/1177729548, JSTOR 2236927, MR 0044084
  3. Polyanskiy, Yury (2017), "Lecture notes on information theory, chapter 29, ECE563 (UIUC)" (PDF), Lecture notes on information theory, archived from the original on 2022-05-24, retrieved 2022-05-24

Further reading

  • Lehmann, E. L.; Casella, G. (1998), Theory of Point Estimation (2nd ed.), Springer, pp. 113–114, ISBN 0-387-98502-6