g-prior

Type of probability distribution used in statistics

In statistics, the g-prior is an objective prior for the regression coefficients of a multiple regression. It was introduced by Arnold Zellner. It is a key tool in Bayes and empirical Bayes variable selection.

Definition

Consider a data set $(x_1, y_1), \ldots, (x_n, y_n)$, where the $x_i$ are Euclidean vectors and the $y_i$ are scalars. The multiple regression model is formulated as

$$y_i = x_i^{\top}\beta + \varepsilon_i,$$

where the $\varepsilon_i$ are random errors. Zellner's g-prior for $\beta$ is a multivariate normal distribution with covariance matrix proportional to the inverse Fisher information matrix for $\beta$, similar to a Jeffreys prior.

Assume the $\varepsilon_i$ are i.i.d. normal with zero mean and variance $\psi^{-1}$. Let $X$ be the matrix whose $i$th row is $x_i^{\top}$. Then the g-prior for $\beta$ is the multivariate normal distribution with prior mean $\beta_0$, a hyperparameter, and covariance matrix proportional to $\psi^{-1}(X^{\top}X)^{-1}$, i.e.,

$$\beta \mid \psi \sim \text{N}\left[\beta_0,\; g\,\psi^{-1}(X^{\top}X)^{-1}\right],$$

where g is a positive scalar parameter.
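The definition can be made concrete with a small numerical sketch. The following is only an illustration under stated assumptions: the function name `g_prior_covariance`, the simulated design matrix, and the values chosen for `g`, `psi`, and `beta0` are all hypothetical and do not come from the article.

```python
import numpy as np

def g_prior_covariance(X, g, psi):
    """Return g * psi^{-1} * (X^T X)^{-1}, the g-prior covariance of beta.

    Assumes X has full column rank so that X^T X is invertible.
    """
    XtX = X.T @ X
    cov = (g / psi) * np.linalg.inv(XtX)
    # Symmetrize to remove floating-point asymmetry from the inversion.
    return (cov + cov.T) / 2.0

# Hypothetical example data: 50 observations, 3 predictors.
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))

beta0 = np.zeros(p)   # prior mean (a hyperparameter)
g, psi = 10.0, 2.0    # prior scale g and error precision psi

cov = g_prior_covariance(X, g, psi)

# Draw coefficient vectors from the prior beta | psi ~ N[beta0, cov].
draws = rng.multivariate_normal(beta0, cov, size=1000)
print(cov)
print(draws.mean(axis=0))
```

Because the covariance is built from $(X^{\top}X)^{-1}$, the prior automatically inherits the correlation structure of the predictors, and $g$ only controls its overall scale.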

Posterior distribution of beta

The posterior distribution of $\beta$ is given as

$$\beta \mid \psi, x, y \sim \text{N}\left[q\hat{\beta} + (1-q)\beta_0,\; \frac{q}{\psi}(X^{\top}X)^{-1}\right],$$

where $q = g/(1+g)$ and

$$\hat{\beta} = (X^{\top}X)^{-1}X^{\top}y$$

is the maximum likelihood (least squares) estimator of $\beta$. The vector of regression coefficients $\beta$ can be estimated by its posterior mean under the g-prior, i.e., as the weighted average of the maximum likelihood estimator and $\beta_0$,

$$\tilde{\beta} = q\hat{\beta} + (1-q)\beta_0.$$
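The weights in this average can be checked with the standard conjugate-normal argument, sketched here with $\psi$ treated as known. The likelihood contributes precision $\psi X^{\top}X$ centred at $\hat{\beta}$, the prior contributes precision $(\psi/g)X^{\top}X$ centred at $\beta_0$, and the two precisions add:

$$\begin{aligned}
\operatorname{Cov}(\beta \mid \psi, x, y)^{-1} &= \psi X^{\top}X + \frac{\psi}{g}X^{\top}X = \frac{\psi}{q}X^{\top}X, \\
\operatorname{E}(\beta \mid \psi, x, y) &= \frac{q}{\psi}(X^{\top}X)^{-1}\left[\psi X^{\top}X\hat{\beta} + \frac{\psi}{g}X^{\top}X\beta_0\right] = q\hat{\beta} + \frac{q}{g}\beta_0 = q\hat{\beta} + (1-q)\beta_0,
\end{aligned}$$

where the last step uses $q/g = 1/(1+g) = 1 - q$.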

As $g \to \infty$, the weight $q = g/(1+g)$ tends to 1, so the posterior mean converges to the maximum likelihood estimator.
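The shrinkage behaviour can be illustrated numerically. The sketch below assumes a known error precision `psi` and uses simulated data; the function name `g_prior_posterior`, the coefficient values, and the grid of `g` values are hypothetical choices made only for this illustration.

```python
import numpy as np

def g_prior_posterior(X, y, beta0, g, psi):
    """Posterior mean and covariance of beta under the g-prior, psi known.

    Returns (mean, cov, beta_hat), where beta_hat is the least squares estimate.
    """
    XtX = X.T @ X
    beta_hat = np.linalg.solve(XtX, X.T @ y)   # (X^T X)^{-1} X^T y
    q = g / (1.0 + g)                          # shrinkage weight
    mean = q * beta_hat + (1.0 - q) * beta0    # weighted average
    cov = (q / psi) * np.linalg.inv(XtX)       # (q / psi) (X^T X)^{-1}
    return mean, cov, beta_hat

# Simulated data (all values are illustrative assumptions).
rng = np.random.default_rng(1)
n, p = 200, 3
X = rng.normal(size=(n, p))
true_beta = np.array([1.0, -2.0, 0.5])
psi = 4.0                                      # error precision, variance 1/psi
y = X @ true_beta + rng.normal(scale=psi ** -0.5, size=n)
beta0 = np.array([0.5, 0.5, 0.5])              # prior mean

# The distance between the posterior mean and the least squares
# estimate shrinks as g grows.
gaps = {}
for g in (0.1, 1.0, 100.0, 1e6):
    mean, cov, beta_hat = g_prior_posterior(X, y, beta0, g, psi)
    gaps[g] = np.linalg.norm(mean - beta_hat)
    print(f"g = {g:>9}: ||posterior mean - beta_hat|| = {gaps[g]:.2e}")
```

Small values of `g` pull the estimate toward `beta0`, while large values leave it close to the least squares fit.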

Selection of g

Estimation of $g$ is slightly less straightforward than estimation of $\beta$. A variety of methods have been proposed, including Bayes and empirical Bayes estimators.

References

  1. Zellner, A. (1986). "On Assessing Prior Distributions and Bayesian Regression Analysis with g Prior Distributions". In Goel, P.; Zellner, A. (eds.). Bayesian Inference and Decision Techniques: Essays in Honor of Bruno de Finetti. Studies in Bayesian Econometrics and Statistics. Vol. 6. New York: Elsevier. pp. 233–243. ISBN 978-0-444-87712-3.
  2. George, E.; Foster, D. P. (2000). "Calibration and empirical Bayes variable selection". Biometrika. 87 (4): 731–747. CiteSeerX 10.1.1.18.3731. doi:10.1093/biomet/87.4.731.
  3. Liang, F.; Paulo, R.; Molina, G.; Clyde, M. A.; Berger, J. O. (2008). "Mixtures of g priors for Bayesian variable selection". Journal of the American Statistical Association. 103 (481): 410–423. CiteSeerX 10.1.1.206.235. doi:10.1198/016214507000001337.

Further reading

  • Datta, Jyotishka; Ghosh, Jayanta K. (2015). "In Search of Optimal Objective Priors for Model Selection and Estimation". In Upadhyay, Satyanshu Kumar; et al. (eds.). Current Trends in Bayesian Methodology with Applications. CRC Press. pp. 225–243. ISBN 978-1-4822-3511-1.
  • Marin, Jean-Michel; Robert, Christian P. (2007). "Regression and Variable Selection". Bayesian Core : A Practical Approach to Computational Bayesian Statistics. New York: Springer. pp. 47–84. doi:10.1007/978-0-387-38983-7_3. ISBN 978-0-387-38979-0.