Working–Hotelling procedure

Method of simultaneous inference


In statistics, particularly regression analysis, the Working–Hotelling procedure, named after Holbrook Working and Harold Hotelling, is a method of simultaneous estimation in linear regression models. One of the first developments in simultaneous inference, it was devised by Working and Hotelling for the simple linear regression model in 1929. It provides a confidence region for multiple mean responses, that is, it gives the upper and lower bounds of more than one value of a dependent variable at several levels of the independent variables at a certain confidence level. The resulting confidence bands are known as the Working–Hotelling–Scheffé confidence bands.

Like the closely related Scheffé's method in the analysis of variance, which considers all possible contrasts, the Working–Hotelling procedure considers all possible values of the independent variables; that is, in a particular regression model, the probability that all the Working–Hotelling confidence intervals cover the true value of the mean response is the confidence coefficient. As such, when only a small subset of the possible values of the independent variable is considered, it is more conservative and yields wider intervals than competitors like the Bonferroni correction at the same level of confidence. It outperforms the Bonferroni correction as more values are considered.

Statement

Simple linear regression

Consider a simple linear regression model $Y=\beta _{0}+\beta _{1}X+\varepsilon$ , where $Y$ is the response variable and $X$ the explanatory variable, and let $b_{0}$ and $b_{1}$ be the least-squares estimates of $\beta _{0}$ and $\beta _{1}$ respectively. Then the least-squares estimate of the mean response $E(Y_{i})$ at the level $X=x_{i}$ is ${\hat {Y}}_{i}=b_{0}+b_{1}x_{i}$ . Assuming that the errors are independent and identically normally distributed, it can be shown that a $1-\alpha$ confidence interval for the mean response at a given level of $X$ is as follows:

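{\hat {y}}_{i}\in \left[b_{0}+b_{1}x_{i}\pm t_{\alpha /2,{\text{df}}=n-2}{\sqrt {\left({\frac {1}{n-2}}\sum _{j=1}^{n}e_{j}^{\,2}\right)\cdot \left({\frac {1}{n}}+{\frac {(x_{i}-{\bar {x}})^{2}}{\sum _{j=1}^{n}(x_{j}-{\bar {x}})^{2}}}\right)}}\right],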

where $\left({\frac {1}{n-2}}\sum _{j=1}^{n}e_{j}^{\,2}\right)$ is the mean squared error and $t_{\alpha /2,{\text{df}}=n-2}$ denotes the upper ${\frac {\alpha }{2}}^{\text{th}}$ percentile of Student's t-distribution with $n-2$ degrees of freedom.

However, as multiple mean responses are estimated, the confidence level declines rapidly. To fix the confidence coefficient at $1-\alpha$ , the Working–Hotelling approach employs an F-statistic:

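{\hat {y}}_{i}\in \left[b_{0}+b_{1}x_{i}\pm W{\sqrt {\left({\frac {1}{n-2}}\sum _{j=1}^{n}e_{j}^{\,2}\right)\cdot \left({\frac {1}{n}}+{\frac {(x_{i}-{\bar {x}})^{2}}{\sum _{j=1}^{n}(x_{j}-{\bar {x}})^{2}}}\right)}}\right],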

where $W^{2}=2F_{\alpha ,{\text{df}}=(2,n-2)}$ and $F_{\alpha ,{\text{df}}=(2,n-2)}$ denotes the upper $\alpha ^{\text{th}}$ percentile of the F-distribution with $(2,n-2)$ degrees of freedom. The confidence level is $1-\alpha$ simultaneously over all values of $X$ , i.e. $x_{i}\in \mathbb {R}$ .
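The half-width of the band at a given level can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the summary values in the usage lines are hypothetical, not taken from any library or data set. It relies on the fact that, with two numerator degrees of freedom, the upper tail of the F-distribution is $P(F>f)=(1+2f/\nu )^{-\nu /2}$ , which can be inverted in closed form, so no statistics package is needed.

```python
import math


def f_upper_point_2(alpha, nu):
    """Upper-alpha point of the F-distribution with (2, nu) degrees of freedom.

    For two numerator degrees of freedom, P(F > f) = (1 + 2 f / nu) ** (-nu / 2);
    solving this for f gives the closed form below.
    """
    return (nu / 2.0) * (alpha ** (-2.0 / nu) - 1.0)


def wh_half_width(x0, n, mse, x_bar, sxx, alpha=0.05):
    """Half-width of the Working-Hotelling band at the level x0.

    n     : number of observations
    mse   : mean squared error, sum of squared residuals / (n - 2)
    x_bar : sample mean of the explanatory variable
    sxx   : sum of squared deviations of the explanatory variable
    """
    w = math.sqrt(2.0 * f_upper_point_2(alpha, n - 2))
    return w * math.sqrt(mse * (1.0 / n + (x0 - x_bar) ** 2 / sxx))


# Hypothetical summary statistics, chosen only to exercise the function.
at_mean = wh_half_width(x0=10.0, n=20, mse=4.0, x_bar=10.0, sxx=50.0)
far_out = wh_half_width(x0=15.0, n=20, mse=4.0, x_bar=10.0, sxx=50.0)
print(at_mean, far_out)
```

The band is narrowest at ${\bar {x}}$ and widens as $x_{i}$ moves away from it, which is what produces the hyperbolic shape.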

Multiple linear regression

The Working–Hotelling confidence bands can be easily generalised to multiple linear regression. Consider a general linear model, as defined for linear regression, that is,

\mathbf {Y} =\mathbf {X} {\boldsymbol {\beta }}+{\boldsymbol {\varepsilon }},\,

where

\mathbf {Y} ={\begin{pmatrix}Y_{1}\\Y_{2}\\\vdots \\Y_{n}\end{pmatrix}},\quad \mathbf {X} ={\begin{pmatrix}\mathbf {x} _{1}^{\rm {T}}\\\mathbf {x} _{2}^{\rm {T}}\\\vdots \\\mathbf {x} _{n}^{\rm {T}}\end{pmatrix}}={\begin{pmatrix}x_{11}&\cdots &x_{1p}\\x_{21}&\cdots &x_{2p}\\\vdots &\ddots &\vdots \\x_{n1}&\cdots &x_{np}\end{pmatrix}},{\boldsymbol {\beta }}={\begin{pmatrix}\beta _{1}\\\beta _{2}\\\vdots \\\beta _{p}\end{pmatrix}},\quad {\boldsymbol {\varepsilon }}={\begin{pmatrix}\varepsilon _{1}\\\varepsilon _{2}\\\vdots \\\varepsilon _{n}\end{pmatrix}}.

Again, it can be shown that the least-squares estimate of the mean response $E(Y_{i})=\mathbf {x} _{i}^{\rm {T}}{\boldsymbol {\beta }}$ is ${\hat {Y}}_{i}=\mathbf {x} _{i}^{\rm {T}}\mathbf {b}$ , where $\mathbf {b}$ consists of the least-squares estimates of the entries of ${\boldsymbol {\beta }}$ , i.e. $\mathbf {b} =(\mathbf {X} ^{\rm {T}}\mathbf {X} )^{-1}\mathbf {X} ^{\rm {T}}\mathbf {Y}$ . Likewise, it can be shown that a $1-\alpha$ confidence interval for a single mean response is as follows:

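{\hat {y}}_{i}\in \left[\mathbf {x} _{i}^{\rm {T}}\mathbf {b} \pm t_{\alpha /2,{\text{df}}=n-p}{\sqrt {\operatorname {MSE} \cdot \mathbf {x} _{i}^{\rm {T}}(\mathbf {X} ^{\rm {T}}\mathbf {X} )^{-1}\mathbf {x} _{i}}}\right],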

where $\operatorname {MSE}$ is the observed value of the mean squared error ${\frac {1}{n-p}}(\mathbf {Y} ^{\rm {T}}\mathbf {Y} -\mathbf {b} ^{\rm {T}}\mathbf {X} ^{\rm {T}}\mathbf {Y} )$ .

The Working–Hotelling approach to multiple estimations is similar to that of simple linear regression, with only a change in the degrees of freedom:

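{\hat {y}}_{i}\in \left[\mathbf {x} _{i}^{\rm {T}}\mathbf {b} \pm W{\sqrt {\operatorname {MSE} \cdot \mathbf {x} _{i}^{\rm {T}}(\mathbf {X} ^{\rm {T}}\mathbf {X} )^{-1}\mathbf {x} _{i}}}\right],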

where $W^{2}=pF_{\alpha ,{\text{df}}=(p,n-p)}$ , which reduces to the simple linear regression case when $p=2$ .
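The matrix form lends itself to a short numerical sketch. The following Python function is an illustration under stated assumptions, not a reference implementation: it assumes NumPy and SciPy are available, takes the critical value as $W^{2}=pF_{\alpha ,{\text{df}}=(p,n-p)}$ for a design matrix with $p$ columns (including any intercept column), and uses the hypothetical name `working_hotelling_limits`. The data in the usage lines are synthetic.

```python
import numpy as np
from scipy import stats


def working_hotelling_limits(X, y, X_new, alpha=0.05):
    """Simultaneous limits for the mean response at each row of X_new.

    X     : (n, p) design matrix, including the intercept column if any
    y     : (n,) response vector
    X_new : (m, p) levels at which the mean response is estimated
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X_new = np.atleast_2d(np.asarray(X_new, dtype=float))
    n, p = X.shape

    xtx_inv = np.linalg.inv(X.T @ X)
    b = xtx_inv @ X.T @ y                    # least-squares estimates
    resid = y - X @ b
    mse = resid @ resid / (n - p)            # mean squared error

    # Upper-alpha point of F(p, n - p) is the (1 - alpha) quantile.
    w = np.sqrt(p * stats.f.ppf(1.0 - alpha, p, n - p))

    fit = X_new @ b
    # x_i^T (X^T X)^{-1} x_i for every row x_i of X_new
    quad = np.einsum("ij,jk,ik->i", X_new, xtx_inv, X_new)
    half_width = w * np.sqrt(mse * quad)
    return fit - half_width, fit + half_width


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X_demo = np.column_stack([np.ones(20), rng.normal(size=20), rng.normal(size=20)])
y_demo = X_demo @ np.array([1.0, 2.0, -1.0]) + rng.normal(scale=0.5, size=20)
lower, upper = working_hotelling_limits(X_demo, y_demo, X_demo[:3])
print(lower, upper)
```

Because the critical value does not depend on the number of rows in `X_new`, the same limits remain valid however many levels are examined.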

Graphical representation

In the simple linear regression case, Working–Hotelling–Scheffé confidence bands, drawn by connecting the upper and lower limits of the mean response at every level, take the shape of hyperbolas. In drawing, they are sometimes approximated by the Graybill–Bowden confidence bands, which are linear and hence easier to graph:

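\beta _{0}+\beta _{1}(x_{i}-{\bar {x}})\in \left[b_{0}+b_{1}(x_{i}-{\bar {x}})\pm m_{\alpha ,2,{\text{df}}=n-2}{\sqrt {{\frac {1}{n-2}}\sum _{j=1}^{n}e_{j}^{\,2}}}\cdot \left({\frac {1}{\sqrt {n}}}+{\frac {|x_{i}-{\bar {x}}|}{\sqrt {\sum _{j=1}^{n}(x_{j}-{\bar {x}})^{2}}}}\right)\right],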

where $m_{\alpha ,2,{\text{df}}=n-2}$ denotes the upper $\alpha ^{\text{th}}$ percentile of the Studentized maximum modulus distribution with two means and $n-2$ degrees of freedom.

The simple linear regression model with a Working–Hotelling confidence band.

Numerical example

This example uses the same data as the ordinary least squares example:

Height (m)	1.47	1.50	1.52	1.55	1.57	1.60	1.63	1.65	1.68	1.70	1.73	1.75	1.78	1.80	1.83
Weight (kg)	52.21	53.12	54.48	55.84	57.20	58.57	59.93	61.29	63.11	64.47	66.28	68.10	69.92	72.19	74.46

A simple linear regression model is fitted to these data. The least-squares estimates $b_{0}$ and $b_{1}$ are −39.06 and 61.27 respectively. The goal is to estimate the mean mass of women given their heights at the 95% confidence level. The critical value is $W={\sqrt {2F_{0.05,{\text{df}}=(2,15-2)}}}=2.758828$ , so that $W^{2}\approx 7.6111$ . It was also found that ${\bar {x}}=1.651$ , $\sum _{j=1}^{n}e_{j}^{\,2}=7.490558$ , $\operatorname {MSE} =0.5761968$ and $\sum _{j=1}^{n}(x_{j}-{\bar {x}})^{2}=0.1827$ . Then, to estimate the mean mass of all women of a particular height, the following Working–Hotelling–Scheffé band is obtained:

{\hat {y}}_{i}\in \left[-39.06+61.27x_{i}\pm 2.758828{\sqrt {0.5761968\cdot \left({\frac {1}{15}}+{\frac {(x_{i}-1.651)^{2}}{0.1827}}\right)}}\right],

which results in the graph on the left.
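The calculation can be retraced from the tabulated data with a short Python sketch. It is meant as a check on the arithmetic rather than a definitive implementation: variable names such as `sxx` and the helper `band` are illustrative, and the F critical value uses the closed form available for two numerator degrees of freedom, $F_{\alpha ,{\text{df}}=(2,\nu )}={\tfrac {\nu }{2}}(\alpha ^{-2/\nu }-1)$ .

```python
import math

heights = [1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
           1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83]
weights = [52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
           63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46]

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(weights) / n

# Centred sums of squares and cross-products.
sxx = sum((x - x_bar) ** 2 for x in heights)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))

# Least-squares estimates and mean squared error.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(heights, weights))
mse = sse / (n - 2)

# Upper 5% point of F(2, n - 2) in closed form, then W = sqrt(2 F).
alpha = 0.05
f_crit = ((n - 2) / 2.0) * (alpha ** (-2.0 / (n - 2)) - 1.0)
w = math.sqrt(2.0 * f_crit)


def band(x0):
    """Working-Hotelling limits for the mean weight at height x0 (metres)."""
    fit = b0 + b1 * x0
    half_width = w * math.sqrt(mse * (1.0 / n + (x0 - x_bar) ** 2 / sxx))
    return fit - half_width, fit + half_width


print(f"b0={b0:.2f}, b1={b1:.2f}, MSE={mse:.4f}, Sxx={sxx:.4f}, W={w:.4f}")
print(band(1.70))
```

Evaluating `band` over a grid of heights and joining the limits traces out the two hyperbolic curves of the confidence band.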

Comparison with other methods

The Working–Hotelling approach may give tighter or looser confidence limits than the Bonferroni correction. In general, for small families of statements, the Bonferroni bounds may be tighter, but as the number of estimated values increases, the Working–Hotelling procedure yields narrower limits. This is because the confidence level of the Working–Hotelling–Scheffé bounds is exactly $1-\alpha$ when all values of the independent variables, i.e. $x_{i}\in \mathbb {R}$ , are considered. Alternatively, from an algebraic perspective, the critical value $\pm W$ remains constant as the number of estimates increases, whereas the corresponding Bonferroni critical value, $\pm t_{\alpha /(2g),{\text{df}}=n-p}$ , grows without bound as the number $g$ of estimates increases. Therefore, the Working–Hotelling method is better suited to large families of estimates, whereas Bonferroni is preferred if only a few mean responses are to be estimated. In practice, both sets of limits are usually computed and the narrower one is chosen.
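The trade-off can be made concrete by comparing the two critical values directly. The sketch below is illustrative and rests on assumptions: SciPy is assumed to be available, the Bonferroni intervals are taken to be two-sided with critical value $t_{\alpha /(2g),{\text{df}}=n-p}$ , the Working–Hotelling critical value is taken as $W={\sqrt {pF_{\alpha ,{\text{df}}=(p,n-p)}}}$ , and the function names are hypothetical.

```python
import math
from scipy import stats


def wh_critical(alpha, p, n):
    """Working-Hotelling critical value W; it does not depend on g."""
    return math.sqrt(p * stats.f.ppf(1.0 - alpha, p, n - p))


def bonferroni_critical(alpha, g, p, n):
    """Two-sided Bonferroni critical value for g simultaneous intervals."""
    return stats.t.ppf(1.0 - alpha / (2.0 * g), n - p)


def narrower_method(alpha, g, p, n):
    """Name of the procedure with the smaller critical value for g estimates."""
    if bonferroni_critical(alpha, g, p, n) < wh_critical(alpha, p, n):
        return "Bonferroni"
    return "Working-Hotelling"


# Simple linear regression (p = 2) with n = 15 observations, alpha = 0.05.
for g in (1, 2, 5, 10, 50):
    print(g, bonferroni_critical(0.05, g, 2, 15), wh_critical(0.05, 2, 15),
          narrower_method(0.05, g, 2, 15))
```

Since both procedures multiply the same standard error, comparing the critical values is enough to decide which interval is narrower at every level.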

Another alternative to the Working–Hotelling–Scheffé band is the Gafarian band, which is used when a confidence band of equal width at all levels is needed.

The Working–Hotelling procedure is based on the same principles as Scheffé's method, which gives family confidence intervals for all possible contrasts. Their proofs are almost identical. This is because both methods estimate linear combinations of mean response at all factor levels. However, the Working–Hotelling procedure does not deal with contrasts but with different levels of the independent variable, so there is no requirement that the coefficients of the parameters sum up to zero. Therefore, it has one more degree of freedom.

Footnotes

^ Miller (1966), p. 1
^ Miller (2014)
^ Neter, Wasserman and Kutner, pp. 163–165
^ Neter, Wasserman and Kutner, pp. 244–245
^ Miller (1966), pp. 123–127
^ Westfall, Tobias and Wolfinger, pp. 277–280

Bibliography

Graybill, Franklin A.; Bowden, David C. (1967-06-01). "Linear Segment Confidence Bands for Simple Linear Models". Journal of the American Statistical Association. 62 (318): 403–408. doi:10.1080/01621459.1967.10482917. ISSN 0162-1459.
Miller, Rupert G. (1966). Simultaneous Statistical Inference. New York: Springer-Verlag. ISBN 978-1-4613-8124-2.
Miller, R. (2014). "Multiple Comparisons I". Encyclopedia of Statistical Sciences. doi:10.1002/0471667196. hdl:11693/51057. ISBN 9780471667193.
Neter, John; Wasserman, William; Kutner, Michael (1990). Applied Linear Statistical Models. Tokyo: Richard D Irwin, Inc. ISBN 978-0-256-08338-5.
Westfall, Peter H; Tobias, R D; Wolfinger, Russell Dean (2011). Multiple comparisons and multiple tests using SAS. Cary, N.C.: SAS Pub. ISBN 9781607648857.
Working, Holbrook; Hotelling, Harold (1929-03-01). "Applications of the Theory of Error to the Interpretation of Trends". Journal of the American Statistical Association. 24 (165A): 73–85. doi:10.1080/01621459.1929.10506274. ISSN 0162-1459.
