Credible interval

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Figure: The highest-density 90% credible interval of a posterior probability distribution.

In Bayesian statistics, a credible interval is an interval used to characterize a probability distribution. It is defined such that an unobserved parameter value has a particular probability $\alpha$ of falling within it. For example, in an experiment that determines the distribution of possible values of the parameter $\mu$, if the probability that $\mu$ lies between 35 and 45 is $\alpha = 0.95$, then $35 \leq \mu \leq 45$ is a 95% credible interval.

Credible intervals are typically used to characterize posterior probability distributions or predictive probability distributions. Their generalization to disconnected or multivariate sets is called a credible region.

Credible intervals are a Bayesian analog to confidence intervals in frequentist statistics. The two concepts arise from different philosophies: Bayesian intervals treat their bounds as fixed and the estimated parameter as a random variable, whereas frequentist confidence intervals treat their bounds as random variables and the parameter as a fixed value. Also, Bayesian credible intervals use (and indeed, require) knowledge of the situation-specific prior distribution, while the frequentist confidence intervals do not.

Definitions

Credible regions are not unique; any given probability distribution has an infinite number of credible regions of probability $\alpha$. For example, in the univariate case, there are multiple definitions for a suitable interval or region:

  • The smallest interval, sometimes called the highest density interval (HDI). This interval will necessarily include the median whenever $\alpha \geq 0.5$. In addition, when the distribution is unimodal, this interval will include the mode.
  • The smallest region, sometimes called the highest density region (HDR). For a multimodal distribution, this is not necessarily an interval as it can be disconnected. This region will always include the mode.
  • A quantile-based interval (QBI), which is computed by taking the inter-quantile interval $[q_\beta, q_{\beta+\alpha}]$ for some $\beta \in [0, 1-\alpha]$. For instance, the median interval of probability $\alpha$ is the interval where the probability of being below the interval is as likely as being above it, that is to say the interval $[q_{(1-\alpha)/2}, q_{(1+\alpha)/2}]$. It is sometimes also called the equal-tailed interval, and it will always include the median. Many other QBIs can be defined, such as the lowest interval $[q_0, q_\alpha]$, or the highest interval $[q_{1-\alpha}, q_1]$. These intervals may be more suited for bounded variables.
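
As a rough illustration of the quantile-based definitions above, the following Python sketch evaluates three QBIs for a unit-rate exponential distribution. That distribution is chosen here only because its quantile function has a closed form; it and the helper names `exp_quantile` and `quantile_interval` are assumptions made for this example.

```python
import math

def exp_quantile(p, rate=1.0):
    """Quantile function of an exponential distribution (illustrative choice)."""
    if p >= 1.0:
        return math.inf
    return -math.log1p(-p) / rate

def quantile_interval(quantile, alpha, beta):
    """Return the interval [q_beta, q_(beta + alpha)] for 0 <= beta <= 1 - alpha."""
    if not 0.0 <= beta <= 1.0 - alpha + 1e-12:
        raise ValueError("beta must lie in [0, 1 - alpha]")
    return quantile(beta), quantile(min(beta + alpha, 1.0))

alpha = 0.9
lowest = quantile_interval(exp_quantile, alpha, 0.0)                     # [q_0, q_alpha]
equal_tailed = quantile_interval(exp_quantile, alpha, (1 - alpha) / 2)   # median interval
highest = quantile_interval(exp_quantile, alpha, 1 - alpha)              # [q_(1-alpha), q_1]

for name, (lo, hi) in [("lowest", lowest),
                       ("equal-tailed", equal_tailed),
                       ("highest", highest)]:
    print(f"{name:>12}: [{lo:.4f}, {hi:.4f}]")
```

Because the exponential density is strictly decreasing, the lowest interval is also the shortest of the three, so in this particular case it coincides with the highest density interval.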

One may also define the interval for which the mean is the central point, provided that the mean exists.

HDRs can easily be generalized to the multivariate case, where they are bounded by probability density contour lines. They will always contain the mode, but not necessarily the mean, the coordinate-wise median, or the geometric median.

Credible intervals can also be estimated through the use of simulation techniques such as Markov chain Monte Carlo.
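
A minimal sketch of how such intervals might be estimated from posterior draws follows. The draws here come from Python's `random.gammavariate` as a stand-in for real MCMC output, and the function names are hypothetical. The highest density interval is approximated by the narrowest window of consecutive sorted draws that holds a fraction $\alpha$ of them, which is only appropriate when the distribution is unimodal.

```python
import math
import random

def equal_tailed_interval(samples, alpha):
    """Empirical interval between the (1-alpha)/2 and (1+alpha)/2 quantiles."""
    xs = sorted(samples)
    n = len(xs)
    lo = xs[math.floor((1 - alpha) / 2 * (n - 1))]
    hi = xs[math.ceil((1 + alpha) / 2 * (n - 1))]
    return lo, hi

def highest_density_interval(samples, alpha):
    """Shortest interval containing a fraction alpha of the samples.

    Assumes a unimodal distribution; a multimodal one may need a
    disconnected region instead of a single interval.
    """
    xs = sorted(samples)
    n = len(xs)
    k = math.ceil(alpha * n)  # number of draws the interval must contain
    # Slide a window over k consecutive order statistics; keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]

# Stand-in for MCMC output: draws from a skewed Gamma(shape=2, scale=1).
rng = random.Random(0)
draws = [rng.gammavariate(2.0, 1.0) for _ in range(20000)]

eti = equal_tailed_interval(draws, 0.9)
hdi = highest_density_interval(draws, 0.9)
print("equal-tailed:", eti)
print("highest density:", hdi)
```

For a skewed distribution such as this one, the highest density interval is shifted toward the mode and is no wider than the equal-tailed interval.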

Contrasts with confidence interval

See also: Confidence interval § Credible interval

A frequentist 95% confidence interval means that with a large number of repeated samples, 95% of such calculated confidence intervals would include the true value of the parameter. In frequentist terms, the parameter is fixed (cannot be considered to have a distribution of possible values) and the confidence interval is random (as it depends on the random sample).
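
The repeated-sampling interpretation can be checked with a small simulation. The sketch below assumes normally distributed data with a known standard deviation and uses the standard known-variance z-interval; the true mean, sample size and function names are illustrative choices, not taken from the article's sources.

```python
import math
import random
from statistics import NormalDist, fmean

def normal_mean_ci(data, sigma, level=0.95):
    """Known-sigma z-interval for the mean of normally distributed data."""
    z = NormalDist().inv_cdf((1 + level) / 2)
    centre = fmean(data)
    half_width = z * sigma / math.sqrt(len(data))
    return centre - half_width, centre + half_width

def coverage(true_mu, sigma, n, repetitions, seed=0):
    """Fraction of repeated-sample intervals that contain the fixed true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        lo, hi = normal_mean_ci(sample, sigma)
        hits += lo <= true_mu <= hi  # the interval is random, true_mu is not
    return hits / repetitions

rate = coverage(true_mu=40.0, sigma=5.0, n=25, repetitions=4000)
print(f"empirical coverage: {rate:.3f}")
```

The printed fraction should be close to 0.95. The parameter stays fixed throughout, and only the interval endpoints change from one simulated sample to the next.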

Bayesian credible intervals differ from frequentist confidence intervals in two major respects:

  • credible intervals are intervals whose values have a (posterior) probability density, representing the plausibility that the parameter has those values, whereas confidence intervals regard the population parameter as fixed and therefore not an object of probability. Within confidence intervals, confidence refers to the randomness of the confidence interval itself under repeated trials, whereas credible intervals analyse the uncertainty of the target parameter given the data at hand.
  • credible intervals and confidence intervals treat nuisance parameters in radically different ways.

For the case of a single parameter and data that can be summarised in a single sufficient statistic, it can be shown that the credible interval and the confidence interval coincide if the unknown parameter is a location parameter (i.e. the forward probability function has the form $\mathrm{Pr}(x|\mu) = f(x - \mu)$), with a flat (uniform) prior. They also coincide if the unknown parameter is a scale parameter (i.e. the forward probability function has the form $\mathrm{Pr}(x|s) = f(x/s)$), with a Jeffreys' prior $\mathrm{Pr}(s|I) \propto 1/s$; the latter follows because taking the logarithm of such a scale parameter turns it into a location parameter with a uniform distribution. But these are distinctly special (albeit important) cases; in general no such equivalence can be made.
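
The location-parameter case can be checked numerically. The sketch below assumes a normal likelihood with known standard deviation and a flat prior on $\mu$, integrates the posterior on a grid, and compares the resulting equal-tailed credible interval with the usual z-based confidence interval. The data values, grid settings and function names are made up for illustration.

```python
import math
from statistics import NormalDist, fmean

def flat_prior_credible_interval(data, sigma, level=0.95, grid_size=20001):
    """Equal-tailed credible interval for mu under a flat prior, by grid integration."""
    xbar = fmean(data)
    se = sigma / math.sqrt(len(data))
    # Grid wide enough to hold essentially all of the posterior mass.
    lo, hi = xbar - 10 * se, xbar + 10 * se
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    # Log-likelihood up to a constant; the flat prior contributes nothing.
    log_post = [-sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2) for mu in grid]
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    # Walk the cumulative posterior to locate the two tail quantiles.
    targets = [(1 - level) / 2, (1 + level) / 2]
    bounds, cum, t = [], 0.0, 0
    for mu, w in zip(grid, weights):
        cum += w / total
        while t < 2 and cum >= targets[t]:
            bounds.append(mu)
            t += 1
    return bounds[0], bounds[1]

def z_confidence_interval(data, sigma, level=0.95):
    """Frequentist known-sigma confidence interval for the mean."""
    z = NormalDist().inv_cdf((1 + level) / 2)
    xbar = fmean(data)
    half = z * sigma / math.sqrt(len(data))
    return xbar - half, xbar + half

data = [38.2, 41.5, 39.9, 42.3, 40.1, 37.8, 43.0, 39.4]  # made-up observations
credible = flat_prior_credible_interval(data, sigma=2.0)
confidence = z_confidence_interval(data, sigma=2.0)
print("credible:  ", credible)
print("confidence:", confidence)
```

The two intervals agree up to the grid resolution. Replacing the flat prior with an informative one would shift the credible interval and break the agreement.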


Further reading

  • Bolstad, William M.; Curran, James M. (2016). "Comparing Bayesian and Frequentist Inferences for Mean". Introduction to Bayesian Statistics (Third ed.). John Wiley & Sons. pp. 237–253. ISBN 978-1-118-09156-2.