Bias (statistics): Difference between revisions

Revision as of 19:46, 21 November 2021 by C.Hua Wang (minor edit): Changed the word "error" to "bias". (Tag: Visual edit)		Revision as of 13:11, 9 December 2021 by 178.3.231.144: Rewrite the (frankly terrible) introduction a bit. This article is still confused and doesn't know whether it wants to talk about bias of estimators, bias of tests, or statistical bias in general, though it leans heavily toward the first.
Line 5:		Line 5:
	{{more citations needed\|date=June 2012}}		{{more citations needed\|date=June 2012}}
	}}		}}
	'''Statistical bias''' is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated. The bias of an estimator of a parameter should not be confused with its degree of precision as the degree of precision is a measure of the sampling error. ~~Mathematically bias can be defined as:~~		'''Statistical bias''' is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated. The bias of an estimator of a parameter should not be confused with its degree of precision, as the degree of precision is a measure of the sampling error.

	:~~Let~~ <math>T</math> be a statistic used to estimate a parameter <math>\theta</math>. If ~~<math>\operatorname E(T)=\theta + \operatorname{bias}(\theta)</math> then <math>\operatorname{bias}(\theta)</math> is called the bias of the statistic <math>T</math>, where~~ <math>\operatorname E(T) </math> ~~represents~~ the expected value of ~~the statistics~~ <math>T</math>. ~~If <math>\operatorname{bias}(\theta)=0</math>~~, ~~then <math>\operatorname E(T)=\theta</math>. So, <math>T</math> is an unbiased estimator of the true parameter, say <math>\theta</math>.~~		Mathematically, the bias is defined as follows: let <math>T</math> be a statistic used to estimate a parameter <math>\theta</math>, and let <math>\operatorname E(T)</math> denote the expected value of <math>T</math>. Then,

			:<math>\operatorname{bias}(T, \theta) = \operatorname{bias}(T) = \operatorname E(T) - \theta</math>

			is called the bias of the statistic <math>T</math> (with respect to <math>\theta</math>). If <math>\operatorname{bias}(T, \theta)=0</math>, then <math>T</math> is said to be an '''unbiased estimator''' of <math>\theta</math>; otherwise, it is said to be a '''biased estimator''' of <math>\theta</math>.

			There is no universally-accepted standard notation for the bias; commonly it is denoted by <math>\operatorname{bias}</math>, <math>\operatorname{Bias}</math> or <math>\operatorname{BIAS}</math>. The bias of a statistic <math>T</math> is always relative to the parameter <math>\theta</math> it is used to estimate, but the parameter <math>\theta</math> is often omitted when it is clear from the context what is being estimated.

	== Introduction ==		== Introduction ==

Revision as of 13:11, 9 December 2021

Situation where the mean of many measurements differs significantly from the actual value

Statistical bias is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated. The bias of an estimator of a parameter should not be confused with its degree of precision, as the degree of precision is a measure of the sampling error.

Mathematically, the bias is defined as follows: let $T$ be a statistic used to estimate a parameter $\theta$, and let $\operatorname{E}(T)$ denote the expected value of $T$. Then,

$$\operatorname{bias}(T,\theta)=\operatorname{bias}(T)=\operatorname{E}(T)-\theta$$

is called the bias of the statistic $T$ (with respect to $\theta$). If $\operatorname{bias}(T,\theta)=0$, then $T$ is said to be an unbiased estimator of $\theta$; otherwise, it is said to be a biased estimator of $\theta$.

There is no universally accepted standard notation for the bias; commonly it is denoted by $\operatorname{bias}$, $\operatorname{Bias}$ or $\operatorname{BIAS}$. The bias of a statistic $T$ is always relative to the parameter $\theta$ it is used to estimate, but the parameter $\theta$ is often omitted when it is clear from the context what is being estimated.
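As an illustration of the definition, the following minimal sketch computes $\operatorname{E}(T)-\theta$ exactly for two estimators of a variance. It rests on assumptions that are not part of the article: a hypothetical population of six equally likely values, samples of size 3 drawn with replacement, and the illustrative names `exact_bias`, `var_divide_by_n` and `var_divide_by_n_minus_1`. The expected value is obtained by enumerating every possible sample, so no random simulation is involved.

```python
from fractions import Fraction
from itertools import product


def exact_bias(estimator, population, n, theta):
    """Return E(T) - theta, where E(T) is computed by enumerating every
    equally likely sample of size n drawn with replacement."""
    samples = list(product(population, repeat=n))
    expected = sum(estimator(s) for s in samples) / len(samples)
    return expected - theta


def mean(xs):
    # Exact arithmetic mean as a Fraction.
    return Fraction(sum(xs), len(xs))


def var_divide_by_n(xs):
    # Sum of squared deviations divided by n.
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def var_divide_by_n_minus_1(xs):
    # Sum of squared deviations divided by n - 1.
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


# Hypothetical population; theta is its true variance.
population = [1, 2, 3, 4, 5, 6]
theta = var_divide_by_n(population)  # 35/12

bias_n = exact_bias(var_divide_by_n, population, 3, theta)
bias_n_minus_1 = exact_bias(var_divide_by_n_minus_1, population, 3, theta)

print(bias_n)          # -35/36
print(bias_n_minus_1)  # 0
```

Under these assumptions, dividing by $n$ gives a biased estimator of the variance (it underestimates on average), while dividing by $n-1$ gives an unbiased one.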

Introduction

Any measurement may be subject to bias, and sometimes these biases have a serious impact on the results. For example, consider a survey of people's buying habits. If the sample is not large enough, the results may not be representative of the buying habits of the whole population. That is, there may be discrepancies between the survey results and the actual values. Understanding the sources of statistical bias therefore helps us assess whether our results are close to the true values.

Types

A statistic is biased if it is calculated in such a way that it is systematically different from the population parameter being estimated. The following lists some types of biases, which can overlap.

- Selection bias involves some individuals being more likely to be selected for study than others, biasing the sample. This can also be termed sampling bias or Berksonian bias.
  - Spectrum bias arises from evaluating diagnostic tests on biased patient samples, leading to an overestimate of the sensitivity and specificity of the test.
- The bias of an estimator is the difference between an estimator's expected value and the true value of the parameter being estimated.
  - Omitted-variable bias is the bias that appears in estimates of parameters in regression analysis when the assumed specification omits an independent variable that should be in the model.
- In statistical hypothesis testing, a test is said to be unbiased if, for some alpha level (between 0 and 1), the probability that the null hypothesis is rejected is less than or equal to the alpha level over the entire parameter space defined by the null hypothesis, while the probability that it is rejected is greater than or equal to the alpha level over the entire parameter space defined by the alternative hypothesis.
- Detection bias occurs when a phenomenon is more likely to be observed for a particular set of study subjects. For instance, the syndemic involving obesity and diabetes may mean doctors are more likely to look for diabetes in obese patients than in thinner patients, inflating the apparent rate of diabetes among obese patients because of skewed detection efforts.
- In educational measurement, bias is defined as "Systematic errors in test content, test administration, and/or scoring procedures that can cause some test takers to get either lower or higher scores than their true ability would merit. The source of the bias is irrelevant to the trait the test is intended to measure."
- Funding bias may lead to the selection of outcomes, test samples, or test procedures that favor a study's financial sponsor.
- Reporting bias involves a skew in the availability of data, such that observations of a certain kind are more likely to be reported.
- Analytical bias arises from the way that the results are evaluated.
- Exclusion bias arises from the systematic exclusion of certain individuals from the study.
- Attrition bias arises from a loss of participants, for example loss to follow-up during a study.
- Recall bias arises from differences in the accuracy or completeness of participants' recollections of past events. For example, patients may be unable to recall exactly how many cigarettes they smoked last week, leading to over-estimation or under-estimation.
- Observer bias arises when the researcher subconsciously influences the experiment because of cognitive bias, where judgment may alter how an experiment is carried out or how results are recorded.
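
Omitted-variable bias, listed above, can be made concrete with a small sketch. The data are hypothetical and noise-free, the coefficients `beta_x` and `beta_z` and the helper `ols_slope` are illustrative names rather than anything from the article, and the setting is the simplest one: a response that depends on two variables, regressed on only one of them. In that setting the fitted slope is off by the omitted coefficient times the slope of the omitted variable on the included one.

```python
from fractions import Fraction


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs (intercept included)."""
    n = len(xs)
    mx = Fraction(sum(xs), n)
    my = Fraction(sum(ys), n)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data: y depends on both x and z, and z is correlated with x.
beta_x, beta_z = 2, 3
x = [0, 1, 2, 3]
z = [0, 1, 1, 2]
y = [beta_x * xi + beta_z * zi for xi, zi in zip(x, z)]

# Regress y on x alone, omitting z from the model.
slope_short = ols_slope(x, y)
bias = slope_short - beta_x

# The discrepancy equals beta_z times the slope of z on x.
predicted_bias = beta_z * ols_slope(x, z)

print(slope_short)     # 19/5
print(bias)            # 9/5
print(predicted_bias)  # 9/5
```

If `z` were uncorrelated with `x`, the slope of `z` on `x` would be zero and omitting `z` would not bias the estimated coefficient of `x` in this sketch.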

References

Rothman, Kenneth J.; Greenland, Sander; Lash, Timothy L. (2008). Modern Epidemiology. Lippincott Williams & Wilkins. pp. 134–137.
Neyman, Jerzy; Pearson, Egon S. (1936). "Contributions to the theory of testing statistical hypotheses". Statistical Research Memoirs. 1: 1–37.
National Council on Measurement in Education (NCME). "NCME Assessment Glossary". Archived from the original on 2017-07-22.
Higgins, Julian P. T.; Green, Sally (March 2011). "8. Introduction to sources of bias in clinical trials". In Higgins, Julian P. T.; et al. (eds.). Cochrane Handbook for Systematic Reviews of Interventions (version 5.1). The Cochrane Collaboration.
