Dixon's Q test

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Criterion for identification and rejection of outliers

In statistics, Dixon's Q test, or simply the Q test, is used to identify and reject outliers. The test assumes that the data are normally distributed and, per Robert Dean and Wilfrid Dixon, among others, it should be used sparingly and never more than once on a data set. To apply a Q test for bad data, arrange the data in order of increasing value and calculate Q as follows:

$$Q = \frac{\text{gap}}{\text{range}}$$

Here, gap is the absolute difference between the suspected outlier and the value closest to it, and range is the difference between the largest and smallest values in the data set. If Q > Qtable, where Qtable is a reference value corresponding to the sample size and confidence level, reject the questionable point. Only one point may be rejected from a data set using a Q test.
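The definition above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `dixon_q` and the sample measurements are assumptions made for the example. It returns Q for both the smallest and the largest value, so whichever end looks suspicious can be compared against the table value.

```python
def dixon_q(values):
    """Return (q_low, q_high): Dixon's Q for the smallest and largest value.

    Hypothetical helper for illustration; Q = gap / range as defined above.
    """
    if len(values) < 3:
        raise ValueError("need at least three observations")
    ordered = sorted(values)
    data_range = ordered[-1] - ordered[0]
    if data_range == 0:
        raise ValueError("all values are identical; Q is undefined")
    # Gap between each extreme and its nearest neighbour, divided by the range.
    q_low = (ordered[1] - ordered[0]) / data_range
    q_high = (ordered[-1] - ordered[-2]) / data_range
    return q_low, q_high


# Made-up measurements, used only to show the call.
q_low, q_high = dixon_q([10.0, 12.0, 12.5, 13.0, 14.0])
print(q_low, q_high)  # 0.5 0.25
```

Sorting inside the function means the caller does not need to order the data first.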

Example

Consider the data set:

0.189, 0.167, 0.187, 0.183, 0.186, 0.182, 0.181, 0.184, 0.181, 0.177

Now rearrange in increasing order:

0.167, 0.177, 0.181, 0.181, 0.182, 0.183, 0.184, 0.186, 0.187, 0.189

We hypothesize that 0.167 is an outlier. Calculate Q:

$$Q = \frac{\text{gap}}{\text{range}} = \frac{|0.177 - 0.167|}{0.189 - 0.167} = 0.455.$$

With 10 observations and at 90% confidence, Q = 0.455 > 0.412 = Qtable, so we conclude 0.167 is indeed an outlier. However, at 95% confidence, Q = 0.455 < 0.466 = Qtable, so 0.167 is not considered an outlier.
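The arithmetic of this example can be checked with a few lines of Python. The data and the two limit values (0.412 and 0.466, for 10 observations) are taken from the article; the variable names are chosen here for illustration only.

```python
data = [0.189, 0.167, 0.187, 0.183, 0.186, 0.182, 0.181, 0.184, 0.181, 0.177]

ordered = sorted(data)
gap = ordered[1] - ordered[0]          # |0.177 - 0.167|
data_range = ordered[-1] - ordered[0]  # 0.189 - 0.167
q = gap / data_range

# Limit values for n = 10, as quoted in the example.
Q_CRIT_90 = 0.412
Q_CRIT_95 = 0.466

reject_at_90 = q > Q_CRIT_90
reject_at_95 = q > Q_CRIT_95

print(f"Q = {q:.3f}")                         # Q = 0.455
print("reject at 90%:", reject_at_90)         # True
print("reject at 95%:", reject_at_95)         # False
```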

McBane notes: Dixon provided related tests intended to search for more than one outlier, but they are much less frequently used than the r10 or Q version that is intended to eliminate a single outlier.

Table

This table summarizes the limit values of the two-tailed Dixon's Q test.

Number of values   3      4      5      6      7      8      9      10
Q90%               0.941  0.765  0.642  0.560  0.507  0.468  0.437  0.412
Q95%               0.970  0.829  0.710  0.625  0.568  0.526  0.493  0.466
Q99%               0.994  0.926  0.821  0.740  0.680  0.634  0.598  0.568
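As a rough sketch of how these limit values might be used in practice, the Python snippet below stores them in a dictionary and applies the test to whichever extreme lies farther from its neighbour. The names `Q_CRITICAL` and `q_test` are hypothetical, and the function covers only the sample sizes (3 to 10) and confidence levels listed here.

```python
# Limit values copied from the table: confidence level (%) -> sample size -> Q.
Q_CRITICAL = {
    90: {3: 0.941, 4: 0.765, 5: 0.642, 6: 0.560,
         7: 0.507, 8: 0.468, 9: 0.437, 10: 0.412},
    95: {3: 0.970, 4: 0.829, 5: 0.710, 6: 0.625,
         7: 0.568, 8: 0.526, 9: 0.493, 10: 0.466},
    99: {3: 0.994, 4: 0.926, 5: 0.821, 6: 0.740,
         7: 0.680, 8: 0.634, 9: 0.598, 10: 0.568},
}


def q_test(values, confidence=90):
    """Return (suspect, q, reject) for the more extreme end of the data."""
    n = len(values)
    if confidence not in Q_CRITICAL or n not in Q_CRITICAL[confidence]:
        raise ValueError("no tabulated limit for this confidence level and sample size")
    ordered = sorted(values)
    data_range = ordered[-1] - ordered[0]
    if data_range == 0:
        raise ValueError("all values are identical; Q is undefined")
    gap_low = ordered[1] - ordered[0]
    gap_high = ordered[-1] - ordered[-2]
    # Test the end that is farther from its nearest neighbour.
    if gap_low >= gap_high:
        suspect, gap = ordered[0], gap_low
    else:
        suspect, gap = ordered[-1], gap_high
    q = gap / data_range
    return suspect, q, q > Q_CRITICAL[confidence][n]


example = [0.189, 0.167, 0.187, 0.183, 0.186, 0.182, 0.181, 0.184, 0.181, 0.177]
for level in (90, 95, 99):
    suspect, q, reject = q_test(example, level)
    print(level, suspect, round(q, 3), reject)
```

Because the Q test should be applied at most once per data set, the function reports a single suspect value and does not iterate after a rejection.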

References

  1. Halpern, Arthur M.; McBane, George C. (2006) "Experimental Physical Chemistry: A Laboratory Textbook". 3rd ed. New York: W. H. Freeman.

Further reading

  • Robert B. Dean and Wilfrid J. Dixon (1951) "Simplified Statistics for Small Numbers of Observations". Anal. Chem., 23 (4), 636–638.
  • Rorabacher, D. B. (1991) "Statistical Treatment for Rejection of Deviant Values: Critical Values of Dixon Q Parameter and Related Subrange Ratios at the 95 percent Confidence Level". Anal. Chem., 63 (2), 139–146. (Includes larger tables of limit values.)
  • McBane, George C. (2006) "Programs to Compute Distribution Functions and Critical Values for Extreme Value Ratios for Outlier Detection". J. Statistical Software, 16 (3), 1–9. (Article and Fortran-90 software.)
  • Shivanshu Shrivastava, A. Rajesh, P. K. Bora (2014) "Sliding window Dixon's tests for malicious users' suppression in a cooperative spectrum sensing system" IET Communications, 2014, 8 (7)
  • W. J. Dixon (1950) "Analysis of Extreme Values". The Annals of Mathematical Statistics, 21 (4), 488–506. doi:10.1214/aoms/1177729747
