Completely randomized design

In the design of experiments, completely randomized designs are for studying the effects of one primary factor without the need to take other nuisance variables into account. This article describes completely randomized designs that have one primary factor. The experiment compares the values of a response variable based on the different levels of that primary factor. For completely randomized designs, the levels of the primary factor are randomly assigned to the experimental units.

Randomization

To randomize is to determine the run sequence of the experimental units randomly. For example, if there are 3 levels of the primary factor with each level to be run 2 times, then there are 6! (where ! denotes factorial) possible run sequences (or ways to order the experimental trials). Because of the replication, the number of unique orderings is 90 (since 90 = 6!/(2!*2!*2!)). An example of an unrandomized design would be to always run 2 replications for the first level, then 2 for the second level, and finally 2 for the third level. To randomize the runs, one way would be to put 6 slips of paper in a box with 2 having level 1, 2 having level 2, and 2 having level 3. Before each run, one of the slips would be drawn blindly from the box and the level selected would be used for the next run of the experiment.

In practice, the randomization is typically performed by a computer program. However, the randomization can also be generated from random number tables or by some physical mechanism (e.g., drawing the slips of paper).

Three key numbers

All completely randomized designs with one primary factor are defined by 3 numbers:

k = number of factors (= 1 for these designs)
L = number of levels
n = number of replications

and the total sample size (number of runs) is N = k × L × n. Balance dictates that the number of replications be the same at each level of the factor (this will maximize the sensitivity of subsequent statistical t- (or F-) tests).

Example

A typical example of a completely randomized design is the following:

k = 1 factor (X₁)
L = 4 levels of that single factor (called "1", "2", "3", and "4")
n = 3 replications per level
N = 4 levels × 3 replications per level = 12 runs

Sample randomized sequence of trials

The randomized sequence of trials might look like: X₁: 3, 1, 4, 2, 2, 1, 3, 4, 1, 2, 4, 3

Note that in this example there are 12!/(3!*3!*3!*3!) = 369,600 ways to run the experiment, all equally likely to be picked by a randomization procedure.

Model for a completely randomized design

The model for the response is $Y_{i,j}=\mu +T_{i}+\mathrm {random\ error}$

with

Y_i,j being any observation for which X₁ = i (i and j denote the level of the factor and the replication within the level of the factor, respectively)
μ (or mu) is the general location parameter
T_i is the effect of having treatment level i

Estimates and statistical tests

Estimating and testing model factor levels

Estimate for μ : ${\bar {Y}}$ = the average of all the data
Estimate for T_i : ${\bar {Y}}_{i}-{\bar {Y}}$

with ${\bar {Y}}_{i}$ = average of all Y for which X₁ = i.

Statistical tests for levels of X₁ are those used for a one-way ANOVA and are detailed in the article on analysis of variance.

Bibliography

Caliński, Tadeusz; Kageyama, Sanpei (2000). Block designs: A Randomization approach, Volume I: Analysis. Lecture Notes in Statistics. Vol. 150. New York: Springer-Verlag. ISBN 0-387-98578-6.

Christensen, Ronald (2002). Plane Answers to Complex Questions: The Theory of Linear Models (Third ed.). New York: Springer. ISBN 0-387-95361-2.

Kempthorne, Oscar (1979). The Design and Analysis of Experiments (Corrected reprint of (1952) Wiley ed.). Robert E. Krieger. ISBN 0-88275-105-0.

Hinkelmann, Klaus and Kempthorne, Oscar (2008). Design and Analysis of Experiments. Vol. I and II (Second ed.). Wiley. ISBN 978-0-470-38551-7.{{cite book}}: CS1 maint: multiple names: authors list (link)
- Hinkelmann, Klaus and Kempthorne, Oscar (2008). Design and Analysis of Experiments, Volume I: Introduction to Experimental Design (Second ed.). Wiley. ISBN 978-0-471-72756-9.{{cite book}}: CS1 maint: multiple names: authors list (link)
- Hinkelmann, Klaus and Kempthorne, Oscar (2005). Design and Analysis of Experiments, Volume 2: Advanced Experimental Design (First ed.). Wiley. ISBN 978-0-471-55177-5.{{cite book}}: CS1 maint: multiple names: authors list (link)

External links

v t e Design of experiments
Scientific method	Scientific experiment Statistical design Control Internal and external validity Experimental unit Blinding Optimal design: Bayesian Random assignment Randomization Restricted randomization Replication versus subsampling Sample size
Treatment and blocking	Treatment Effect size Contrast Interaction Confounding Orthogonality Blocking Covariate Nuisance variable
Models and inference	Linear regression Ordinary least squares Bayesian Random effect Mixed model Hierarchical model: Bayesian Analysis of variance (Anova) Cochran's theorem Manova (multivariate) Ancova (covariance) Compare means Multiple comparison
Designs Completely randomized	Factorial Fractional factorial Plackett–Burman Taguchi Response surface methodology Polynomial and rational modeling Box–Behnken Central composite Block Generalized randomized block design (GRBD) Latin square Graeco-Latin square Orthogonal array Latin hypercube Repeated measures design Crossover study Randomized controlled trial Sequential analysis Sequential probability ratio test
Glossary Category Mathematics portal Statistical outline Statistical topics

Statistics

Descriptive statistics

Continuous data

Center	Mean Arithmetic Arithmetic-Geometric Contraharmonic Cubic Generalized/power Geometric Harmonic Heronian Heinz Lehmer Median Mode
Dispersion	Average absolute deviation Coefficient of variation Interquartile range Percentile Range Standard deviation Variance
Shape	Central limit theorem Moments Kurtosis L-moments Skewness

Count data

Index of dispersion

Summary tables

Dependence

Graphics

Data collection

Study design	Effect size Missing data Optimal design Population Replication Sample size determination Statistic Statistical power
Survey methodology	Sampling Cluster Stratified Opinion poll Questionnaire Standard error
Controlled experiments	Blocking Factorial experiment Interaction Random assignment Randomized controlled trial Randomized experiment Scientific control
Adaptive designs	Adaptive clinical trial Stochastic approximation Up-and-down designs
Observational studies	Cohort study Cross-sectional study Natural experiment Quasi-experiment

Statistical inference

Statistical theory

Frequentist inference

Point estimation	Estimating equations Maximum likelihood Method of moments M-estimator Minimum distance Unbiased estimators Mean-unbiased minimum-variance Rao–Blackwellization Lehmann–Scheffé theorem Median unbiased Plug-in
Interval estimation	Confidence interval Pivot Likelihood interval Prediction interval Tolerance interval Resampling Bootstrap Jackknife
Testing hypotheses	1- & 2-tails Power Uniformly most powerful test Permutation test Randomization test Multiple comparisons
Parametric tests	Likelihood-ratio Score/Lagrange multiplier Wald

Specific tests

Z-test (normal) Student's t-test F-test
Goodness of fit	Chi-squared G-test Kolmogorov–Smirnov Anderson–Darling Lilliefors Jarque–Bera Normality (Shapiro–Wilk) Likelihood-ratio test Model selection Cross validation AIC BIC
Rank statistics	Sign Sample median Signed rank (Wilcoxon) Hodges–Lehmann estimator Rank sum (Mann–Whitney) Nonparametric anova 1-way (Kruskal–Wallis) 2-way (Friedman) Ordered alternative (Jonckheere–Terpstra) Van der Waerden test

Bayesian inference

Correlation	Pearson product-moment Partial correlation Confounding variable Coefficient of determination
Regression analysis	Errors and residuals Regression validation Mixed effects models Simultaneous equations models Multivariate adaptive regression splines (MARS)
Linear regression	Simple linear regression Ordinary least squares General linear model Bayesian regression
Non-standard predictors	Nonlinear regression Nonparametric Semiparametric Isotonic Robust Homoscedasticity and Heteroscedasticity
Generalized linear model	Exponential families Logistic (Bernoulli) / Binomial / Poisson regressions
Partition of variance	Analysis of variance (ANOVA, anova) Analysis of covariance Multivariate ANOVA Degrees of freedom

Categorical / Multivariate / Time-series / Survival analysis

Categorical

Multivariate

Time-series

General	Decomposition Trend Stationarity Seasonal adjustment Exponential smoothing Cointegration Structural break Granger causality
Specific tests	Dickey–Fuller Johansen Q-statistic (Ljung–Box) Durbin–Watson Breusch–Godfrey
Time domain	Autocorrelation (ACF) partial (PACF) Cross-correlation (XCF) ARMA model ARIMA model (Box–Jenkins) Autoregressive conditional heteroskedasticity (ARCH) Vector autoregression (VAR)
Frequency domain	Spectral density estimation Fourier analysis Least-squares spectral analysis Wavelet Whittle likelihood

Survival

Survival function	Kaplan–Meier estimator (product limit) Proportional hazards models Accelerated failure time (AFT) model First hitting time
Hazard function	Nelson–Aalen estimator
Test	Log-rank test

Applications

Biostatistics	Bioinformatics Clinical trials / studies Epidemiology Medical statistics
Engineering statistics	Chemometrics Methods engineering Probabilistic design Process / quality control Reliability System identification
Social statistics	Actuarial science Census Crime statistics Demography Econometrics Jurimetrics National accounts Official statistics Population statistics Psychometrics
Spatial statistics	Cartography Environmental statistics Geographic information system Geostatistics Kriging