Randomization in Experiments: How and Why to Randomize
Before R.A. Fisher formalized randomization in the 1920s, experimenters assigned treatments based on convenience, alternation, or personal judgment. A physician might assign the first patient to the drug group and the next to the control, or assign healthier-looking patients to the experimental treatment. These methods introduce systematic bias because the person making assignments can, consciously or not, steer certain types of participants toward certain groups. Randomization removes this human judgment entirely.
Choose a Randomization Method
Simple randomization is the most basic approach: each participant is independently assigned to a group with a fixed probability, typically 50/50 for a two-group study. It is equivalent to flipping a fair coin for each participant. Simple randomization guarantees unbiased assignment, but it can produce unequal group sizes by chance, especially in small studies. In a study of 20 participants, simple randomization might produce groups of 7 and 13 rather than the intended 10 and 10.
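The coin-flip procedure and the risk of unequal groups can be sketched in Python. This is a minimal illustration, not a validated trial tool; the function names `simple_randomize` and `prob_smaller_group_at_most` and the arm labels "A" and "B" are assumptions made for this example.

```python
import random
from math import comb


def simple_randomize(n_participants, p_treatment=0.5, seed=None):
    """Independently assign each participant to arm 'A' or 'B'.

    Hypothetical helper: each assignment is a separate draw, like a coin flip.
    """
    rng = random.Random(seed)
    return ["A" if rng.random() < p_treatment else "B"
            for _ in range(n_participants)]


def prob_smaller_group_at_most(n_participants, max_smaller):
    """Exact probability, under 50/50 simple randomization, that the smaller
    of the two groups ends up with at most `max_smaller` participants."""
    favourable = sum(
        comb(n_participants, k)
        for k in range(n_participants + 1)
        if min(k, n_participants - k) <= max_smaller
    )
    return favourable / 2 ** n_participants


assignments = simple_randomize(20, seed=2024)
print(assignments.count("A"), assignments.count("B"))

# Chance of a split of 7/13 or more lopsided with 20 participants
print(round(prob_smaller_group_at_most(20, 7), 3))  # 0.263
```

Under these assumptions, a 7/13 split or worse occurs in roughly a quarter of 20-participant studies, which is why small trials often prefer a method that enforces balance.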
Block randomization solves the balance problem by dividing participants into blocks of a fixed size and randomizing within each block. In a two-group study with a block size of 4, each block contains exactly 2 assignments to Group A and 2 to Group B, in a random order. This guarantees that the groups are perfectly balanced at the end of every block, that is, after every fourth participant. Because the last assignment in a block can be deduced from the earlier ones, block sizes should be varied at random (e.g., drawing each block's size at random from 4 and 6) and kept from enrolling staff, so that investigators cannot predict upcoming assignments near the end of a block.
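A permuted-block sequence with randomly chosen block sizes might look like the sketch below. The function name `block_randomize`, its parameters, and the arm labels are assumptions for illustration only.

```python
import random


def block_randomize(n_participants, block_sizes=(4, 6), arms=("A", "B"), seed=None):
    """Build an allocation sequence from permuted blocks.

    Hypothetical helper: each block's size is drawn at random from
    `block_sizes`, and each block holds every arm equally often.
    """
    if any(size % len(arms) != 0 for size in block_sizes):
        raise ValueError("every block size must be a multiple of the number of arms")
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_participants:
        size = rng.choice(block_sizes)            # random, not alternating, block size
        block = list(arms) * (size // len(arms))  # e.g. ['A', 'B', 'A', 'B']
        rng.shuffle(block)                        # random order within the block
        sequence.extend(block)
    # Truncating the final block can leave a small imbalance at the very end.
    return sequence[:n_participants]


sequence = block_randomize(20, seed=7)
print("".join(sequence), sequence.count("A"), sequence.count("B"))
```

With two arms and a largest block of 6, the groups can never differ by more than 3 participants at any point in the sequence.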
Stratified randomization first separates participants into subgroups based on important prognostic factors, then randomizes within each subgroup. If age and disease severity are known to affect the outcome, participants are stratified by these variables (e.g., young/mild, young/severe, old/mild, old/severe) and block randomization is applied within each stratum. This ensures balanced representation of important characteristics across treatment groups, improving the precision of the treatment effect estimate.
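Stratified block randomization can be sketched as one independent permuted-block stream per stratum. The record layout (dictionaries with `id`, `age`, and `severity` keys) and the function name `stratified_block_randomize` are assumptions for this example.

```python
import random
from collections import defaultdict


def stratified_block_randomize(participants, strata_keys, block_size=4, seed=None):
    """Assign arms 'A'/'B' using a separate permuted-block stream per stratum.

    Hypothetical helper: `participants` is a list of dicts in enrollment order;
    `strata_keys` names the prognostic factors that define the strata.
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for two arms")
    rng = random.Random(seed)
    pending = defaultdict(list)  # unused assignments left in each stratum's current block
    assignments = {}
    for person in participants:
        stratum = tuple(person[key] for key in strata_keys)
        if not pending[stratum]:
            block = ["A", "B"] * (block_size // 2)
            rng.shuffle(block)
            pending[stratum] = block
        assignments[person["id"]] = pending[stratum].pop()
    return assignments


participants = [
    {"id": 1, "age": "young", "severity": "mild"},
    {"id": 2, "age": "old", "severity": "severe"},
    {"id": 3, "age": "young", "severity": "mild"},
    {"id": 4, "age": "old", "severity": "severe"},
]
print(stratified_block_randomize(participants, ("age", "severity"), seed=11))
```

Each additional stratification factor multiplies the number of strata, so with few participants many blocks stay incomplete and the balance guarantee weakens.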
Cluster randomization assigns intact groups rather than individuals to treatments. Entire schools, hospitals, or communities are randomized as units. This method is necessary when individual randomization would cause contamination between groups, for example, when testing a school-wide curriculum change where individual students in the same classroom cannot receive different curricula. Cluster randomization requires larger total sample sizes because the outcomes of individuals within the same cluster tend to be correlated, which reduces the effective sample size.
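The sample-size penalty is commonly expressed through the design effect, 1 + (m - 1) × ICC, where m is the cluster size and ICC is the intracluster correlation coefficient. The sketch below assumes equal cluster sizes; the function names and the example values (20 schools of 30 students, ICC of 0.05) are illustrative assumptions, not figures from any study.

```python
import random


def cluster_randomize(cluster_ids, seed=None):
    """Assign whole clusters to arm 'A' or 'B' in equal numbers.

    Hypothetical helper: with an odd number of clusters, arm 'B' gets one more.
    """
    rng = random.Random(seed)
    shuffled = list(cluster_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {cid: ("A" if i < half else "B") for i, cid in enumerate(shuffled)}


def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent participants the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)


schools = [f"school_{i}" for i in range(1, 21)]
allocation = cluster_randomize(schools, seed=5)
print(sum(arm == "A" for arm in allocation.values()), "schools in arm A")

# Illustrative values only: 20 schools x 30 students, ICC = 0.05
print(round(design_effect(30, 0.05), 2))               # 2.45
print(round(effective_sample_size(600, 30, 0.05)))     # 245
```

In this illustration, 600 clustered participants carry about as much information as 245 independently randomized ones.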
Generate the Allocation Sequence
The allocation sequence must be generated by a verifiable random process. Computer-generated random numbers are the standard method. Most statistical software packages (R, SAS, Stata, SPSS) have built-in functions for generating randomization lists. Online randomization services like Sealed Envelope and Randomization.com produce allocation sequences and maintain audit trails for regulatory compliance.
Pseudorandom number generators used by computers are technically deterministic, but their output is unpredictable enough for experimental purposes as long as the seed is kept from everyone involved in enrollment. For regulatory or high-stakes trials, true random number generators that use physical processes (atmospheric noise, radioactive decay) provide genuinely unpredictable sequences. In practice, the distinction rarely matters for the validity of the randomization.
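The deterministic nature of a seeded generator can be seen in a short sketch: the same seed always reproduces the same list, which makes the sequence verifiable later but also means anyone holding the seed can predict it. Drawing the seed from the operating system's entropy source via `secrets.randbits` is an assumption of this example (it is not the same as an atmospheric-noise or radioactive-decay device), and `generate_sequence` is a hypothetical helper.

```python
import random
import secrets


def generate_sequence(n_participants, seed):
    """Generate an 'A'/'B' allocation list from a seeded pseudorandom generator."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_participants)]


# Assumption: the seed is drawn from OS entropy and stored where
# enrollment staff cannot see it.
seed = secrets.randbits(64)
sequence = generate_sequence(12, seed)

# Re-running with the same seed reproduces the list exactly, so an
# independent party holding the seed can verify the sequence afterwards.
assert sequence == generate_sequence(12, seed)
print("".join(sequence))
```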
Never use alternation (ABAB...), birth dates, medical record numbers, day of the week, or any other predictable pattern as a substitute for randomization. These methods are transparent to anyone involved in enrollment, allowing conscious or unconscious manipulation of which participants receive which treatment. Even well-meaning investigators can be influenced by knowing the next assignment when deciding whether a borderline-eligible patient should be enrolled.
Conceal the Allocation
Allocation concealment means that the person enrolling a participant does not know in advance which group that participant will be assigned to. This is separate from blinding: allocation concealment keeps the upcoming assignment hidden until the participant has been enrolled, while blinding keeps the assignment hidden from participants, clinicians, or outcome assessors after it has been made. Both are important, and failing at either one can bias the results.
Effective concealment methods include sealed opaque envelopes (each containing a group assignment, opened only after enrollment is finalized), central telephone randomization (where an independent call center reveals the assignment after the participant is registered), and web-based randomization systems that release assignments only after baseline data are entered. The key requirement is that the assignment is irrevocably linked to a specific participant before it is revealed, preventing investigators from changing their enrollment decision based on the assignment.
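The core rule of a web-based system, release the assignment only after registration and lock it to that participant, can be sketched as follows. The class `ConcealedAllocator` and its method names are hypothetical; a real system would also need authentication, persistent storage, and an audit trail.

```python
class ConcealedAllocator:
    """Toy model of a central randomization service (illustrative only)."""

    def __init__(self, sequence):
        self._sequence = list(sequence)  # pre-generated list, hidden from enrolling staff
        self._next_index = 0
        self._log = {}                   # participant_id -> assignment, never overwritten

    def enroll(self, participant_id, baseline_complete):
        """Reveal an assignment only once baseline data have been entered."""
        if not baseline_complete:
            raise ValueError("baseline data must be entered before allocation")
        if participant_id in self._log:
            # Already allocated: return the locked assignment, never draw again.
            return self._log[participant_id]
        if self._next_index >= len(self._sequence):
            raise RuntimeError("allocation sequence exhausted")
        assignment = self._sequence[self._next_index]
        self._next_index += 1
        self._log[participant_id] = assignment
        return assignment


allocator = ConcealedAllocator(["A", "B", "B", "A"])
print(allocator.enroll("P001", baseline_complete=True))  # A
print(allocator.enroll("P001", baseline_complete=True))  # A (same locked assignment)
```

Because a repeated request returns the logged assignment, an investigator cannot "try again" for a different group after seeing the result.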
Studies with inadequate allocation concealment consistently produce larger treatment effect estimates than properly concealed studies. A Cochrane review found that inadequately concealed trials exaggerated effects by an average of 18 percent compared to trials with proper concealment. This bias arises because investigators, even unconsciously, tend to enroll more favorable participants into the active treatment group when they can predict the assignment.
Verify Balance After Randomization
After randomization is complete, check that key baseline characteristics are distributed approximately equally across groups. Create a baseline characteristics table (often called Table 1 in published papers) comparing the groups on age, sex, disease severity, and other relevant variables. Large imbalances suggest a problem with the randomization process, although small imbalances are expected by chance.
Do not use statistical tests to compare baseline characteristics between randomized groups. If randomization was carried out properly, any differences are by definition due to chance, so testing whether they are "statistically significant" answers a question whose answer is already known. Instead, visually inspect the distributions and judge whether any imbalances are large enough to affect the interpretation of results. If a meaningful imbalance exists in a variable that influences the outcome, it can be addressed in the analysis using covariate adjustment.
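One descriptive summary that does not rely on a significance test is the standardized mean difference, offered here as a supplement to inspecting the distributions rather than as the method this article prescribes. The function name and the ages below are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """Difference in means divided by the pooled standard deviation.

    Descriptive only: it measures the size of an imbalance and, unlike a
    p-value, does not depend on sample size. Needs >= 2 values per group.
    """
    pooled_sd = sqrt((variance(group_a) + variance(group_b)) / 2)
    if pooled_sd == 0:
        raise ValueError("both groups have zero variance")
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Made-up ages for two small groups
ages_a = [54, 61, 47, 58, 66, 52]
ages_b = [57, 63, 49, 60, 70, 55]
print(round(standardized_mean_difference(ages_a, ages_b), 2))
```

A commonly cited rule of thumb treats absolute values above roughly 0.1 as worth a closer look, but the judgment should rest on how strongly the variable is related to the outcome.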
Common Randomization Mistakes
Quasi-random methods like alternation or assignment by date are not true randomization and do not protect against selection bias. Any system where the next assignment can be predicted is vulnerable to manipulation. Researchers using alternation have been caught holding difficult patients until a control slot comes up, or rushing easy patients through enrollment to fill the treatment group.
Post-randomization exclusion (removing participants from the analysis after randomization because they did not comply with the treatment or had unexpected baseline characteristics) undermines the protection that randomization provides. The intention-to-treat principle requires that all randomized participants be included in the primary analysis, in the group to which they were assigned, regardless of what happened after randomization. Excluding non-compliant participants reintroduces the selection bias that randomization was designed to eliminate.
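The difference between analyzing everyone as randomized and analyzing only compliant participants can be shown with a toy data set. The records, field names, and the helper `mean_difference` are invented for illustration and do not come from any real trial.

```python
from statistics import mean


def mean_difference(records, per_protocol=False):
    """Mean outcome in arm 'A' minus mean outcome in arm 'B'.

    With per_protocol=False every randomized participant is analyzed in
    the assigned arm (intention-to-treat). With per_protocol=True,
    non-compliant participants are dropped, which breaks the randomization.
    """
    if per_protocol:
        records = [r for r in records if r["complied"]]
    arm_a = [r["outcome"] for r in records if r["arm"] == "A"]
    arm_b = [r["outcome"] for r in records if r["arm"] == "B"]
    return mean(arm_a) - mean(arm_b)


# Invented toy data: 1 = good outcome, 0 = poor outcome
records = [
    {"arm": "A", "complied": True, "outcome": 1},
    {"arm": "A", "complied": False, "outcome": 0},
    {"arm": "B", "complied": True, "outcome": 0},
    {"arm": "B", "complied": True, "outcome": 1},
]
print(mean_difference(records))                     # 0.0
print(mean_difference(records, per_protocol=True))  # 0.5
```

In this toy example, dropping the single non-compliant participant turns a null difference into an apparent benefit, which is the kind of distortion post-randomization exclusion can create.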
Re-randomizing or replacing participants who were lost to follow-up or who violated the protocol is never appropriate. Once a participant is randomized, their assignment is permanent for the purposes of the primary analysis. Replacing them with new participants maintains the sample size but does not restore the comparability that randomization created, because the reasons for loss to follow-up may be related to the treatment assignment.
True randomization requires a chance mechanism, concealed allocation, and commitment to analyzing all randomized participants regardless of what happens after assignment. No alternative method provides the same protection against bias.