From Clinical Trials to A/B Tests: How Randomization Became the Gold Standard for Truth


In 1747, a Scottish naval surgeon named James Lind conducted what is often cited as the first controlled clinical trial. Aboard the HMS Salisbury, he divided twelve sailors suffering from scurvy into six pairs and gave each pair a different treatment: cider, sulfuric acid, vinegar, seawater, oranges and lemons, or a purgative mixture. The citrus group recovered. The others didn't. It was a landmark moment in medical reasoning, but it wasn't a randomized experiment. Lind chose which sailors got which treatment, and with only twelve subjects, his own biases in assignment — conscious or not — could have influenced the outcome. If he gave the citrus to the two sailors who seemed healthiest, the result proves less than it appears.

It took almost two hundred years for medicine to solve this problem. The first properly randomized clinical trial is generally attributed to the 1948 streptomycin trial for tuberculosis, organized by the British Medical Research Council under the direction of statistician Austin Bradford Hill. Hill's key innovation was simple but radical: patients were assigned to the treatment or control group using random numbers generated from statistical tables, and the clinicians making the assignment didn't know in advance which group the next patient would enter. This eliminated selection bias — the tendency of researchers, however well-intentioned, to steer certain patients toward the treatment they believed would help them.

The logic is worth sitting with because it's counterintuitive. You might think that a knowledgeable doctor assigning patients to groups based on their judgment would produce better results than random assignment. After all, the doctor knows things about each patient that a random number doesn't. But that knowledge is precisely the problem. If the doctor unconsciously assigns sicker patients to the control group and healthier patients to the treatment group, the treatment will look more effective than it is. If the doctor does the reverse, it will look less effective. The doctor's knowledge introduces a confounding variable that makes it impossible to attribute the outcome to the treatment alone. Random assignment doesn't use knowledge of the patients, and that's what makes it powerful — it ensures that every other variable, known and unknown, is distributed equally between groups on average, so that any observed difference can be attributed to the treatment with confidence.
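The balancing claim is easy to check with a quick simulation. The sketch below (in Python, with invented "baseline severity" scores) gives every patient a hidden severity that the coin flip never sees; averaged over many simulated trials, the two groups nonetheless end up with almost identical severity mixes.

```python
import random
import statistics

random.seed(42)

def random_trial(n_patients=1000):
    """Randomly split patients with a hidden severity score into two groups.

    Returns the mean severity of each group. The assignment rule is a pure
    coin flip: it never inspects the severity it is balancing.
    """
    # Hypothetical severity scores, mean 50, standard deviation 10.
    severities = [random.gauss(50, 10) for _ in range(n_patients)]
    treatment, control = [], []
    for s in severities:
        (treatment if random.random() < 0.5 else control).append(s)
    return statistics.mean(treatment), statistics.mean(control)

# Average the between-group severity gap over many simulated trials.
gaps = [abs(t - c) for t, c in (random_trial() for _ in range(200))]
print(f"mean severity gap between groups: {statistics.mean(gaps):.2f}")
```

With individual scores spread across tens of points, the typical gap between group averages comes out well under a single point, which is the sense in which randomization balances variables "on average", including ones no one thought to measure.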

This principle — that deliberate ignorance in assignment produces more reliable knowledge than informed judgment — is one of the most important insights in the history of science. It underpins every drug approval, every vaccine trial, and every evidence-based medical guideline in use today. When public health authorities say a treatment "works," they mean it outperformed a control in a randomized trial, and the randomization is what gives that claim its authority.

The tech industry imported the same framework under a different name: the A/B test. When a software company wants to know whether a new button color, page layout, or pricing structure will improve conversion rates, it randomly assigns users to see either the current version (A) or the new version (B) and measures the difference in outcomes. The randomization ensures that the two groups are comparable — same mix of device types, geographies, times of day, and user demographics — so that any difference in behavior can be attributed to the change rather than to a confounding variable. It's the streptomycin trial applied to button colors, and the statistical logic is identical.
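That logic can be sketched in a few lines of Python. The conversion rates and sample size below are invented for illustration: each simulated user is randomly assigned a variant, and a standard two-proportion z-test asks whether the observed gap is larger than chance alone would plausibly produce.

```python
import math
import random

random.seed(0)

def simulate_user(rate):
    """Return 1 if a simulated user converts, 0 otherwise."""
    return 1 if random.random() < rate else 0

# Hypothetical experiment: variant A truly converts at 10%, variant B at 12%.
n = 5000  # users randomly assigned to each arm
conv_a = sum(simulate_user(0.10) for _ in range(n))
conv_b = sum(simulate_user(0.12) for _ in range(n))

# Two-proportion z-test: compare the gap to its expected sampling noise.
p_a, p_b = conv_a / n, conv_b / n
p_pool = (conv_a + conv_b) / (2 * n)
se = math.sqrt(p_pool * (1 - p_pool) * (2 / n))
z = (p_b - p_a) / se
# Two-sided p-value from the standard normal CDF (via math.erf).
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(f"A: {p_a:.3f}  B: {p_b:.3f}  z = {z:.2f}  p = {p_value:.4f}")
```

The test statistic is the same one a 1948-era trialist could have computed by hand; only the subject matter changed. A small p-value says the difference is unlikely to be a random-assignment fluke, and the randomization is what licenses reading it as the effect of variant B itself.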

What's remarkable is how recently this methodology became standard practice. For most of human history, people evaluated treatments, policies, and decisions by expert opinion, anecdote, or before-and-after comparison — all of which are vulnerable to the biases that randomization eliminates. The idea that you could learn more by introducing chance into your investigation than by applying all available knowledge was genuinely revolutionary, and it remains one of the strongest arguments for taking randomness seriously as a tool rather than dismissing it as the absence of information.

The applications extend beyond medicine and software. Randomized evaluation has transformed development economics, where researchers randomly assign villages or households to receive an intervention (microloans, textbooks, mosquito nets) and compare outcomes to a control group. It's reshaped criminal justice research, education policy, and behavioral science. In every field where it's been adopted, randomized experimentation has overturned confident expert beliefs — treatments that "obviously" worked turned out to be ineffective, interventions that seemed unlikely turned out to be transformative — because the randomization stripped away the biases that had been propping up the old conclusions.

Randomness, it turns out, isn't just a party trick or a decision-making shortcut. It's one of the most powerful tools humans have ever developed for figuring out what's actually true.
