Primer - How to Run Experiments

April 8, 2024

A/B Test Planning Document - How to Run Experiments

As product folks, we often hear "let's run an A/B test" thrown around casually, without much thought given to how the test actually needs to be run. Without clarity on how to conduct an experiment, we run into issues and failed attempts, and we waste precious hours of our designers and tech partners. Successful A/B testing requires methodical planning, clear communication, and rigorous execution. In this primer, I'll list the essentials that ensure your experiments deliver meaningful insights and drive product success.

1. Establishing Clear Goals and Purpose

Before diving into any A/B test, ask yourself: What specific metric are we trying to move? Your goal should align with key business metrics like:

  • AARRR funnel optimization (acquisition, activation, retention, referral, revenue)
  • Cart conversion rates
  • Free-to-paid conversion rates
  • User engagement metrics

Pro tip: If you can't articulate the goal in a single sentence, you're not ready to run the test.

Note - in B2B contexts, the metrics above will differ depending on the business model and customer base.

2. Crafting a Strong Hypothesis

Your hypothesis should follow this structure: "We believe that [change] will result in [outcome] because [reasoning]."

Example hypotheses:

  • "Displaying discounted prices before payment options will increase conversion rates because it reduces friction in the decision-making process."
  • "Implementing auto-renewal for paid plans will increase subscription retention because it removes the manual renewal barrier."

3. Understanding Your User Segments

This critical step often gets overlooked. You need to:

  • Identify homogeneous user groups
  • Analyze behavioral patterns
  • Ensure statistical significance

Key consideration: If your homogeneous sample set is smaller than 100 users, validate whether these users drive a significant impact on your target metrics. If they do, proceed with the test. If not, refine your user segments until you have cohorts large enough to support statistically significant results.
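
For a quick back-of-the-envelope check on "large enough", here's a minimal Python sketch of the standard sample-size estimate for comparing two conversion rates. The baseline rate, target rate, and thresholds are hypothetical placeholders - plug in your own.

    # Minimal sketch: estimate users needed per variant to detect a lift
    # in a conversion rate. All numbers below are hypothetical placeholders.
    from scipy.stats import norm

    baseline = 0.10            # current conversion rate (assumed)
    target = 0.12              # smallest rate worth detecting (assumed)
    alpha, power = 0.05, 0.80  # 5% false-positive rate, 80% power

    z_alpha = norm.ppf(1 - alpha / 2)  # two-sided critical value
    z_beta = norm.ppf(power)
    variance = baseline * (1 - baseline) + target * (1 - target)
    n_per_variant = (z_alpha + z_beta) ** 2 * variance / (target - baseline) ** 2

    print(f"Users needed per variant: {n_per_variant:.0f}")  # roughly 3,800 here

If a cohort can't realistically reach a number like this, that's a signal to merge segments or accept a longer runtime rather than declare a winner early.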

Further reading - A refresher on statistical significance

4. Setting the Experimental Timeline

Every A/B test needs a clear timeline. Consider:

  • Minimum duration needed for statistical significance
  • Seasonal factors that might affect results
  • Business cycles and peak periods
  • Time needed for user behavior patterns to emerge

Best practice: Most tests should run for 2-4 weeks, but this can vary based on your user volume and conversion cycles.
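
To sanity-check the 2-4 week guideline against your own traffic, here is a minimal sketch for converting a required sample size into a runtime. All figures are hypothetical placeholders.

    # Minimal sketch: translate a required sample size into a test duration.
    # Traffic and sample-size figures are hypothetical placeholders.
    import math

    users_needed_per_variant = 3800  # e.g. from a sample-size estimate
    num_variants = 2                 # control + one treatment
    eligible_users_per_day = 400     # daily traffic entering the experiment (assumed)

    days_needed = math.ceil(users_needed_per_variant * num_variants / eligible_users_per_day)
    weeks_needed = math.ceil(days_needed / 7)  # round up to whole weeks so weekday
                                               # and weekend behaviour are sampled evenly
    print(f"Run for at least {weeks_needed} week(s) ({days_needed} days of eligible traffic)")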

5. Risk Assessment and Mitigation

Document your riskiest assumptions and establish guardrails:

  • Define acceptable performance ranges
  • Set clear abort criteria
  • Plan mitigation strategies

Example: If your current subscription growth is 20%, you might set:

  • Minimum acceptable performance: 15% (a 5 percentage-point dip)
  • Abort criteria: Below 15% for three consecutive days
  • Mitigation plan: Ready-to-deploy rollback strategy
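
The abort criterion above is easy to automate. A minimal sketch, assuming you can pull a daily series of the guardrail metric (the numbers here are made up):

    # Minimal sketch: check the abort criterion from the example above --
    # three consecutive days below the 15% growth floor.
    GUARDRAIL_FLOOR = 0.15   # minimum acceptable subscription growth
    CONSECUTIVE_DAYS = 3     # abort after this many days below the floor

    def should_abort(daily_growth: list[float]) -> bool:
        """True if the last CONSECUTIVE_DAYS observations all sit below the floor."""
        recent = daily_growth[-CONSECUTIVE_DAYS:]
        return len(recent) == CONSECUTIVE_DAYS and all(g < GUARDRAIL_FLOOR for g in recent)

    daily_growth = [0.20, 0.18, 0.14, 0.13, 0.12]  # hypothetical observations
    if should_abort(daily_growth):
        print("Abort criterion met: trigger the rollback plan")

Wiring a check like this into your alerting keeps the rollback decision mechanical instead of emotional.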

6. Release Strategy

Your rollout plan should address:

  • Phasing strategy (percentage of users, geographic regions, etc.)
  • Timing considerations (peak vs. off-peak hours)
  • User selection criteria
  • Technical implementation approach

Key question: What's the smallest subset of users that will give you statistically significant results?
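
On the implementation side, one common approach (not the only one) is to assign users deterministically by hashing a stable user ID, so you can ramp the exposed percentage without reshuffling who sees what. A minimal sketch; the experiment name and user ID are illustrative:

    # Minimal sketch: deterministic assignment for a phased rollout.
    # Hashing a stable user ID keeps each user in the same bucket as the
    # exposure percentage ramps up. Names below are illustrative.
    import hashlib

    def bucket(user_id: str, experiment: str) -> float:
        """Map a user to a stable value in [0, 1) for this experiment."""
        digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
        return int(digest[:8], 16) / 2**32

    def assign(user_id: str, experiment: str, rollout_pct: float) -> str:
        """Expose only rollout_pct of users; split the exposed group 50/50."""
        b = bucket(user_id, experiment)
        if b >= rollout_pct:
            return "not_in_experiment"
        return "treatment" if b < rollout_pct / 2 else "control"

    print(assign("user_42", "discounted_price_first", rollout_pct=0.10))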

7. Cross-functional Dependencies

Document and align on:

  • Technical requirements
  • Design dependencies
  • Customer support preparation
  • Legal/compliance considerations
  • Marketing coordination needs

8. RACI and Communication Protocol

Create a clear responsibility matrix:

  • Responsible: Who's executing the test?
  • Accountable: Who owns the results?
  • Consulted: Which stakeholders need to provide input?
  • Informed: Who needs to know about progress?

9. Measurement Infrastructure

Before launching:

  • Set up dedicated dashboards
  • Prepare SQL queries for data analysis
  • Configure tracking parameters
  • Test your measurement tools
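
Whether you do this in SQL or a notebook, the point is to have the analysis ready before launch, not after. A minimal pandas sketch, assuming an event export with user_id, variant and converted columns (the column names and sample data are assumptions):

    # Minimal sketch: per-variant conversion summary from an event export.
    # Column names and the sample data are assumptions.
    import pandas as pd

    events = pd.DataFrame({
        "user_id":   ["u1", "u2", "u3", "u4", "u5", "u6"],
        "variant":   ["control", "control", "control",
                      "treatment", "treatment", "treatment"],
        "converted": [0, 1, 0, 1, 1, 0],
    })

    summary = (events.groupby("variant")["converted"]
                     .agg(users="count", conversions="sum", rate="mean"))
    print(summary)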

10. Launch and Monitoring

During the test:

  • Daily monitoring schedule
  • Stakeholder update cadence
  • Alert thresholds
  • Emergency response protocol

11. Post-Test Analysis and Next Steps

Document:

  • Test results with statistical significance
  • Learnings and insights
  • Unexpected observations
  • Recommendations for next steps
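
For the "results with statistical significance" part, a minimal sketch using a two-proportion z-test; the counts are hypothetical - substitute your own exposure and conversion numbers:

    # Minimal sketch: significance check on final results with a
    # two-proportion z-test. All counts below are hypothetical.
    from statsmodels.stats.proportion import proportions_ztest

    conversions = [130, 162]  # control, treatment conversions (assumed)
    exposures = [1500, 1480]  # users exposed to each variant (assumed)

    z_stat, p_value = proportions_ztest(count=conversions, nobs=exposures)
    lift = conversions[1] / exposures[1] - conversions[0] / exposures[0]
    print(f"Absolute lift: {lift:.2%}, p-value: {p_value:.3f}")
    if p_value < 0.05:
        print("Difference is statistically significant at the 5% level")

Report the lift and the p-value together - a "significant" result on a negligible lift is rarely worth rolling out.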

If successful:

  • Plan for broader rollout
  • Document scalability considerations
  • Prepare extended implementation timeline

If unsuccessful:

  • Document learnings
  • Identify potential iterations
  • Update hypothesis for future tests

Conclusion

There is no one way to go about experimentation. It varies across domain, regulation, industry, company size, scope of the outcome, culture and a few other factors. Irrespective of these variables, most of us don't drill down into the basics, which leads to a diluted experience and insignificant outcomes. This post aims to aid structured thinking and help you gain clarity before execution, especially through points 1-9. While there is plenty of ambiguity in product work, let's try to minimize the part that's within our control.