Sequential Testing: The Secret Sauce for Low-Volume A/B Tests

Author: Murphy  |  Views: 21064  |  Time: 2025-03-23 11:41:45

How to accelerate decision-making and improve accuracy when dealing with limited data

Image generated by OpenAI's ChatGPT

What is A/B Testing and Why is it Hard?

A/B testing reduces uncertainty in decision-making by providing a data-driven way to determine which version of a product is more effective. The concept is simple:

  • Imagine you are at a friend's birthday party. You've been painstakingly working on perfecting your cookie recipe. You think you've perfected it, but you don't know if people will prefer the cookie with or without oats. In your opinion, oats give the cookie a nice chewy texture. However, you're not sure whether this is a widely shared opinion or just your personal preference.
  • You end up showing up to the party with two different versions of the cookie: cookie A has oats and cookie B doesn't. You randomly give half of your friends cookie A, and the other half gets cookie B.
  • You decide that the cookie that gets more "yums" is the better cookie.
  • Once everyone has tasted the cookies, you find that cookie B got more "yums" and conclude that it is the better cookie.

This process of randomly distributing cookies to party guests and monitoring their feedback is an example of an A/B test.
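The party experiment above can be sketched as a tiny simulation. The guest count, the "yum" probabilities, and the `says_yum` helper are made-up illustrative choices, not anything from the story itself:

```python
import random

random.seed(42)  # reproducible party

guests = list(range(20))                     # hypothetical party guests
random.shuffle(guests)
group_a, group_b = guests[:10], guests[10:]  # random split into two groups

# Hypothetical taste model: each guest says "yum" with some probability.
def says_yum(prob):
    return random.random() < prob

yums_a = sum(says_yum(0.6) for _ in group_a)  # cookie A (with oats)
yums_b = sum(says_yum(0.8) for _ in group_b)  # cookie B (no oats)

# Decision rule from the story: more "yums" wins.
winner = "B" if yums_b > yums_a else "A"
```

The random shuffle is what makes this an A/B test rather than a taste poll: it breaks any link between who a guest is and which cookie they receive.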

In the world of technology, A/B testing serves the same purpose. By randomly routing users to different versions of an experience, you can empirically measure the impact of each version on key performance metrics. This allows you to validate changes and iteratively optimize product offerings.

In my role as a senior Data Science manager, we most commonly use A/B testing to compare pricing models and see which leads to the most purchases. Consider two pricing strategies: one where the product is priced at $19.99, and another where it is priced at $24.99 with a 20% discount. Both strategies result in the same final price, but are customers more likely to purchase when they see a 20% discount? We can test this with A/B testing!

Traditional A/B tests typically require a certain number of samples before you can conclude that one version of the product is better than the other. In other words, traditional A/B tests need enough samples for the result to be considered statistically significant. The number of samples required is set before the experiment begins, and then you wait. This is referred to as fixed sample size A/B testing.
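To make the fixed-sample requirement concrete, here is a minimal sketch of the standard normal-approximation formula for a two-proportion test. The `fixed_sample_size` helper, the baseline rate, and the minimum detectable effect are illustrative assumptions, not values from the article:

```python
from math import ceil
from statistics import NormalDist

def fixed_sample_size(p_baseline, mde, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-proportion z-test.

    p_baseline: baseline conversion rate (e.g. 0.05 = 5%)
    mde: minimum detectable effect, as an absolute difference in rates
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)           # power requirement
    p_avg = p_baseline + mde / 2        # average rate across the two groups
    variance = 2 * p_avg * (1 - p_avg)
    return ceil(variance * (z_alpha + z_beta) ** 2 / mde ** 2)

# A 5% baseline rate and a 1-point minimum detectable effect already
# demand thousands of users per arm.
n = fixed_sample_size(p_baseline=0.05, mde=0.01)
```

Plugging in a low-volume startup's traffic against a number like this makes the "2 years to reach the desired sample size" scenario easy to reproduce.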

Fixed sample size A/B testing is problematic for two main reasons.

  1. Time Intensive: In large companies with huge volumes, you may reach your desired sample size quickly. However, if you're like me, and work in a small startup where volume isn't as large – waiting for the test to finish can be time intensive. Recently, my team designed an A/B test only to realize that it would take us 2 years to reach the desired sample size!
  2. Inflexibility: Once you've established the required sample size for your A/B test, you're locked into that decision. If external factors change, you can't easily adjust the test without compromising its validity.

What is Sequential Testing and Why is it (Maybe) Easier?

Sequential testing is a version of A/B testing that allows for continuous monitoring of data as it is collected, enabling decisions to be made earlier than in traditional fixed-sample tests. By using predefined stopping rules, you can stop the test as soon as sufficient evidence is gathered.

Sequential testing is an alternative to fixed sample size testing. It's commonly used in situations where there are:

  • Low volumes: When you have limited data coming in and need to make decisions quickly, sequential testing allows you to draw conclusions without waiting for a large sample size.
  • Cost or time constraints: If the cost or time to collect data is high, sequential testing can help reduce the number of samples needed by allowing the test to stop as soon as a clear result is observed.
  • Adaptive factors: When conditions or user behavior might change over time, sequential testing allows for more flexible decision-making and adaptation as new data is collected.

How does Sequential Testing Work?

Sequential testing relies on the Sequential Probability Ratio Test ("SPRT"). The SPRT uses a likelihood ratio to compare two competing hypotheses:

  1. Null Hypothesis (H₀): The parameter of interest (like a conversion rate) is equal to a specified value, often the status quo or baseline.
  2. Alternative Hypothesis (H₁): The parameter is equal to a different specified value, representing the effect you hope to detect.
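The stopping rule can be sketched for a Bernoulli metric like a conversion rate. This is a minimal version of Wald's SPRT; the function name, the example rates, and the error levels below are illustrative assumptions, not values from the article:

```python
from math import log

def sprt_bernoulli(data, p0, p1, alpha=0.05, beta=0.2):
    """Wald's SPRT for a Bernoulli rate: H0: p = p0 vs H1: p = p1.

    data: iterable of 0/1 outcomes (e.g. conversions), in arrival order.
    Returns ("accept H1" | "accept H0" | "continue", observations used).
    """
    upper = log((1 - beta) / alpha)  # crossing above -> reject H0
    lower = log(beta / (1 - alpha))  # crossing below -> accept H0
    llr = 0.0                        # running log-likelihood ratio

    for i, x in enumerate(data, start=1):
        # Update the evidence with each new observation.
        if x:
            llr += log(p1 / p0)
        else:
            llr += log((1 - p1) / (1 - p0))
        # Stop as soon as a boundary is crossed.
        if llr >= upper:
            return "accept H1", i
        if llr <= lower:
            return "accept H0", i
    return "continue", len(data)
```

Because the boundaries are checked after every observation, a strongly positive or strongly negative variant can end the test long before a fixed-sample design would, which is exactly the low-volume advantage described above.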

