Why are my test visits not being split evenly?
When testing with AB Split Test, the variation a visitor sees is decided on the visitor’s machine using JavaScript. This keeps the split test as fast as possible for the visitor, but it can produce skewed visit counts at the start of a test.
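For illustration, here is a minimal sketch of what a client-side 50/50 assignment can look like. The function name and storage key are hypothetical, and this is not AB Split Test’s actual code; it only shows the kind of in-browser random decision described above.

```js
// Illustrative sketch only -- not AB Split Test's actual implementation.
// Picks variation "A" or "B" with a 50/50 random draw in the browser, then
// stores the choice so the same visitor keeps seeing the same variation.
function assignVariation(testId) {
  const storageKey = 'ab_test_' + testId;          // hypothetical key name
  let variation = localStorage.getItem(storageKey);
  if (!variation) {
    variation = Math.random() < 0.5 ? 'A' : 'B';   // the client-side 50/50 decision
    localStorage.setItem(storageKey, variation);
  }
  return variation;
}

console.log(assignVariation('homepage-headline')); // "A" or "B"
```

Because every visitor makes this decision independently, nothing guarantees an exact 50/50 split over any short stretch of traffic, which is exactly the effect explained below.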
TL;DR: this is normal, and AB Split Test uses Bayesian statistics to calculate a winner for you.
Explanation:
When JavaScript (or any programming language) generates random outcomes, such as a 50/50 split, the results can often appear skewed in the short term before they approach the expected 50/50 distribution over a large number of trials. Here’s why:
1. Small Sample Size
- Random processes tend to produce significant variations from the expected probability in small sample sizes. For instance, if you flip a coin 10 times, it’s not unusual to see 7 heads and 3 tails, even though the theoretical probability is 50/50 (the sketch after this list shows how often that happens).
- As the number of trials increases, the law of large numbers states that the observed proportion of outcomes will converge to the expected probability (50/50 in this case).
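To get a feel for how common this is, here is a quick illustrative simulation (plain JavaScript, not part of AB Split Test) that counts how often 10 fair coin flips come out 7/3 or more lopsided:

```js
// Simulate many batches of 10 fair coin flips and count how often a batch
// comes out 7/3 or more lopsided, even though each flip is exactly 50/50.
function countHeads(flips) {
  let heads = 0;
  for (let i = 0; i < flips; i++) {
    if (Math.random() < 0.5) heads++;
  }
  return heads;
}

const runs = 100000;
let lopsided = 0;
for (let i = 0; i < runs; i++) {
  const heads = countHeads(10);
  if (heads <= 3 || heads >= 7) lopsided++;
}
// Typically prints roughly 34%: about a third of 10-flip batches look skewed.
console.log(`Batches split 7/3 or worse: ${(100 * lopsided / runs).toFixed(1)}%`);
```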
2. Random Fluctuations
- In the early stages of random number generation, outcomes are heavily influenced by initial fluctuations. If you start with 3 heads and 1 tail, your percentage will be skewed towards heads until many more flips occur, gradually reducing the skew.
3. Normalization Over Time
- Over a large number of trials, the deviations average out, and the outcomes tend to normalize due to the law of large numbers (the sketch after this list shows this directly). For instance:
  - After 10 trials: You might see 7 heads and 3 tails (70% heads).
  - After 100 trials: You might see 52 heads and 48 tails (52% heads).
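The same convergence can be watched directly by tracking the running percentage of heads as the number of flips grows. Again, this is an illustrative sketch rather than anything AB Split Test runs:

```js
// Track the running percentage of heads as the number of fair coin flips grows.
// Early percentages can sit far from 50%, but they converge as the count rises.
const checkpoints = [10, 100, 1000, 10000, 100000];
let heads = 0;
let next = 0;
for (let n = 1; next < checkpoints.length; n++) {
  if (Math.random() < 0.5) heads++;
  if (n === checkpoints[next]) {
    console.log(`After ${n} flips: ${(100 * heads / n).toFixed(1)}% heads`);
    next++;
  }
}
```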
4. How Bayesian Statistics Helps
Bayesian statistics doesn’t focus solely on the raw observed rates (e.g., conversion percentages) but instead uses probability distributions to model and interpret the uncertainty in the results.
Key Concepts in Bayesian Analysis:
- Prior Beliefs (Priors):
  - Bayesian stats start with a “prior belief” about the likely performance of the variations. For example, you might assume that A and B are equally likely to perform well before the test begins.
  - Priors smooth the analysis, especially early in the test when data is sparse, reducing the impact of initial skews.
- Likelihood (Observed Data):
  - The actual test results (e.g., the number of conversions for each variation) are used to update the prior beliefs.
- Posterior Distributions:
  - Bayesian analysis combines the priors and observed data to calculate a posterior distribution, representing the updated belief about the performance of each variation.
  - Instead of a single point estimate (e.g., “Version A has a 40% conversion rate”), the posterior distribution shows a range of probable conversion rates for each variation (see the sketch after this list).
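As a concrete (and deliberately simplified) illustration of how priors and data combine: if you assume a uniform Beta(1, 1) prior on each variation’s conversion rate, the posterior after the observed data is Beta(conversions + 1, misses + 1). The sketch below samples from that posterior to report a range of plausible rates rather than a single number. The prior choice, helper functions, and example figures are assumptions made for illustration, not AB Split Test’s internal implementation.

```js
// Sketch: with an assumed uniform Beta(1, 1) prior, the posterior for a
// variation's conversion rate is Beta(conversions + 1, misses + 1).
// Sampling from it gives a range of plausible rates, not one point estimate.

// Gamma(k, 1) sample for a positive integer k: sum of k Exponential(1) draws.
function gammaInt(k) {
  let sum = 0;
  for (let i = 0; i < k; i++) sum -= Math.log(1 - Math.random());
  return sum;
}

// Beta(a, b) sample via two Gamma draws (a and b are positive integers here).
function betaSample(a, b) {
  const x = gammaInt(a);
  return x / (x + gammaInt(b));
}

// Posterior mean and a rough 95% credible interval for one variation.
function posteriorSummary(conversions, visitors, draws = 10000) {
  const a = conversions + 1;             // prior 1 + observed conversions
  const b = visitors - conversions + 1;  // prior 1 + observed misses
  const samples = Array.from({ length: draws }, () => betaSample(a, b))
    .sort((x, y) => x - y);
  return {
    mean: samples.reduce((s, v) => s + v, 0) / draws,
    low: samples[Math.floor(0.025 * draws)],
    high: samples[Math.floor(0.975 * draws)],
  };
}

console.log('A:', posteriorSummary(12, 200)); // e.g. 12 conversions from 200 visits
console.log('B:', posteriorSummary(18, 210)); // e.g. 18 conversions from 210 visits
```

With these made-up numbers, both intervals are wide and overlap heavily, which is the posterior’s way of saying the data is still too thin to call a winner.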
Bayesian Inference in Action:
- Probabilities of Winning:
  - Bayesian analysis answers questions like, “What is the probability that Version B is better than Version A?” rather than focusing on whether the observed difference is statistically significant (see the sketch below).
  - Early in the test, both variations might have overlapping probability distributions, indicating that the evidence is weak. As data accumulates, the distributions narrow and provide more confident predictions.
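Here is how that probability can be estimated under the same Beta-posterior assumption as the previous sketch: draw a plausible conversion rate for each variation from its posterior and count how often Version B’s draw beats Version A’s. The helpers and example figures are illustrative only, not AB Split Test’s actual code.

```js
// Sketch: estimate "the probability that Version B is better than Version A"
// by sampling each variation's posterior (uniform Beta(1, 1) prior assumed)
// and counting how often B's sampled rate is higher than A's.
function gammaInt(k) {                 // same helper as the previous sketch
  let sum = 0;
  for (let i = 0; i < k; i++) sum -= Math.log(1 - Math.random());
  return sum;
}
function betaSample(a, b) {
  const x = gammaInt(a);
  return x / (x + gammaInt(b));
}

function probBBeatsA(convA, visitsA, convB, visitsB, draws = 10000) {
  let bWins = 0;
  for (let i = 0; i < draws; i++) {
    const rateA = betaSample(convA + 1, visitsA - convA + 1);
    const rateB = betaSample(convB + 1, visitsB - convB + 1);
    if (rateB > rateA) bWins++;
  }
  return bWins / draws;
}

// Early, sparse data: the posteriors overlap heavily, so the answer is indecisive.
console.log('Early:', probBBeatsA(2, 20, 3, 22));   // typically around 0.6
// Much more data with a real difference: the probability approaches 1.
console.log('Later:', probBBeatsA(30, 500, 50, 500));
```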
5. Why Skew Doesn’t Matter in Bayesian Stats
- Incorporation of Uncertainty:
  - Bayesian stats naturally incorporate uncertainty into the model. If the data is skewed early on, the wide posterior distributions reflect that uncertainty, preventing overconfidence in the results.
- Dynamic Updates:
  - As more data is collected, the model continuously updates, and the early skew is automatically corrected as the posterior distributions converge on the true performance metrics (the simulation sketch after this list shows this in action).
- Focus on Probabilities:
  - Instead of relying on raw conversion rates, Bayesian stats let you focus on the probability that one variation is better than the other, making short-term randomness irrelevant.
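To tie these points together, the simulated sketch below gives the two variations genuinely different conversion rates (10% vs 15%, made up for illustration), assigns visits at random, and recomputes the probability that B beats A as batches of data arrive. Early results are noisy and indecisive; as data accumulates, the posteriors narrow and the probability climbs toward a confident call. None of this is AB Split Test’s internal code; it reuses the same Beta-posterior assumption as the earlier sketches.

```js
// Simulated sketch (made-up scenario): variation A converts at 10% and B at 15%.
// Visits are assigned at random, so the split and early results can be skewed,
// but the probability that B beats A firms up as the posteriors narrow.
function gammaInt(k) {                 // same helpers as the sketches above
  let sum = 0;
  for (let i = 0; i < k; i++) sum -= Math.log(1 - Math.random());
  return sum;
}
function betaSample(a, b) {
  const x = gammaInt(a);
  return x / (x + gammaInt(b));
}
function probBBeatsA(convA, nA, convB, nB, draws = 5000) {
  let wins = 0;
  for (let i = 0; i < draws; i++) {
    if (betaSample(convB + 1, nB - convB + 1) > betaSample(convA + 1, nA - convA + 1)) wins++;
  }
  return wins / draws;
}

const rateA = 0.10, rateB = 0.15;      // true rates, unknown to the "test"
let nA = 0, convA = 0, nB = 0, convB = 0;
for (const batch of [50, 450, 2500]) {
  for (let i = 0; i < batch; i++) {
    // Each visit is assigned at random, so the visit split itself can be uneven early on.
    if (Math.random() < 0.5) { nA++; if (Math.random() < rateA) convA++; }
    else                     { nB++; if (Math.random() < rateB) convB++; }
  }
  console.log(`After ${nA + nB} visits (A: ${nA}, B: ${nB}): ` +
              `P(B beats A) = ${probBBeatsA(convA, nA, convB, nB).toFixed(2)}`);
}
```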