Data scientists are in the decision-making business. Our work focuses on how to make informed decisions under conditions of uncertainty.
And yet, when it comes to quantifying that uncertainty, we often rely on the idea of “statistical significance,” a tool that, at best, provides a superficial understanding.
In this article, we will explore why “statistical significance” is flawed: arbitrary thresholds, a false sense of certainty, and the failure to address real-world trade-offs.
Most importantly, we will learn how to move beyond the binary mindset of significant versus non-significant and adopt a decision-making framework based on economic impact and risk management.
Imagine that we have just conducted an A/B test to evaluate a new feature designed to increase the time users spend on our website and, as a result, their spend.
The control group consisted of 5,000 users and the treatment group included another 5,000 users. This gives us two arrays, called treatment and control, each containing 5,000 values that represent the spending of individual users in their respective groups.
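To make the setup concrete, here is a minimal sketch of what those two arrays might look like. The data below is synthetic: the log-normal distribution and its parameters are illustrative assumptions (spending data is typically right-skewed), not results from the actual test.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(42)

# Synthetic per-user spend -- log-normal, since spending is usually
# right-skewed. Parameters are made up for illustration only.
control = rng.lognormal(mean=3.0, sigma=1.0, size=5000)
treatment = rng.lognormal(mean=3.05, sigma=1.0, size=5000)

def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples."""
    var_a = a.var(ddof=1) / len(a)
    var_b = b.var(ddof=1) / len(b)
    return (a.mean() - b.mean()) / np.sqrt(var_a + var_b)

t = welch_t(treatment, control)

# With 5,000 users per group, the t distribution is effectively standard
# normal, so we approximate the two-sided p-value with the normal CDF.
p_value = 2 * (1 - 0.5 * (1 + erf(abs(t) / sqrt(2))))

print(f"mean control:   {control.mean():.2f}")
print(f"mean treatment: {treatment.mean():.2f}")
print(f"t = {t:.3f}, p-value = {p_value:.4f}")
```

In practice you would pull the two arrays from your experiment logs rather than simulate them; the point here is simply that each array holds one spend figure per user, and the whole "significance" machinery boils down to comparing these two samples.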