Causal inference has many tangible applications in a wide variety of scenarios, but in my experience, it is a topic that is rarely talked about among data scientists.
In this article, we define causal inference and motivate its use. Then, we apply some basic algorithms in Python to measure the effect of a certain phenomenon.
Causal inference is a field of study concerned with measuring the effect of a treatment, meaning some action or intervention, on an outcome of interest.
Another way to think about causal inference is that it answers "what if" questions. The goal is always to measure the impact of a given action.
Examples of questions answered with causal inference are:
- What is the impact of running an advertising campaign on product sales?
- What is the effect of a price increase on sales?
- Does this medication make patients heal faster?
We can see that these questions are relevant to decision makers, but they cannot be answered with traditional machine learning methods alone.
Causal inference versus traditional machine learning
With traditional machine learning techniques, we generate predictions or forecasts given a set of features.
For example, we can forecast how many sales we would make next month.
In other words, machine learning models discover correlations between features and a target to better predict that target. In that sense, any correlation between some feature and the target is useful if it allows the model to make better predictions.
When it comes to causal inference, we want to measure the impact of a treatment.
For example, we can determine how increasing the price of a product will affect sales.
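Thus, with causal inference, we seek to discover causal pathways: we want to know whether, and by how much, changing the treatment changes the outcome, not merely whether the two tend to move together.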
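To make the contrast concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the variable names (`demand`, `price`, `sales`), the coefficients, and the idea that a single observed confounder, demand, drives both price and sales. It compares a naive regression of sales on price, which only captures the correlation, with a regression that also adjusts for demand. Adjusting by regression is just one simple approach and only works here because the confounder is observed and the relationships are linear by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical confounder: high demand pushes both price and sales up.
demand = rng.normal(size=n)
price = 10 + 2.0 * demand + rng.normal(size=n)

# True causal effect of price on sales, fixed by construction (an assumption).
TRUE_EFFECT = -3.0
sales = 100 + TRUE_EFFECT * price + 10.0 * demand + rng.normal(size=n)


def ols_coefs(features, target):
    """Fit ordinary least squares with an intercept; return all coefficients."""
    X = np.column_stack([np.ones(len(target))] + list(features))
    coefs, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coefs


# Correlational view: regress sales on price only (index 1 is the price slope).
naive_effect = ols_coefs([price], sales)[1]

# Causal view under our assumptions: also adjust for the confounder.
adjusted_effect = ols_coefs([price, demand], sales)[1]

print(f"True effect of price on sales: {TRUE_EFFECT:.2f}")
print(f"Naive estimate (price only):   {naive_effect:.2f}")
print(f"Adjusted estimate (+ demand):  {adjusted_effect:.2f}")
```

In this simulation the naive slope comes out positive, because high demand raises both price and sales, while the adjusted estimate lands close to the true negative effect. The naive model can still be a perfectly useful predictor of sales, yet it would give the wrong answer to the question "what happens to sales if we raise the price?".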