Thanks to recent advances in machine learning (ML), ML models are now used across a variety of fields to improve performance and reduce the need for manual labor. These applications range from helping authors and poets refine their writing style to predicting protein structures. At the same time, as ML models gain traction in high-stakes domains such as medical diagnostics and credit card fraud detection, there is very little tolerance for error. It therefore becomes necessary for humans to understand these algorithms and how they work at a deeper level. After all, a better understanding of how ML models make predictions is crucial if researchers are to design more robust models and fix flaws in current models related to bias and other concerns.
This is where Interpretable Artificial Intelligence (IAI) and Explainable Artificial Intelligence (XAI) techniques come into play, and the need to understand the difference between them becomes evident. The distinction is not always clear cut, even to academics, and the terms interpretability and explainability are sometimes used interchangeably when referring to ML approaches. Given the growing popularity of IAI and XAI models in the ML field, it is important to distinguish between them so that organizations can select the best strategy for their use case.
Simply put, humans can easily understand interpretable AI models just by looking at their model summaries and parameters, without the aid of additional tools or approaches. In other words, it is safe to say that an IAI model provides its own explanation. Explainable AI models, on the other hand, are typically complex deep learning models that humans cannot understand without the help of additional methods. Post-hoc explanation techniques can therefore give a clear idea of why a decision was made, but not of exactly how that decision was reached inside the model. In the rest of the article, we will delve into the concepts of interpretability and explainability and illustrate them with examples.
1. Interpretable Machine Learning
Broadly speaking, something is interpretable if its meaning can be discerned, that is, if its cause and effect can be clearly determined. For example, if someone eats too much chocolate right after dinner and always has trouble sleeping afterwards, that situation can be interpreted. In the ML domain, a model is said to be interpretable if people can understand it on their own from its parameters. With interpretable AI models, humans can easily understand how the model arrived at a particular solution, although not necessarily whether the criteria used to reach that result are sensible. Decision trees and linear regression are two examples of interpretable models. Let us illustrate interpretability with an example:
Consider a bank that uses a trained decision tree model to decide whether to approve a loan application. The applicant’s age, monthly income, whether they have any other outstanding loans, and other variables are taken into account. To understand why a particular decision was made, we can simply walk through the nodes of the tree and, based on the decision criteria, see why the final result was what it was. For example, a decision criterion might specify that a loan application will not be approved if a non-student has a monthly income of less than $3,000. However, these models cannot tell us why those particular decision criteria were chosen; the model does not explain why a minimum income requirement of $3,000 applies to a non-student applicant in this scenario.
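As a minimal sketch of this idea, the snippet below trains a toy decision tree in scikit-learn on made-up loan data and prints its splits. The feature names, data points, and resulting thresholds are hypothetical and chosen purely for illustration; the point is only that every rule of the fitted model can be read directly.

```python
# Illustrative sketch: a toy loan-approval decision tree on made-up data.
from sklearn.tree import DecisionTreeClassifier, export_text

# Columns: age, monthly_income (USD), is_student (0/1), has_other_loans (0/1)
X = [
    [22, 1200, 1, 0],
    [35, 4500, 0, 0],
    [29, 2800, 0, 1],
    [41, 6000, 0, 0],
    [24, 1500, 1, 1],
    [38, 3200, 0, 0],
]
y = [0, 1, 0, 1, 0, 1]  # 1 = approved, 0 = rejected

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# The fitted tree is interpretable by itself: each split rule can be read off.
print(export_text(
    tree,
    feature_names=["age", "monthly_income", "is_student", "has_other_loans"],
))
```

Walking through the printed rules answers the "how was this decision reached?" question, but, as noted above, nothing in the output explains why those thresholds are reasonable.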
For organizations that want to understand why and how their models generate predictions, interpreting factors such as weights and features directly is enough to explain a given result. But this is possible only when the models are quite simple: both linear regression models and decision trees have a small number of parameters. As models become more complicated, we can no longer understand them in this way.
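The same point can be sketched for linear regression: its parameters are the explanation. The synthetic data and feature names below are assumptions used only to show how the coefficients can be read directly.

```python
# Minimal sketch: a linear model's coefficients can be read without extra tools.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))  # three synthetic features
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)

# Each coefficient states how much the prediction changes per unit of that feature.
for name, coef in zip(["feature_0", "feature_1", "feature_2"], model.coef_):
    print(f"{name}: {coef:+.3f}")
print(f"intercept: {model.intercept_:+.3f}")
```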
2. Explainable Machine Learning
Explainable AI models are those whose inner workings are too complex for humans to understand how they affect the final prediction. Such ML algorithms are also known as black-box models: the model’s features go in as input, the predictions come out as output, and what happens in between remains opaque. Humans therefore require additional methods to peer into these “black box” systems and understand how they work. One example is a random forest classifier made up of many decision trees, where the predictions of all the trees are combined to produce the final prediction. The complexity only increases when neural-network-based models such as LogoNet are taken into account. As such models grow more complex, it becomes simply impossible for humans to understand them by looking at the model weights.
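To make the contrast concrete, the sketch below (using an arbitrary dataset and settings chosen only for illustration) counts how many split rules a modest random forest contains. Unlike the single decision tree above, reading them all by hand is not practical.

```python
# Sketch: a random forest's "parameters" are thousands of split rules
# spread across many trees, which is what makes it a black box in practice.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

total_nodes = sum(est.tree_.node_count for est in forest.estimators_)
print(f"trees: {len(forest.estimators_)}, total decision nodes: {total_nodes}")
```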
As mentioned above, humans need additional methods to understand how sophisticated algorithms generate predictions. Researchers use various techniques to find connections between the input data and the predictions produced by the model, which helps in understanding how the ML model behaves. Such model-agnostic methods (methods that do not depend on the model type) include partial dependence plots, SHapley Additive exPlanations (SHAP) dependence plots, and surrogate models. Feature importance approaches are also employed: they measure how useful each attribute is for predicting the target variable, with a higher score meaning the feature is more important to the model and has a larger impact on its predictions.
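As a hedged sketch of two of these model-agnostic tools, the snippet below computes permutation feature importance and a partial dependence plot for the random forest from the previous example; the dataset and model are arbitrary stand-ins, not the article's own case study.

```python
# Sketch: permutation importance and partial dependence for a black-box model.
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance, PartialDependenceDisplay

X, y = load_breast_cancer(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Permutation importance: shuffle one feature at a time and measure how much
# the score drops; a larger drop means the model relies on that feature more.
result = permutation_importance(forest, X, y, n_repeats=5, random_state=0)
top = result.importances_mean.argsort()[::-1][:5]
print("most important feature indices:", top)

# Partial dependence: how the average prediction changes as one feature varies.
PartialDependenceDisplay.from_estimator(forest, X, features=[int(top[0])])
plt.show()
```

Neither output reveals the forest's internal mechanics, but both give a post-hoc picture of which inputs drive its predictions, which is exactly the role explainability methods play.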
However, the question remains: why is it necessary to distinguish between the interpretability and explainability of ML models? It is clear from the arguments above that some models are easier to interpret than others. In simple terms, one model is more interpretable than another if it is easier for a human to understand how it makes its predictions. It is also generally the case that less complicated models are more interpretable but achieve lower accuracy than more complex models involving neural networks, so high interpretability often comes at the cost of accuracy. For example, using logistic regression for image recognition would yield inferior results. On the other hand, model explainability starts to play a bigger role when a company wants to achieve high performance but still needs to understand the behavior of its models.
Therefore, companies should consider whether interpretability is required before starting a new ML project. When datasets are large and the data consists of images or text, neural networks can meet the client’s goals with high performance. In such cases, where complex methods are needed to maximize performance, data scientists place more emphasis on model explainability than on interpretability. This is why it is crucial to understand the distinction between model explainability and interpretability and to know when to favor one over the other.
Khushboo Gupta is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Goa. She is passionate about the fields of machine learning, natural language processing, and web development. She likes to learn more about the technical field by participating in various challenges.