Powerful machine learning models are being used to help people tackle difficult problems, such as disease identification in medical imaging or road obstacle detection for autonomous vehicles. But machine learning models can make mistakes, so in high-risk environments it’s critical that humans know when to trust a model’s predictions.
Uncertainty quantification is one tool that improves a model’s reliability: along with its prediction, the model produces a score that expresses how confident it is that the prediction is correct. While uncertainty quantification can be useful, existing methods typically require retraining the entire model to give it that ability. Training involves showing a model millions of examples so it can learn a task, so retraining requires millions of new data inputs, which can be expensive and difficult to obtain, and also consumes huge amounts of computing resources.
Researchers at MIT and the MIT-IBM Watson AI Lab have now developed a technique that enables a model to perform more effective uncertainty quantification, using far fewer computing resources than other methods and without additional data. Their technique, which does not require the user to retrain or modify a model, is flexible enough for many applications.
The technique involves creating a simpler complementary model that helps the original machine learning model to estimate uncertainty. This smaller model is designed to identify different types of uncertainty, which can help researchers dig deeper into the root cause of inaccurate predictions.
“The quantification of uncertainty is essential for both developers and users of machine learning models. Developers can use uncertainty measures to help develop more robust models, while for users, it can add another layer of trust and reliability when deploying models in the real world. Our work leads to a more flexible and practical solution for uncertainty quantification,” says Maohao Shen, a graduate student in electrical engineering and computer science and lead author of a paper about this technique.
Shen wrote the paper with Yuheng Bu, a former postdoc in the Research Laboratory of Electronics (RLE) who is now an assistant professor at the University of Florida; Prasanna Sattigeri, Soumya Ghosh, and Subhro Das, research staff members at the MIT-IBM Watson AI Lab; and senior author Gregory Wornell, the Sumitomo Professor of Engineering, who leads the Signals, Information, and Algorithms Laboratory in RLE and is a member of the MIT-IBM Watson AI Lab. The research will be presented at the AAAI Conference on Artificial Intelligence.
Quantifying uncertainty
In quantifying uncertainty, a machine learning model generates a numeric score with each output to reflect its confidence in the accuracy of that prediction. Incorporating uncertainty quantification by creating a new model from scratch or retraining an existing model typically requires a large amount of data and expensive computation, which is often not practical. Furthermore, existing methods sometimes have the unintended consequence of degrading the quality of the model’s predictions.
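As a concrete illustration of such a score, one widely used baseline is simply the maximum softmax probability of a classifier’s output. The sketch below (in PyTorch) shows that baseline, not the researchers’ method; the model and tensor shapes are placeholders.

```python
# Minimal sketch of a naive confidence score: the maximum softmax probability.
# This is a common baseline, not the technique described in this article.
import torch
import torch.nn.functional as F

def predict_with_confidence(model: torch.nn.Module, x: torch.Tensor):
    """Return a class prediction and a confidence score for each input."""
    model.eval()
    with torch.no_grad():
        logits = model(x)                    # shape: (batch, num_classes)
        probs = F.softmax(logits, dim=-1)    # turn logits into probabilities
        confidence, prediction = probs.max(dim=-1)
    return prediction, confidence            # e.g., class 3 with confidence 0.87
```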
Researchers at MIT and the MIT-IBM Watson AI Lab have focused on the following problem: given a pretrained model, how can they enable it to perform effective uncertainty quantification?
They solve this by creating a smaller, simpler model, known as a metamodel, that is attached to the larger pretrained model and uses the features the larger model has already learned to help it perform uncertainty quantification.
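The sketch below illustrates the general idea of attaching a small metamodel to a frozen base model; the layer sizes, the `forward_features` hook, and the choice of a sigmoid output are illustrative assumptions rather than the paper’s exact design.

```python
# Hedged sketch: a small metamodel head that reads the frozen base model's
# internal features and predicts an uncertainty score for each input.
import torch
import torch.nn as nn

class MetaModel(nn.Module):
    def __init__(self, feature_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid(),                        # score squashed into [0, 1]
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.head(features).squeeze(-1)

def attach_and_score(base_model, metamodel, x):
    """Run the frozen base model, then score its prediction with the metamodel."""
    with torch.no_grad():                          # the base model is never retrained
        features = base_model.forward_features(x)  # assumed feature-extraction hook
        prediction = base_model(x).argmax(dim=-1)
    uncertainty = metamodel(features)
    return prediction, uncertainty
```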
“The metamodel can be applied to any previously trained model. It is better to have access to the internals of the model, because we can get much more information about the base model, but it will also work if you only have access to the final outputs. You can still predict a confidence score,” Sattigeri says.
They design the metamodel to produce its uncertainty quantification output using a technique that captures both types of uncertainty: data uncertainty and model uncertainty. Data uncertainty is caused by corrupted data or inaccurate labels and can only be reduced by fixing the data set or collecting new data. Model uncertainty arises when the model is unsure how to explain newly observed data and may make incorrect predictions, most likely because it has not seen enough similar training examples. This is an especially challenging but common problem when models are deployed: in real-world environments, they often encounter data that differ from the training data set.
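One standard way to separate these two kinds of uncertainty (not necessarily the paper’s exact formulation) is to split the total predictive entropy into an expected-entropy term, reflecting data uncertainty, and a mutual-information term, reflecting model uncertainty, when several plausible predictive distributions are available for the same input. The sketch below shows that decomposition.

```python
# Hedged sketch of a common entropy-based uncertainty decomposition, shown for
# a single input with several sampled predictive distributions (e.g., an ensemble).
import torch

def split_uncertainty(prob_samples: torch.Tensor):
    """prob_samples: (num_samples, num_classes) class probabilities for one input."""
    eps = 1e-12
    mean_probs = prob_samples.mean(dim=0)
    total = -(mean_probs * (mean_probs + eps).log()).sum()                  # predictive entropy
    data = -(prob_samples * (prob_samples + eps).log()).sum(dim=-1).mean()  # expected entropy
    model = total - data                                                    # mutual information
    return data, model  # data (aleatoric) vs. model (epistemic) uncertainty
```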
“Has the reliability of your decisions changed when you use the model in a new setting? You want some way to have confidence in whether it is working in this new regime or whether you need to collect training data for this particular new setting,” says Wornell.
Validating quantification
Once a model produces an uncertainty quantification score, the user still needs some assurance that the score itself is accurate. Researchers often validate accuracy by creating a smaller data set, held out from the original training data, and then testing the model on that held-out data. However, this technique does not work well for evaluating uncertainty quantification, because the model can achieve good prediction accuracy on that data while still being overconfident, Shen says.
They created a new validation technique by adding noise to the data in the validation set; these noisy data are more like out-of-distribution data, which can cause uncertainty in the model. The researchers use this noisy data set to evaluate the uncertainty quantifications.
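The sketch below illustrates that validation idea in minimal form; the Gaussian noise type and level are assumptions, and `score_fn` stands in for whatever uncertainty quantifier is being checked.

```python
# Illustrative sketch: corrupt held-out inputs with noise and check that the
# uncertainty score rises on the noisier, more out-of-distribution batch.
import torch

def noisy_copy(x: torch.Tensor, noise_std: float = 0.1) -> torch.Tensor:
    """Return a corrupted copy of the validation inputs (assumed Gaussian noise)."""
    return x + noise_std * torch.randn_like(x)

def compare_uncertainty(score_fn, x_clean: torch.Tensor):
    """score_fn maps a batch of inputs to per-example uncertainty scores."""
    clean_scores = score_fn(x_clean)
    noisy_scores = score_fn(noisy_copy(x_clean))
    # A well-behaved quantifier should report higher average uncertainty on the noisy batch.
    return clean_scores.mean().item(), noisy_scores.mean().item()
```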
They tested their approach by seeing how well a metamodel could capture different types of uncertainty for various downstream tasks, including out-of-distribution detection and misclassification detection. Their method not only outperformed every baseline on each downstream task, but it also required less training time to achieve those results.
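Downstream tasks like these are commonly scored by treating the uncertainty value as a detector and computing the area under the ROC curve (AUROC) against labels that mark out-of-distribution or misclassified examples. The sketch below shows that evaluation pattern on synthetic scores; it is not the paper’s actual experiment.

```python
# Hedged sketch of a standard evaluation: use the uncertainty score to detect
# out-of-distribution (or misclassified) examples and report AUROC.
import numpy as np
from sklearn.metrics import roc_auc_score

def detection_auroc(uncertainty: np.ndarray, is_anomalous: np.ndarray) -> float:
    """uncertainty: per-example scores; is_anomalous: 1 for OOD/misclassified, else 0."""
    return roc_auc_score(is_anomalous, uncertainty)

# Toy example: higher scores on the anomalous half give an AUROC of 1.0.
scores = np.concatenate([0.4 * np.random.rand(100), 0.6 + 0.4 * np.random.rand(100)])
labels = np.concatenate([np.zeros(100), np.ones(100)])
print(detection_auroc(scores, labels))
```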
This technique could help researchers enable more machine learning models to effectively perform uncertainty quantification, ultimately helping users make better decisions about when to trust predictions.
In the future, the researchers want to adapt their technique for new classes of models, such as large language models that have a different structure than a traditional neural network, Shen says.
The work was funded, in part, by the MIT-IBM Watson AI Lab and the US National Science Foundation.