Recent developments have seen a notable increase in the capabilities of Large Language Models (LLMs), with Generative Pretrained Transformer (GPT) models showing significant promise. The transition from GPT-3 to GPT-4, as well as the emergence of other LLMs such as PaLM and LLaMA, demonstrated considerable improvements in problem-solving and natural language understanding. Generative models are also increasingly used across industries to produce data for a variety of applications. Yet when LLMs are deployed in applications that demand high accuracy and reliability, such as healthcare and other biomedical domains, the problem of hallucinations remains a major barrier.
Unfortunately, there are no systematic techniques for accurately detecting hallucinations or measuring the confidence of an output. The intrinsic confidence score of generative LLMs is often unavailable or poorly calibrated against the intended target, particularly after reinforcement learning from human feedback (RLHF). Heuristic approaches, such as sampling a set of LLM responses, are computationally expensive and subject to bias from the LLM itself. Methods for assessing the confidence of LLM responses fall into two basic categories. In the first, the LLM is prompted in various ways to generate multiple answers, which are then used to infer the reliability of a given answer.
Self-consistency and chain-of-thought prompting are two examples. These techniques are less quantitative, and their confidence estimates are susceptible to model-induced bias. There is no standardized way to measure this, and the prompting technique can have a significant impact on the quality of the results; a minimal sketch of this sampling-based approach appears below. The second category of methods relies on external sources of data, such as hiring human reviewers to verify responses or using large amounts of labeled data to build evaluation models. The extensive manual annotation these techniques require is one of the main obstacles to supervised model training. In that light, self-supervision offers a viable alternative, as it can adaptively exploit patterns in the data together with programmatic expertise.
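To make the first category concrete, here is a minimal sketch of self-consistency-style confidence estimation. The `llm` callable is an assumption (any function returning one sampled answer per call), and the agreement rate is a rough, model-biased confidence proxy rather than a calibrated probability:

```python
from collections import Counter

def self_consistency_confidence(llm, prompt: str, n_samples: int = 10):
    """Sample several answers and use agreement as a confidence proxy.

    `llm` is a hypothetical callable that returns one sampled answer.
    """
    answers = [llm(prompt) for _ in range(n_samples)]
    # Majority answer and the fraction of samples that agree with it.
    best, count = Counter(answers).most_common(1)[0]
    return best, count / n_samples
```

Because every sample comes from the same model, systematic errors are repeated across samples, which is exactly the bias the article points out.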
The Microsoft researchers behind this study provide a flexible framework that uses Pareto optimal learning to combine LLM responses with external sources of supervision. They were motivated by previous work on programmatic supervision and by the wealth of Pareto optimization research. Two insights guide their strategy. First, to avoid the bias of a self-judging LLM, supervision sources that are independent of the LLM are required. Second, LLM errors can be treated as noisy perturbations of the gold labels: fitting a model to LLM outputs combined with independent external noise implicitly performs label smoothing, which improves calibration.
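The sketch below illustrates one way such a combination could look. The `Harmonizer` network, the per-source cross-entropy losses, and the geometric-mean scalarizer are illustrative assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn as nn

class Harmonizer(nn.Module):
    """Small classifier fit against several noisy label sources at once
    (the LLM's answers plus independent programmatic supervision)."""
    def __init__(self, in_dim: int, n_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, n_classes)
        )

    def forward(self, x):
        return self.net(x)

def pareto_loss(logits, source_labels, eps=1e-8):
    """Scalarize per-source cross-entropy losses with a geometric mean,
    so no single noisy source (e.g., the LLM itself) dominates training."""
    ce = nn.CrossEntropyLoss()
    losses = [ce(logits, y) for y in source_labels]  # one loss per source
    return torch.exp(torch.mean(torch.log(torch.stack(losses) + eps)))
```

Training such a harmonizer requires only unlabeled inputs: the "labels" are the noisy outputs of the LLM and the external supervision functions.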
In that sense, Pareto optimal self-supervision provides a useful framework for integrating both qualities. Notably, the proposed method needs only unlabeled data, making it appropriate for fields where annotation is expensive. This approach to LLM calibration through Pareto optimal self-supervision is the key innovation of the paper. The researchers propose the Pareto optimal learning assessed risk (POLAR) score to estimate the probability of LLM errors. They present experimental findings on four different NLP tasks and demonstrate that the proposed POLAR score is significantly correlated with the LLM error rate evaluated on gold labels. Using dynamic prompting strategies for the high-risk cases flagged by the POLAR score, they show improved LLM performance. Without any human-labeled training data, they demonstrate that their method can correct LLM errors and improve GPT-4 baseline performance beyond the most advanced supervised models.
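A hypothetical sketch of how a POLAR-style risk score might drive dynamic re-prompting is shown below. `polar_score` is an illustrative stand-in that treats the harmonizer's disagreement with the LLM answer as an error-probability estimate; the threshold and the `careful_llm` fallback are assumptions, not the paper's prescription:

```python
import torch
import torch.nn.functional as F

def polar_score(harmonizer, x, llm_answer: int) -> float:
    """Estimated probability that the LLM's answer is wrong (illustrative)."""
    with torch.no_grad():
        probs = F.softmax(harmonizer(x), dim=-1)
    return float(1.0 - probs[..., llm_answer])

RISK_THRESHOLD = 0.5  # assumed cutoff for "high risk"

def answer_with_fallback(prompt, x, harmonizer, llm, careful_llm):
    answer = llm(prompt)
    if polar_score(harmonizer, x, answer) > RISK_THRESHOLD:
        # High estimated error risk: re-prompt with a more careful
        # strategy (e.g., chain-of-thought or added in-context examples).
        answer = careful_llm(prompt)
    return answer
```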
Check out the Paper.
Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his bachelor’s degree in Information Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves connecting with people and collaborating on interesting projects.