Researchers from the Hong Kong University of Science and Technology and the University of Illinois Urbana-Champaign have collaborated to address a well-known challenge in large language models (LLMs): hallucinations, where models generate non-existent facts. Their novel approach is called Rejection-Aware Instruction Tuning (R-Tuning). The researchers observe that existing instruction tuning methods often force models to complete an answer even when a knowledge gap exists, leading to the generation of inaccurate information.
The core idea of R-Tuning is to recognize the knowledge gap between an LLM's parametric knowledge and its instruction tuning data, then build a rejection-aware dataset by identifying uncertain questions and training the model to explicitly decline questions beyond its parametric knowledge. This two-step process involves measuring the knowledge gap by comparing model predictions with ground-truth labels, and constructing rejection-aware data by appending uncertainty expressions to uncertain questions.
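The two steps above can be sketched in code. This is a minimal illustration, not the paper's exact procedure: `model_predict`, the prompt format, and the certainty/uncertainty suffixes are all hypothetical stand-ins for however the underlying LLM and templates are actually wired up.

```python
# Hypothetical sketch of rejection-aware data construction in the spirit
# of R-Tuning. `model_predict` stands in for one greedy LLM completion;
# the suffix strings are illustrative, not the paper's exact wording.

UNCERTAIN_SUFFIX = " I am unsure."
CERTAIN_SUFFIX = " I am sure."


def model_predict(question: str) -> str:
    """Placeholder for a single greedy generation from the LLM."""
    raise NotImplementedError


def build_rejection_aware_dataset(qa_pairs, predict=None):
    """Split instruction-tuning data by the model's own knowledge.

    qa_pairs: list of (question, ground_truth_answer) tuples.
    Returns training examples whose targets carry an explicit
    certainty or uncertainty expression.
    """
    predict = predict or model_predict
    dataset = []
    for question, answer in qa_pairs:
        prediction = predict(question)
        if prediction.strip() == answer.strip():
            # Prediction matches ground truth: within parametric
            # knowledge, so mark the target as certain.
            target = answer + CERTAIN_SUFFIX
        else:
            # Mismatch reveals a knowledge gap: train the model to
            # express uncertainty on this question.
            target = answer + UNCERTAIN_SUFFIX
        dataset.append({"prompt": question, "target": target})
    return dataset
```

Fine-tuning on the resulting `{"prompt", "target"}` pairs is what teaches the model to attach an uncertainty expression to questions it is likely to get wrong.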
The researchers conducted single-task and multi-task experiments on nine datasets: ParaRel, HotpotQA, SelfAware, HaluEval, FalseQA, NEC, MMLU, WiCE, and FEVER. In single-task experiments, R-Tuning demonstrated a remarkable ability to reject uncertain questions while improving accuracy on questions within the model's knowledge. In multi-task experiments, R-Tuning's rejection capability emerged as a meta-skill, providing advantages on both in-domain and out-of-domain datasets.
Comparisons with baseline models, including Pretrain-T, Pretrain-W, and vanilla fine-tuning, revealed that R-Tuning consistently achieved higher Average Precision (AP) scores. The results indicated that R-Tuning effectively reduced hallucinations by filtering out questions beyond the model's knowledge. The study also explored the impact of model size on rejection capability, showing that larger models scaled and performed better.
Surprisingly, the researchers found that learning uncertainty during training and incorporating it into the training process produced better results than directly applying uncertainty filtering to the test data. This finding suggests that uncertainty learning improves both the model's uncertainty estimates and its answers, highlighting the advantages of incorporating uncertainty learning into LLM training. They also examined unsupervised identification strategies and label replacement methods within R-Tuning, demonstrating that uncertainty-based identification and direct label replacement are both effective approaches.
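One common way to implement uncertainty-based identification, sketched here under the assumption that uncertainty is estimated from the spread of repeatedly sampled answers (the sampling mechanism and the threshold value are illustrative, not taken from the paper), is to compute the entropy of the model's answer distribution:

```python
import math
from collections import Counter


def answer_entropy(sampled_answers):
    """Shannon entropy (bits) of a set of sampled answers.

    Higher entropy means the model gives more varied answers to the
    same question, suggesting it is less certain.
    """
    counts = Counter(a.strip().lower() for a in sampled_answers)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def is_uncertain(sampled_answers, threshold=1.0):
    """Flag a question as uncertain when its answer entropy exceeds
    `threshold` (the value 1.0 is an illustrative choice)."""
    return answer_entropy(sampled_answers) > threshold
```

A model that answers identically on every sample has entropy 0 (confident), while one that splits across many distinct answers has high entropy and would be routed to the uncertain partition.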
Additionally, R-Tuning successfully handled unanswerable questions, refusing to answer queries that contradicted common sense or fell beyond the model's knowledge. The in-depth analysis examined the perplexity of rejected questions and the entropy of answers, offering insight into how R-Tuning improves the model's handling of questions of varying randomness and difficulty.
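For reference, perplexity of the kind used in such analyses can be computed from per-token log-probabilities; this small sketch assumes natural-log token log-probs as produced by most LLM APIs, and is a generic formula rather than the paper's specific evaluation code:

```python
import math


def perplexity(token_logprobs):
    """Perplexity from per-token log-probabilities (natural log).

    Perplexity = exp(mean negative log-likelihood). Higher values
    indicate text the model finds harder or more random, which is why
    rejected questions tend to show elevated perplexity.
    """
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)
```

For example, a sequence whose tokens each had probability 0.5 yields a perplexity of 2, while a perfectly predicted sequence (log-prob 0 per token) yields 1.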
In conclusion, the researchers introduced R-Tuning as a powerful method for teaching LLMs to reject unknown questions, addressing the challenge of hallucinations and improving model accuracy. The rejection ability R-Tuning instills is a meta-skill that generalizes across tasks, underscoring its potential impact on the reliability and performance of large language models.
Review the Paper. All credit for this research goes to the researchers of this project.
Pragati Jhunjhunwala is a Consulting Intern at MarktechPost. She is currently pursuing a B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She is a technology enthusiast with a keen interest in data science software and applications, and is always reading about advancements in different fields of AI and ML.