Large language models (LLMs) have demonstrated strong performance in natural language processing (NLP) applications. However, fine-tuning them is computationally expensive, and they can generate incorrect information, i.e., hallucinations. Two viable strategies have been established to address these problems: parameter-efficient methods such as low-rank adaptation (LoRA) to reduce computational demands, and fact checking to reduce hallucinations.
Verifying the accuracy and reliability of LLM outputs requires careful fact checking. By comparing model-generated text to reliable sources, fact checking can detect and reduce the hallucinations that LLMs produce. This procedure is especially crucial in fields such as journalism, law, and healthcare, where accuracy is vital. Models that undergo fact checking are better able to retain their credibility, making them more suitable for use in critical applications.
However, the enormous computational resources required to fine-tune LLMs have historically limited their widespread use. LoRA addresses this with a parameter-efficient fine-tuning strategy that modifies only a small set of added low-rank parameters rather than the network as a whole. This targeted modification reduces the processing load and allows LLMs to be adapted to new tasks more efficiently without compromising performance.
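Concretely, LoRA freezes the pretrained weight matrix and learns only a low-rank update on top of it. Below is a minimal NumPy sketch of the idea; the dimensions, variable names, and scaling are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 512, 512, 8  # layer dimensions and LoRA rank (r << d)

W = rng.normal(size=(d, k))          # frozen pretrained weight
A = rng.normal(size=(r, k)) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))                 # trainable, zero-init so the update starts at 0

def lora_forward(x, scale=1.0):
    """Forward pass: frozen path x @ W.T plus low-rank update x @ (B @ A).T."""
    return x @ W.T + scale * (x @ A.T @ B.T)

x = rng.normal(size=(4, k))
# At initialization B = 0, so the LoRA branch contributes nothing.
assert np.allclose(lora_forward(x), x @ W.T)

# Trainable parameters: r * (d + k) for LoRA vs d * k for full fine-tuning.
full, lora = d * k, r * (d + k)
print(f"trainable params: {lora} vs {full} ({100 * lora / full:.1f}%)")
```

The parameter count is where the savings come from: here the LoRA factors hold roughly 3% of the parameters a full update of the same layer would require.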
Although LoRA has proven effective in mitigating the computational burden, researchers have also investigated merging multiple LoRAs to handle disparate tasks or viewpoints. Most research has focused on parallel integration, as in the LoraHub technique, which computes a weighted sum of many LoRAs. Despite its effectiveness, this strategy may only partially exploit the distinct advantages of each individual LoRA, potentially resulting in sub-optimal performance.
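The parallel-merging idea can be sketched as a weighted sum of the individual low-rank updates. In LoraHub the merge weights are searched or learned; in this illustrative sketch they are simply fixed, and all shapes and names are assumptions for demonstration:

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, r, n = 64, 64, 4, 3  # dimensions, rank, number of task-specific LoRAs

# Each task i contributes its own low-rank pair (B_i, A_i).
loras = [(rng.normal(size=(d, r)), rng.normal(size=(r, k))) for _ in range(n)]
w = np.array([0.5, 0.3, 0.2])  # merge weights (learned in LoraHub; fixed here)

# Parallel merge: a single combined update  delta = sum_i w_i * B_i @ A_i.
delta = sum(wi * B @ A for wi, (B, A) in zip(w, loras))
assert delta.shape == (d, k)
```

Because the merge is a plain weighted sum, each LoRA's contribution is diluted by a single scalar, which is one intuition for why such merging may under-use the strengths of any one adaptation.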
To overcome this limitation, current work has shifted its focus from merely integrating disparate LoRAs in parallel to creating links between them. The goal is to facilitate knowledge sharing and mutual learning between different LoRAs, each fine-tuned on a specific reasoning task. Such an integrated approach has the potential to increase an LLM's capability on complicated tasks such as fact checking by fostering a more holistic reasoning aptitude.
Within this framework, the research presents three reasoning datasets created specifically for fact-checking tasks. Each dataset is used to fine-tune an individual LoRA, allowing each to craft a different type of argument. Then, using a novel strategy known as LoraMap, these specialized LoRAs are strategically arranged and linked. LoraMap aims to map and connect the various LoRAs, facilitating communication between them and enhancing their collective reasoning ability.
The team has summarized its main contributions as follows.
- Three specialized reasoning datasets have been created specifically for fact-checking tasks. Each dataset is used to fine-tune an independent low-rank adaptation (LoRA), allowing the LoRAs to infer information from different perspectives.
- The team has studied ways to link reasoning LoRAs and has devised a new strategy known as LoraMap. Inspired by how the brain processes information in neuroscience, LoraMap discovers relationships between LoRAs rather than simply linking them together linearly.
- When evaluated on the COVID-Fact dataset, LoraMap showed superior performance compared to existing approaches such as LoraHub. It also outperformed LoraConcat with a significantly smaller number of trainable parameters, demonstrating its effectiveness and efficiency in optimizing LLMs for complex reasoning tasks.
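The "relationships rather than linear linking" idea can be illustrated with a toy sketch: keep the specialized LoRAs frozen and train only a small matrix that mixes their outputs. This is purely illustrative of the general principle; the actual LoraMap architecture and its parameter layout are not specified here, and every name and dimension below is an assumption:

```python
import numpy as np

rng = np.random.default_rng(2)
d, r, n = 64, 4, 3  # hidden dim, LoRA rank, number of frozen reasoning LoRAs

# Frozen task-specific LoRA pairs (B_i, A_i), as in the parallel-merge sketch.
loras = [(rng.normal(size=(d, r)), rng.normal(size=(r, d))) for _ in range(n)]

# Small trainable "map" relating the n LoRA outputs -- the only new parameters.
M = rng.normal(size=(n, n)) * 0.1

def mapped_forward(x):
    """Combine frozen LoRA branch outputs through the learned map M."""
    outs = np.stack([x @ A.T @ B.T for B, A in loras])  # shape (n, batch, d)
    mixed = np.einsum("ij,jbd->ibd", M, outs)           # LoRAs attend to each other
    return mixed.sum(axis=0)

x = rng.normal(size=(2, d))
y = mapped_forward(x)
assert y.shape == (2, d)

# Trainable: n*n = 9 map entries vs n * 2 * r * d = 1536 to retrain the LoRAs.
```

In this toy setup the connection matrix holds orders of magnitude fewer trainable parameters than the LoRAs themselves, which is the intuition behind outperforming concatenation-style baselines at a much smaller parameter budget.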
In conclusion, improving computational efficiency with methods like LoRA and reducing hallucinations through fact checking are fundamental advances for LLMs. LoraMap provides a more sophisticated and efficient method to optimize LLMs for intricate reasoning tasks, going beyond parallel integration and emphasizing the relationships between multiple LoRAs.
Take a look at the Paper. All credit for this research goes to the researchers of this project.
Tanya Malhotra is a final-year student at the University of Petroleum and Energy Studies, Dehradun, pursuing a BTech in Computer Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a data science enthusiast with good analytical and critical thinking skills, along with a keen interest in acquiring new skills, leading groups, and managing work in an organized manner.