Large language models (LLMs) have made their way into some of the most challenging areas of artificial intelligence. With their ability to produce original, accurate, and linguistically consistent content, LLMs are proving useful across industries. They are often complemented by reasoning skills and the ability to use external tools. Augmentation refers to improving or extending a system by adding features; augmented LLMs are models equipped with external tools and skills that let them work beyond their inherent capabilities.
Applications like Auto-GPT for autonomous task execution have only been made possible by augmented language models (ALMs). Current ALM approaches mostly follow a prompting paradigm that interleaves verbal reasoning with tool calls, which is effective but imposes two limitations. First, connecting to external tools requires repeatedly executing and suspending the LLM, adding latency. Second, because LLMs generate tokens conditioned on the preceding context, each resume after a tool response re-feeds the entire history, causing significant prompt redundancy and high cost in terms of token consumption for commercial LLM services.
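The cost of this replay grows with every tool call. A rough sketch of the arithmetic (the token counts below are made up purely for illustration, and the two functions are hypothetical models of the paradigms, not real library APIs):

```python
def interleaved_prompt_tokens(base_prompt, step_tokens, n_steps):
    """Interleaved paradigm: every resume after a tool call re-feeds the
    base prompt plus all prior thoughts and observations."""
    total = 0
    history = base_prompt
    for _ in range(n_steps):
        total += history          # full context re-fed at this call
        history += step_tokens    # thought + tool observation appended
    return total

def decoupled_prompt_tokens(base_prompt, step_tokens, n_steps):
    """Decoupled (ReWOO-style) paradigm: one planning call up front,
    then one solving call that sees all evidence at once; no per-step replay."""
    return base_prompt + (base_prompt + n_steps * step_tokens)

# With a 500-token prompt, 100 tokens per step, and 5 tool calls:
print(interleaved_prompt_tokens(500, 100, 5))  # 3500
print(decoupled_prompt_tokens(500, 100, 5))    # 1500
```

Under these toy numbers, interleaving consumes more than twice the prompt tokens of the decoupled approach, and the gap widens as the number of tool calls grows, since the interleaved cost is quadratic in the number of steps while the decoupled cost is linear.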
To address these challenges, a team of researchers recently proposed ReWOO (Reasoning Without Observation), a modular paradigm that decouples the LLM's reasoning process from external observations. By removing the repeated re-prompting that interleaving requires, ReWOO significantly reduces token consumption and the associated computational load.
The key components of an ALM are step-by-step reasoning, tool calls, and summarization, which ReWOO separates into three modules: Planner, Worker, and Solver. The Planner breaks a task down into a blueprint of interdependent plans, each of which is assigned to a Worker. The Worker retrieves external knowledge from tools to provide evidence, and the Solver synthesizes all plans and evidence to produce the final answer to the initial task.
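The pipeline can be sketched as follows. This is a minimal toy, not the paper's implementation: the tools and the hard-coded plan are stand-ins, although the `#E1`, `#E2` placeholders mirror the evidence-variable notation ReWOO uses to reference tool outputs that are not yet available at planning time.

```python
# Toy tool registry standing in for real search/calculator tools (hypothetical).
TOOLS = {
    "LookUp": lambda key: {"speed of light km/s": "299792"}[key],
    "Calculator": lambda expr: str(eval(expr)),  # eval is fine for this toy only
}

def planner(task):
    """Emit the full plan in one shot, using #E placeholders for evidence
    that does not exist yet -- no LLM suspension between steps."""
    return [
        ("#E1", "LookUp", "speed of light km/s"),
        ("#E2", "Calculator", "#E1 * 2"),
    ]

def worker(plans):
    """Execute each tool call, substituting earlier evidence into later arguments."""
    evidence = {}
    for var, tool, arg in plans:
        for name, value in evidence.items():
            arg = arg.replace(name, value)
        evidence[var] = TOOLS[tool](arg)
    return evidence

def solver(task, plans, evidence):
    """Synthesize plans and evidence into a final answer (an LLM call in ReWOO)."""
    return evidence[plans[-1][0]]

task = "What is twice the speed of light in km/s?"
plans = planner(task)
evidence = worker(plans)
print(solver(task, plans, evidence))  # 599584
```

The key property is that `planner` runs exactly once before any tool is invoked; the Worker then fills in the evidence variables without ever re-prompting the reasoning model.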
To assess ReWOO's performance, the team conducted extensive analysis on six open Natural Language Processing (NLP) benchmarks and a curated dataset. The results showed consistent improvements, with ReWOO achieving a 5x token-efficiency gain and a 4% accuracy improvement on the HotpotQA benchmark, which involves multi-step reasoning tasks. ReWOO also proved robust in situations where external tools failed.
Decoupling parametric modules from non-parametric tool calls not only makes requests more efficient but also enables instruction fine-tuning in ReWOO. Through fine-tuning, the 175B-parameter GPT-3.5 can offload its reasoning ability to the much smaller 7B LLaMA, a significant reduction in model size that highlights the possibility of building efficient and scalable ALMs.
Consequently, ReWOO is a promising modular paradigm for ALMs, overcoming for the first time the challenges of redundant prompting and computational complexity.
Check out the Paper and GitHub link.
Tanya Malhotra is a final-year student at the University of Petroleum & Energy Studies, Dehradun, pursuing a BTech in Computer Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a data science enthusiast with strong analytical and critical thinking skills, along with a keen interest in acquiring new skills, leading groups, and managing work in an organized manner.