Artificial intelligence is catching on fast, and for good reason. With the introduction of large language models such as GPT, BERT, and LLaMA, almost every industry, including healthcare, finance, e-commerce, and media, now uses these models for tasks such as natural language understanding (NLU), natural language generation (NLG), question answering, programming, and information retrieval. ChatGPT, which has been making headlines since its launch, is built on the GPT-3.5 and GPT-4 transformer models.
Building these human-mimicking AI systems depends on developing agents that exhibit human-like problem-solving abilities. There are three main approaches to developing agents that can tackle complex interactive reasoning tasks: deep reinforcement learning (RL), which trains agents through trial and error; behavior cloning (BC) through sequence-to-sequence (seq2seq) learning, which trains agents to mimic the behavior of expert agents; and prompting LLMs, in which LLM-based generative agents are prompted to produce reasonable plans and actions for complex tasks.
The RL-based and seq2seq-based BC approaches have limitations: they struggle with task decomposition, maintaining long-term memory, generalizing to unknown tasks, and handling exceptions. Approaches that prompt an LLM at every time step are also computationally expensive because of the repeated LLM inference.
Recently, a framework called SWIFTSAGE was proposed to address these challenges and allow agents to mimic how humans solve complex open-world tasks. SWIFTSAGE aims to integrate the strengths of behavior cloning and prompting LLMs to improve task completion performance in complex interactive tasks. The framework is inspired by dual process theory, which suggests that human cognition involves two distinct systems: System 1 and System 2. System 1 involves quick, intuitive, and automatic thinking, while System 2 involves methodical, analytical, and deliberate thinking.
The SWIFTSAGE framework consists of two modules: the SWIFT module and the SAGE module. Like System 1, the SWIFT module represents quick and intuitive thinking. It is implemented as a compact encoder-decoder language model that has been fine-tuned on the action trajectories of an oracle agent. The SWIFT module encodes short-term memory components such as previous actions, observations, visited locations, and the current state of the environment, and then decodes the next individual action, with the aim of simulating the rapid, instinctive decision-making that humans show.
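To make the encoding step more concrete, here is a minimal sketch of how short-term memory might be flattened into a single input string for a seq2seq model. The `ShortTermMemory` class, the `build_swift_input` function, the field labels, and the `window` parameter are all hypothetical names chosen for illustration; the article does not describe the actual input format used by SWIFT.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ShortTermMemory:
    """Hypothetical container for the memory components named in the text."""
    task: str
    current_state: str
    visited_locations: List[str] = field(default_factory=list)
    # Each entry pairs a previous action with the observation it produced.
    history: List[Tuple[str, str]] = field(default_factory=list)


def build_swift_input(memory: ShortTermMemory, window: int = 3) -> str:
    """Serialize the most recent `window` steps into one encoder input string.

    The "label: value | ..." layout is an assumption made for this sketch.
    """
    recent = memory.history[-window:] if window > 0 else []
    steps = " ; ".join(f"{action} -> {obs}" for action, obs in recent)
    return " | ".join([
        f"task: {memory.task}",
        f"state: {memory.current_state}",
        f"visited: {', '.join(memory.visited_locations)}",
        f"history: {steps}",
    ])


if __name__ == "__main__":
    memory = ShortTermMemory(
        task="boil water",
        current_state="in kitchen",
        visited_locations=["hallway", "kitchen"],
        history=[("look around", "You see a stove."),
                 ("open cupboard", "The cupboard is open.")],
    )
    print(build_swift_input(memory, window=1))
```

The resulting string would be passed to the encoder, and the decoder would be trained to emit the next action as text. Limiting the history to a small window keeps the input short, which fits the idea of a compact, fast model.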
The SAGE module, on the other hand, mimics the thought processes of System 2 and uses LLMs such as GPT-4 for sub-goal planning and grounding. In the planning stage, the LLM is asked to locate the necessary items, plan and track the sub-goals, and detect and rectify potential errors. In the grounding stage, the LLM transforms the sub-goal outputs of the planning stage into a sequence of executable actions.
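The two-stage flow described above can be sketched as two consecutive LLM calls, where the second call receives the output of the first. This is only an illustration: the `plan_and_ground` function, the prompt wording, and the one-action-per-line output convention are assumptions, and `llm` stands for any callable that maps a prompt string to a completion string rather than a specific vendor API.

```python
from typing import Callable, List


def plan_and_ground(
    llm: Callable[[str], str],
    task: str,
    recent_history: str,
    action_templates: List[str],
) -> List[str]:
    """Run a planning prompt, then a grounding prompt, and return actions."""
    # Stage 1 (planning): ask for needed items, sub-goals, and error checks.
    planning_prompt = (
        "Stage: planning\n"
        f"Task: {task}\n"
        f"Recent history: {recent_history}\n"
        "List the items to locate, the sub-goals to complete, the sub-goals "
        "already done, and any mistakes that need fixing."
    )
    plan = llm(planning_prompt)

    # Stage 2 (grounding): turn the plan into concrete, executable actions.
    grounding_prompt = (
        "Stage: grounding\n"
        f"Plan:\n{plan}\n"
        f"Allowed action templates: {', '.join(action_templates)}\n"
        "Write one executable action per line."
    )
    grounded = llm(grounding_prompt)

    # Keep non-empty lines only; each line is treated as one action.
    return [line.strip() for line in grounded.splitlines() if line.strip()]


if __name__ == "__main__":
    def fake_llm(prompt: str) -> str:
        # Stand-in for a real model so the sketch runs offline.
        if prompt.startswith("Stage: planning"):
            return "1. Find the thermometer. 2. Measure the water."
        return "go to kitchen\n\npick up thermometer\n"

    print(plan_and_ground(fake_llm, "measure water temperature",
                          "look around -> You are in the hallway.",
                          ["go to LOC", "pick up OBJ"]))
```

Separating the stages lets the planning prompt stay free-form while the grounding prompt constrains the output to actions the environment can execute.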
The SWIFT and SAGE modules are integrated through a heuristic algorithm that determines when to activate or deactivate the SAGE module and how to combine the outputs of both modules using an action buffer mechanism. Unlike previous methods that generate only the immediate next action, SWIFTSAGE engages in longer-term action planning.
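A minimal sketch of such a controller is shown below. The article does not spell out the switching heuristic, so this example assumes one simple rule: hand control to SAGE when no reward has been received for `patience` consecutive steps. The `SwiftSageController` class, its method names, and the `patience` parameter are hypothetical; `swift` and `sage` are stand-in callables for the two modules.

```python
from collections import deque
from typing import Callable, Deque, List


class SwiftSageController:
    """Chooses between a fast policy and a slow planner via an action buffer."""

    def __init__(
        self,
        swift: Callable[[str], str],        # observation -> single next action
        sage: Callable[[str], List[str]],   # observation -> list of planned actions
        patience: int = 3,                  # assumed threshold, not from the article
    ) -> None:
        self.swift = swift
        self.sage = sage
        self.patience = patience
        self.buffer: Deque[str] = deque()
        self.stalled_steps = 0

    def record_reward(self, reward: float) -> None:
        """Track how many consecutive steps passed without any reward."""
        self.stalled_steps = 0 if reward > 0 else self.stalled_steps + 1

    def next_action(self, observation: str) -> str:
        # While planned actions remain, execute them and keep SAGE inactive.
        if self.buffer:
            return self.buffer.popleft()
        # Assumed heuristic: activate SAGE once the agent appears stuck.
        if self.stalled_steps >= self.patience:
            self.buffer.extend(self.sage(observation))
            self.stalled_steps = 0
            if self.buffer:
                return self.buffer.popleft()
        # Default: the fast module proposes the immediate next action.
        return self.swift(observation)


if __name__ == "__main__":
    controller = SwiftSageController(
        swift=lambda obs: "look around",
        sage=lambda obs: ["open door", "go to kitchen"],
        patience=2,
    )
    for _ in range(5):
        print(controller.next_action("You are in the hallway."))
        controller.record_reward(0.0)
```

Because SAGE fills the buffer with several actions at once, the expensive LLM is called once per plan rather than once per step, which is where the efficiency gain over step-by-step prompting would come from.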
To assess the performance of SWIFTSAGE, experiments were performed on 30 tasks from the ScienceWorld benchmark. The results show that SWIFTSAGE significantly outperforms existing methods such as SayCan, ReAct, and Reflexion, achieving higher scores and demonstrating superior efficiency in solving complex interactive tasks.
In conclusion, SWIFTSAGE is a promising framework that combines the strengths of behavior cloning and LLM prompting. It can therefore help improve action planning and performance on complex reasoning tasks.
Tanya Malhotra is a final-year student at the University of Petroleum and Energy Studies, Dehradun, pursuing a BTech in Computer Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a data science enthusiast with strong analytical and critical thinking skills, along with a keen interest in acquiring new skills, leading groups, and managing work in an organized manner.