Large Language Model (LLM) agents, capable of performing a wide range of actions such as invoking tools and controlling robots, show great potential for addressing real-world challenges. LLM agents are typically prompted to produce actions by outputting JSON or text in a predefined format, which is often limited by a restricted action space (e.g., the scope of predefined tools) and restricted flexibility (e.g., the inability to compose multiple tools). This work proposes to use executable Python code to consolidate LLM agents' actions into a unified action space (CodeAct). Integrated with a Python interpreter, CodeAct can execute code actions and, through multi-turn interactions, dynamically revise previous actions or issue new actions upon new observations. Our extensive analysis of 17 LLMs on API-Bank and a newly curated benchmark shows that CodeAct outperforms widely used alternatives (up to 20% higher success rate). The encouraging performance of CodeAct motivates us to build an open-source LLM agent that interacts with environments by executing interpretable code and collaborates with users in natural language. To this end, we collect CodeActInstruct, an instruction-tuning dataset consisting of 7k multi-turn interactions using CodeAct. We demonstrate that it can be used with existing data to improve models on agent-oriented tasks without compromising their overall capability. CodeActAgent, fine-tuned from Llama2 and Mistral, is integrated with the Python interpreter and uniquely designed to perform sophisticated tasks (e.g., model training) using existing libraries and autonomous self-debugging.
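To make the interaction pattern concrete, below is a minimal sketch of a CodeAct-style multi-turn loop, not the authors' implementation: the model emits Python code as its action, a persistent interpreter executes it, and the resulting output (or error traceback) is fed back as the observation, which is what enables revising failed actions (self-debugging). The `llm` callable and the `TASK_DONE` termination convention are hypothetical placeholders.

```python
# Hypothetical sketch of a CodeAct-style agent loop (names are assumptions,
# not the paper's API). Actions are Python code; observations are the
# interpreter's stdout or error messages.
import io
import contextlib

def execute_code_action(code: str, namespace: dict) -> str:
    """Run a code action in a persistent namespace; return stdout or the error.

    Returning the error text as the observation is what lets the agent
    revise a failed action on the next turn (self-debugging).
    """
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)  # state persists across turns via `namespace`
        return buffer.getvalue() or "(no output)"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"

def codeact_loop(llm, task: str, max_turns: int = 5) -> str:
    """Multi-turn loop: generate a code action, execute it, append the
    observation to the conversation, and repeat until done."""
    history = [{"role": "user", "content": task}]
    namespace: dict = {}
    observation = ""
    for _ in range(max_turns):
        code = llm(history)  # assumed callable returning Python code as text
        observation = execute_code_action(code, namespace)
        history.append({"role": "assistant", "content": code})
        history.append({"role": "user", "content": f"Observation: {observation}"})
        if "TASK_DONE" in observation:  # assumed termination signal
            break
    return observation
```

Because the action space is all of Python, a single action can compose multiple tools (ordinary function calls), use control flow, and import existing libraries, which is the flexibility that fixed JSON tool schemas lack.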