Large Language Model (LLM) agents, capable of performing a wide range of actions such as invoking tools and controlling robots, show great potential for addressing real-world challenges. LLM agents are typically prompted to produce actions by outputting JSON or text in a predefined format, which is often limited by a restricted action space (e.g., the scope of predefined tools) and restricted flexibility (e.g., the inability to compose multiple tools). This work proposes to use executable Python code to consolidate LLM agents' actions into a unified action space (CodeAct). Integrated with a Python interpreter, CodeAct can execute code actions and, through multi-turn interactions, dynamically revise previous actions or issue new actions upon new observations. Our extensive analysis of 17 LLMs on API-Bank and a newly curated benchmark shows that CodeAct outperforms widely used alternatives (up to 20% higher success rate). The encouraging performance of CodeAct motivates us to build an open-source LLM agent that interacts with environments by executing interpretable code and collaborates with users in natural language. To this end, we collect CodeActInstruct, an instruction-tuning dataset consisting of 7k multi-turn interactions using CodeAct. We demonstrate that it can be used with existing data to improve models on agent-oriented tasks without compromising their overall capability. CodeActAgent, fine-tuned from Llama2 and Mistral, is integrated with the Python interpreter and uniquely designed to perform sophisticated tasks (e.g., model training) using existing libraries and autonomous self-debugging.
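To make the interaction pattern concrete, below is a minimal sketch of a CodeAct-style multi-turn loop, not the authors' implementation: the model emits Python code as its action, a persistent interpreter executes it, and the resulting output (or error traceback) is fed back as the observation, which is what enables revising failed actions (self-debugging). The `llm` callable and the `TASK_DONE` termination convention are hypothetical placeholders.

```python
# Hypothetical sketch of a CodeAct-style agent loop (names are assumptions,
# not the paper's API). Actions are Python code; observations are the
# interpreter's stdout or error messages.
import io
import contextlib

def execute_code_action(code: str, namespace: dict) -> str:
    """Run a code action in a persistent namespace; return stdout or the error.

    Returning the error text as the observation is what lets the agent
    revise a failed action on the next turn (self-debugging).
    """
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)  # state persists across turns via `namespace`
        return buffer.getvalue() or "(no output)"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"

def codeact_loop(llm, task: str, max_turns: int = 5) -> str:
    """Multi-turn loop: generate a code action, execute it, append the
    observation to the conversation, and repeat until done."""
    history = [{"role": "user", "content": task}]
    namespace: dict = {}
    observation = ""
    for _ in range(max_turns):
        code = llm(history)  # assumed callable returning Python code as text
        observation = execute_code_action(code, namespace)
        history.append({"role": "assistant", "content": code})
        history.append({"role": "user", "content": f"Observation: {observation}"})
        if "TASK_DONE" in observation:  # assumed termination signal
            break
    return observation
```

Because the action space is all of Python, a single action can compose multiple tools (ordinary function calls), use control flow, and import existing libraries, which is the flexibility that fixed JSON tool schemas lack.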