Image by lifelong superstar on Freepik
Generative agents is a term coined by Stanford University and Google researchers in their paper Generative Agents: Interactive Simulacra of Human Behavior (Park et al., 2023). In the paper, the researchers describe generative agents as computational software agents that credibly simulate human behavior.
They show how agents can act as humans do: write, cook, speak, vote, sleep, etc., by relying on a generative model, specifically a Large Language Model (LLM). By leveraging the language model, the agents can make inferences about themselves, other agents, and their environment.
To enable generative agents, the researchers built a system architecture that stores, synthesizes, and applies relevant memories to generate believable behavior using a large language model. The system consists of three components:
- Memory stream. Records the agent’s experiences and serves as a reference for the agent’s future actions.
- Reflection. Synthesizes memories into higher-level inferences so the agent learns and behaves more coherently over time.
- Planning. Translates those inferences and the current environment into high-level action plans, allowing the agent to react to its surroundings.
The reflection and planning components work synergistically with the memory stream to shape the agent’s future behavior.
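To make the roles of these three components concrete, here is a minimal Python sketch of how a memory stream, reflection, and planning might fit together. It is not the authors’ implementation; `call_llm`, the importance scores, and the retrieval heuristic are placeholder assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


def call_llm(prompt: str) -> str:
    """Placeholder for whichever LLM client you use (assumption, not the paper's code)."""
    raise NotImplementedError("Plug in your own LLM call here.")


@dataclass
class MemoryRecord:
    timestamp: datetime
    description: str
    importance: float  # e.g. 1-10; in the paper this score is assigned by the LLM


@dataclass
class GenerativeAgent:
    name: str
    memory_stream: List[MemoryRecord] = field(default_factory=list)

    def observe(self, description: str, importance: float = 1.0) -> None:
        # Memory stream: record every experience for later retrieval.
        self.memory_stream.append(MemoryRecord(datetime.now(), description, importance))

    def reflect(self) -> None:
        # Reflection: synthesize recent memories into a higher-level insight.
        recent = "\n".join(m.description for m in self.memory_stream[-20:])
        insight = call_llm(f"What high-level insights follow from these memories?\n{recent}")
        self.observe(f"[reflection] {insight}", importance=8.0)

    def plan(self, goal: str) -> str:
        # Planning: turn the most important memories into a high-level action plan.
        relevant = sorted(self.memory_stream, key=lambda m: m.importance, reverse=True)[:10]
        context = "\n".join(m.description for m in relevant)
        return call_llm(
            f"You are {self.name}. Given these memories:\n{context}\n"
            f"Write a step-by-step plan for: {goal}"
        )
```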
To demonstrate the system, the researchers created an interactive society of agents inspired by the game The Sims. The architecture above is connected to ChatGPT and successfully simulates 25 interacting agents within a sandbox environment. The following image shows an example of an agent’s activity throughout the day.
Activity and interaction of the Generative Agent throughout the day (Park et al., 2023)
The researchers have already open-sourced all the code for creating generative agents and simulating them in the sandbox, which you can find in the repository. The instructions are simple enough that you can follow them without much trouble.
Since generative agents are becoming an exciting field, a lot of research is now building on them. In this article, we will explore several generative agent papers that you should read. What are they? Let’s get into it.
1. Communicative Agents for Software Development
The paper Communicative Agents for Software Development (Qian et al., 2023) proposes a new approach to revolutionizing software development with generative agents. The researchers’ premise is that the entire software development process can be simplified and unified through natural language communication powered by Large Language Models (LLMs), covering tasks such as developing code, generating documentation, analyzing requirements, and more.
The researchers point out that generating complete software with an LLM faces two main challenges: hallucination and a lack of cross-examination in decision making. To address these issues, they propose a chat-based software development framework called ChatDev.
The ChatDev framework follows four phases: designing, coding, testing, and documenting. In each phase, ChatDev instantiates several agents with different roles, for example, code reviewer and software programmer. To keep communication between agents running smoothly, the researchers developed a chat chain that divides each phase into sequential atomic subtasks, where each subtask involves collaboration and interaction between agents.
The ChatDev framework is shown in the image below.
The proposed ChatDev framework (Qian et al., 2023)
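To illustrate the chat-chain idea, here is a rough Python sketch of role-conditioned agents handing an artifact through sequential subtasks. It only illustrates the pattern, not ChatDev’s actual code; `call_llm`, the role names, and the turn limit are assumptions.

```python
from typing import List, Tuple


def call_llm(system_prompt: str, message: str) -> str:
    """Placeholder LLM call (assumption); swap in your own client."""
    raise NotImplementedError


def run_subtask(role_a: str, role_b: str, task: str, turns: int = 3) -> str:
    # One atomic subtask: two role-conditioned agents exchange proposals and reviews.
    transcript = f"Task: {task}"
    for _ in range(turns):
        proposal = call_llm(f"You are the {role_a}.", transcript)
        feedback = call_llm(f"You are the {role_b}. Review the work critically.", proposal)
        transcript = f"{proposal}\nReviewer notes: {feedback}"
    return transcript


def chat_chain(requirement: str) -> str:
    # Sequential subtasks roughly following designing -> coding -> testing -> documenting.
    phases: List[Tuple[str, str, str]] = [
        ("CEO", "CTO", f"Agree on a product design for: {requirement}"),
        ("Programmer", "Code Reviewer", "Implement the agreed design"),
        ("Programmer", "Tester", "Write tests and fix failures"),
        ("Programmer", "Technical Writer", "Write the user documentation"),
    ]
    artifact = requirement
    for role_a, role_b, task in phases:
        artifact = run_subtask(role_a, role_b, f"{task}\nContext so far:\n{artifact}")
    return artifact
```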
The researchers conducted several experiments to measure the performance of the ChatDev framework in software development using gpt-3.5-turbo-16k. Below are the software statistics from the experiment.
ChatDev framework software statistics (Qian et al., 2023)
The numbers above are statistics describing the software systems generated by ChatDev; for example, the generated systems contain a minimum of 39 and a maximum of 359 lines of code. The researchers also showed that 86.66% of the generated software systems ran correctly.
It’s a great paper that shows how generative agents could change the way developers work. Read the paper to understand the complete implementation of ChatDev. The complete code is also available in the ChatDev repository.
2. AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents
AgentVerse is a framework proposed by Chen et al., 2023 to simulate groups of agents with a large language model, enabling dynamic problem-solving within the group and adjustment of group members as the work progresses. The study addresses the challenge of static group dynamics, where autonomous agents cannot adapt and evolve while solving a problem.
The AgentVerse framework divides the process into four stages (a minimal code sketch of this loop follows the list):
- Expert Recruitment: The adjustment stage that assembles a group of agents aligned with the problem to be solved.
- Collaborative Decision-Making: The agents discuss and formulate a strategy to solve the problem.
- Action Execution: The agents execute actions in the environment based on the decision.
- Evaluation: The current state and the objective are compared. If the goal has not yet been achieved, the feedback is passed back to the first stage.
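As referenced above, here is a minimal Python sketch of that four-stage loop, with the evaluation feedback routed back to recruitment. It is a simplified illustration under assumed helpers (`call_llm`, plain-text role parsing), not the AgentVerse implementation.

```python
from typing import List


def call_llm(prompt: str) -> str:
    """Placeholder LLM call (assumption); replace with your preferred client."""
    raise NotImplementedError


def recruit_experts(problem: str, team_size: int = 3) -> List[str]:
    # 1. Expert recruitment: ask the LLM which roles the problem needs.
    roles = call_llm(f"List {team_size} expert roles needed to solve: {problem}")
    return [r.strip() for r in roles.splitlines() if r.strip()][:team_size]


def collaborative_decision(problem: str, roles: List[str]) -> str:
    # 2. Collaborative decision-making: each expert refines a shared plan in turn.
    plan = ""
    for role in roles:
        plan = call_llm(f"As a {role}, refine this plan for '{problem}':\n{plan}")
    return plan


def solve(problem: str, max_rounds: int = 3) -> str:
    result = ""
    for _ in range(max_rounds):
        roles = recruit_experts(problem)
        plan = collaborative_decision(problem, roles)
        # 3. Action execution: carry out the plan in the environment (here, just the LLM).
        result = call_llm(f"Execute this plan and report the outcome:\n{plan}")
        # 4. Evaluation: compare the result against the goal; feed back if unsolved.
        verdict = call_llm(
            f"Goal: {problem}\nResult: {result}\nAnswer SOLVED or give feedback."
        )
        if verdict.strip().upper().startswith("SOLVED"):
            break
        problem = f"{problem}\nPrevious feedback: {verdict}"  # loops back to recruitment
    return result
```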
The general structure of AgentVerse is shown in the following image.
AgentVerse framework (Chen et al., 2023)
The researchers experimented with the framework and compared AgentVerse with single-agent solutions. The results are presented in the image below.
AgentVerse performance analysis (Chen et al., 2023)
The AgentVerse framework generally outperforms a single agent on all of the presented tasks, which shows that groups of generative agents can solve problems better than an individual agent working alone. You can try the framework through its repository.
3. AgentSims: An Open-Source Sandbox for Large Language Model Evaluation
Evaluating the capabilities of LLMs remains an open question in the community. Three factors limit our ability to evaluate LLMs adequately: task-limited evaluation capabilities, fragile benchmarks, and non-objective metrics. To address these problems, Lin et al., 2023 propose task-based evaluation as an LLM benchmark in their paper. The researchers hope this approach becomes the standard for LLM evaluation, as it can alleviate all of the problems raised. To achieve this, they introduce a framework called AgentSims.
AgentSims is a program with an interactive and visual infrastructure for curating evaluation tasks for LLMs. The overall goal of AgentSims is to give researchers and practitioners a platform that streamlines the task design process and turns those tasks into an evaluation tool. The AgentSims interface is shown in the image below.
AgentSims interface (Lin et al., 2023)
Since AgentSims targets anyone who needs a simpler way to evaluate LLMs, the researchers built a user interface we can interact with directly. You can also try the full demo on their website or access the complete code in the AgentSims repository.
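To give a feel for what task-based evaluation means in practice, here is a small Python sketch that defines a task with a programmatic success check and runs a model against it in a text-only “sandbox.” This is purely illustrative; `call_llm`, `EvalTask`, and the example task are assumptions, not AgentSims’ API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


def call_llm(prompt: str) -> str:
    """Placeholder for the model under evaluation (assumption, not AgentSims' API)."""
    raise NotImplementedError


@dataclass
class EvalTask:
    """A task-based benchmark item: a goal plus a programmatic success check."""
    name: str
    goal: str
    success_check: Callable[[str], bool]


def evaluate(tasks: List[EvalTask], max_steps: int = 5) -> Dict[str, bool]:
    # Run the model on each task inside a simple text "sandbox" and record success.
    results: Dict[str, bool] = {}
    for task in tasks:
        state = f"Goal: {task.goal}"
        solved = False
        for _ in range(max_steps):
            action = call_llm(f"{state}\nWhat do you do next?")
            state += f"\nAction: {action}"
            if task.success_check(action):  # objective, code-defined pass/fail
                solved = True
                break
        results[task.name] = solved
    return results


# Example task (hypothetical): success is judged by the content of the agent's action.
tasks = [
    EvalTask(
        name="buy_coffee",
        goal="Order and pay for a coffee at the in-game cafe",
        success_check=lambda action: "coffee" in action.lower() and "pay" in action.lower(),
    )
]
```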
Generative agents are a recent approach to simulating human behavior with LLMs. The research by Park et al., 2023 has shown the great potential of what generative agents can do, which is why much research building on generative agents has appeared and opened many new doors.
In this article, we have covered three different pieces of research on generative agents:
- Communicative Agents for Software Development (Qian et al., 2023)
- AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents (Chen et al., 2023)
- AgentSims: An Open-Source Sandbox for Large Language Model Evaluation (Lin et al., 2023)
Cornellius Yudha Wijaya is a data science assistant manager and a data writer. While working full-time at Allianz Indonesia, he loves sharing Python and data tips via social media and in writing.