Finally, we have reached the fifth article of the series “Agentic AI Design Patterns.” Today, we will discuss the fourth pattern: the Agentic AI Multi-Agent Pattern. Before digging into it, let’s refresh our knowledge of the first three patterns: the Reflection Pattern, the Tool Use Pattern, and the Planning Pattern. These design patterns represent essential frameworks for developing AI systems that can exhibit more sophisticated and human-like agentic behaviour.
Reiterating what we have learned till now!
In the Reflection pattern, we saw how an agent iterates between generation and self-assessment to improve the final output. Here, the agent acts as both generator and critic. The Tool Use pattern, on the other hand, covers how the agent boosts its capabilities by interacting with external tools and resources to provide the best output for the user query. It is beneficial for complex queries where internal knowledge alone is not enough. In the Planning pattern, we saw how the agent breaks a complex task into smaller steps and acts strategically to produce the output. Also within the Planning pattern, ReAct (Reasoning and Acting) and ReWOO (Reasoning WithOut Observation) augment decision-making and contextual reasoning.
Here are the three patterns:
Now, on to the Agentic AI Multi-Agent design pattern. In this pattern, you divide a complex task into subtasks, and different agents perform these subtasks. For instance, if you are building software, then coding, planning, product management, design, and QA are each handled by a different agent proficient in that task. Sounds intriguing, right? Let’s build this together!
The Architecture of Agentic AI Multi-Agent Pattern
This architecture showcases an Agentic AI multi-agent system in which various agents with specialized roles interact with each other and with an overarching multi-agent application to process a user prompt and generate a response. Each agent in the system has a unique function, simulating a collaborative team working together to achieve a task efficiently.
Components Explained:
- User Interaction:
- Prompt: The user initiates the interaction by inputting a prompt into the multi-agent application.
- Response: The system processes the prompt through collaborative agent interactions and returns a response to the user.
- Agents and Their Roles:
- Agent 1: Software Engineer: Focuses on technical problem-solving related to software development, providing coding solutions, or suggesting software-based strategies.
- Agent 2: Project Manager: Oversees the project management aspect, coordinating efforts among agents and ensuring the process aligns with overall project goals.
- Agent 3: Content Developer: Generates content, writes drafts, or assists in developing documentation and creative materials needed for the project.
- Agent 4: Market Research Analyst: Gathers data, conducts analysis on market trends, and provides insights that inform other agents’ strategies.
- Interaction Flow:
- The arrows between agents signify communication channels and collaboration paths. This implies that:
- Bidirectional Arrows (double-headed): Agents can exchange information back and forth, enabling iterative collaboration.
- Dashed Lines: Indicate secondary or indirect communication paths between agents, suggesting a support role in the communication flow rather than primary coordination.
- Communication Workflow:
- Initiation: The user provides a prompt to the multi-agent system.
- Coordination:
- Agent 1 (Software Engineer) may start by determining any initial technical requirements or strategies.
- Agent 2 (Project Manager) coordinates with Agent 1 and other agents, ensuring everyone is aligned.
- Agent 3 (Content Developer) creates relevant content or drafts that may be needed as part of the output.
- Agent 4 (Market Research Analyst) supplies research data that could be essential for informed decision-making by the other agents.
- Completion: Once all agents have collaborated, the system compiles the final response and presents it to the user.
Key Characteristics:
- Collaborative Intelligence: This architecture promotes collaborative problem-solving, where agents with specialized expertise contribute distinct insights and skills.
- Autonomy: Each agent operates semi-independently, focusing on their specific roles while maintaining communication with other agents.
- Scalability: The model can be expanded by adding more specialized agents to address more complex user prompts.
This architecture is particularly effective in multifaceted tasks that require diverse expertise, such as research projects, product development, and comprehensive content creation. The emphasis on distinct roles and coordinated communication ensures that each part of a complex task is handled efficiently and cohesively. I hope you have understood how Multi-Agent works. Now, we will talk about a framework to build Multi-Agent solutions.
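To make the prompt, coordination, and completion flow concrete, here is a minimal sketch in plain Python with no LLM calls. The `RoleAgent` and `MultiAgentApp` names and the canned `handle` logic are assumptions made purely for illustration; in a real system each agent would call a model instead of formatting a string.

```python
from dataclasses import dataclass, field


@dataclass
class RoleAgent:
    """Stand-in for a specialised agent; a real one would call an LLM."""
    name: str
    role: str

    def handle(self, prompt: str, notes: list[str]) -> str:
        # Each agent sees the prompt plus what earlier agents contributed.
        return f"[{self.role}] input on '{prompt}' (saw {len(notes)} earlier notes)"


@dataclass
class MultiAgentApp:
    """Hypothetical coordinator: routes the prompt and compiles the response."""
    agents: list[RoleAgent] = field(default_factory=list)

    def respond(self, prompt: str) -> str:
        notes: list[str] = []
        for agent in self.agents:  # Coordination: agents contribute in turn
            notes.append(agent.handle(prompt, notes))
        return "\n".join(notes)  # Completion: compile the final response


app = MultiAgentApp([
    RoleAgent("Agent 1", "Software Engineer"),
    RoleAgent("Agent 2", "Project Manager"),
    RoleAgent("Agent 3", "Content Developer"),
    RoleAgent("Agent 4", "Market Research Analyst"),
])
response = app.respond("Launch a note-taking app")
print(response)
```

The sketch runs the agents in a fixed sequence for simplicity; the bidirectional arrows in the architecture would correspond to agents exchanging several rounds of notes rather than one pass.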
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Did you know that many frameworks, such as CrewAI, LangGraph, and AutoGen, provide ways for developers to build multi-agent solutions? Today, we are talking about AutoGen:
AutoGen introduces a new paradigm in LLM applications by enabling customisable and conversable agents designed to function within multi-agent conversation frameworks. This design is rooted in the understanding that modern LLMs can adapt and integrate feedback seamlessly, particularly those optimised for dialogue (e.g., GPT-4). AutoGen leverages this capability by allowing agents to interact conversationally—exchanging observations, critiques, and validations, either autonomously or with human oversight.
The versatility of AutoGen agents stems from their ability to incorporate various roles and behaviours tailored to the developer’s needs. For instance, these agents can be programmed to write or execute code, integrate human feedback, or validate outcomes. This flexibility is supported by a modular structure that developers can easily configure. Each agent’s backend is extendable, allowing further customisation and enhancing its functionality beyond default settings. The agents’ conversable nature enables them to hold sustained multi-turn dialogues and adapt to dynamic interaction patterns, making them suitable for diverse applications from question-answering and decision-making to complex problem-solving tasks.
Conversation Programming
A pivotal innovation within AutoGen is the concept of conversation programming, which revolutionises LLM application development by streamlining the process into multi-agent conversations. This programming paradigm shifts the focus from traditional code-centric workflows to conversation-centric computations, allowing developers to manage complex interactions more intuitively. Conversation programming unfolds in two core steps:
- Defining Conversable Agents: Developers create agents with specific capabilities and roles by configuring built-in features. These agents can be set to operate autonomously, collaborate with other agents, or involve human participation at different points, ensuring a balance between automation and user control.
- Programming Interaction Behaviors: Developers program how these agents interact through conversation-centric logic. This involves using a blend of natural language and code, enabling flexible scripting of conversation patterns. AutoGen facilitates seamless implementation of these interactions with ready-to-use components that can be extended or modified for experimental or tailored applications.
The integration of conversation programming supports the modular combination of different LLM capabilities, enabling the division of complex tasks into manageable subtasks that agents can collaboratively solve. This framework underpins the development of robust and scalable LLM applications across multiple fields, including research, coding, and interactive entertainment.
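The two steps of conversation programming can be sketched without any framework. The example below is a toy under stated assumptions: `ChatAgent` and `converse` are hypothetical names, not AutoGen classes, and the reply functions are plain lambdas standing in for LLM-backed behaviour.

```python
from typing import Callable, Optional

ReplyFn = Callable[[str], Optional[str]]


class ChatAgent:
    """Toy conversable agent (hypothetical; not AutoGen's class)."""

    def __init__(self, name: str, reply_fn: ReplyFn):
        self.name = name
        self.reply_fn = reply_fn

    def generate_reply(self, message: str) -> Optional[str]:
        return self.reply_fn(message)


def converse(a: ChatAgent, b: ChatAgent, opening: str, max_turns: int = 3):
    """Step 2: the interaction pattern, here a simple alternating loop."""
    transcript = [(a.name, opening)]
    speaker, listener, message = a, b, opening
    for _ in range(max_turns):
        reply = listener.generate_reply(message)
        if reply is None:  # a reply of None ends the conversation
            break
        transcript.append((listener.name, reply))
        speaker, listener, message = listener, speaker, reply
    return transcript


# Step 1: define agents by configuring their reply behaviour.
writer = ChatAgent("writer", lambda msg: f"Draft for: {msg}")
critic = ChatAgent("critic", lambda msg: None if "v2" in msg else "Please revise to v2")

transcript = converse(writer, critic, "Outline a blog post")
for sender, text in transcript:
    print(f"{sender}: {text}")
```

Keeping the agent definitions separate from the loop that drives them is the point of the paradigm: the same agents can be reused under a different interaction pattern by swapping out `converse`.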
How to Use AutoGen to Program a Multi-agent Conversation?
There are three main sections: AutoGen Agents, Developer Code, and Program Execution, illustrating how to use AutoGen to program a multi-agent conversation. Here is a detailed breakdown:
1. AutoGen Agents
- ConversableAgent: This is the base class from which the other agent types derive. The diagram highlights several agent types:
- AssistantAgent: Configurable with options such as human_input_mode set to “NEVER” and code_execution_config set to False. This means the agent is fully autonomous and does not rely on human input during its operation.
- UserProxyAgent: Set with human_input_mode as “ALWAYS,” indicating that it is user-controlled and will always require human input to respond.
- GroupChatManager: Manages interactions between multiple agents in a group conversation.
- Unified Conversation Interfaces: All agents share interfaces for sending, receiving, and generating replies.
2. Developer Code
This section demonstrates the steps to set up and customize the interaction between agents.
- Define Agents:
- Two agents, User Proxy A and Assistant B, are defined. They can communicate with each other, forming the basis of a multi-agent conversation.
- Register a Custom Reply Function:
- A custom reply function (reply_func_A2B) is registered for one agent (Agent B). This function outlines how Agent B generates replies when invoked.
The function includes a simple logic structure:
def reply_func_A2B(msg):
    output = input_from_human()
    if not output:
        if msg includes code:
            output = execute(msg)
    return output
- This function allows Agent B to either get input from a human or execute code if the input message includes executable commands.
- Initiate Conversations: A sample initiation line is shown:
initiate_chat("Plot a chart of META and TESLA stock price change YTD.")
This line sets Agent A to initiate a conversation with Agent B, asking it to plot a chart based on the given command.
3. Program Execution
This section details how the conversation proceeds after initialisation.
- Conversation-Driven Control Flow:
- The interaction starts with Agent A sending a request to Agent B.
- Agent B then receives the request and invokes the generate_reply function, which may trigger code execution if required.
- Conversation-Centric Computation:
- The flow shows how messages are passed between generate_reply and the agents:
- For example, after attempting to execute the command, an error message is sent back if a required package is missing (e.g., Error: package yfinance is not installed).
- The reply then informs the user to install the missing package (“Sorry! Please first pip install yfinance and then execute”).
In a nutshell, it visualises how to program a conversation-driven interaction between agents using AutoGen. The process involves defining agents, customising their behaviours through reply functions, and handling conversation control flow, including executing code and responding to user requests.
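The reply function from the diagram is pseudocode, so here is one runnable reading of it. Everything beyond the name `reply_func_A2B` is an assumption for illustration: human input is injected as a callable, "msg includes code" is interpreted as "msg contains a fenced Python block", and `execute` simply runs that block and captures its printed output. This is not how AutoGen executes code internally.

```python
import contextlib
import io
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def execute(code: str) -> str:
    """Run code and return its stdout, or an error message if it fails."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})  # illustration only: never exec untrusted code
    except Exception as exc:
        return f"Error: {exc}"
    return buffer.getvalue().strip()


def reply_func_A2B(msg: str, input_from_human=lambda: "") -> str:
    output = input_from_human()  # human input takes priority
    if not output:
        match = CODE_BLOCK.search(msg)  # "msg includes code"
        if match:
            output = execute(match.group(1))
    return output


message = f"Please run this:\n{FENCE}python\nprint(1 + 1)\n{FENCE}"
auto_reply = reply_func_A2B(message)
human_reply = reply_func_A2B(message, input_from_human=lambda: "Skip it")
missing_pkg = reply_func_A2B(f"{FENCE}python\nimport not_a_real_package_xyz\n{FENCE}")
print(auto_reply, human_reply, missing_pkg, sep="\n")
```

The third call mirrors the yfinance example: the import fails, the error text becomes the reply, and the other agent can react to it in the next turn.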
Together, these three sections (AutoGen Agents, Developer Code, and Program Execution) guide a developer through setting up an automated multi-agent interaction, from defining and customising agents to observing the control flow of conversation and execution.
Hands-on Agentic AI Multi-Agent Pattern
Here we will talk about Agentic AI multi-agent conversation (this is inspired by DeepLearning.AI: https://www.deeplearning.ai/). I am using AutoGen, which has a built-in agent class called ConversableAgent.
Let’s begin with the Setup.
!pip install openai
# python==3.10.13
!pip install pyautogen==0.2.25
import os
# Replace the placeholder with your own key
os.environ['OPENAI_API_KEY'] = 'Your_API_Key'
llm_config = {"model": "gpt-4o"}
The configuration specifies the model to be used (gpt-4o).
Define an AutoGen agent
The ConversableAgent class creates a chatbot agent. Setting human_input_mode="NEVER" indicates that the agent won’t request manual user input during conversations.
from autogen import ConversableAgent
agent = ConversableAgent(
    name="chatbot",
    llm_config=llm_config,
    human_input_mode="NEVER",
)
# messages must be a list of message dicts
reply = agent.generate_reply(
    messages=[{"content": "You are a renowned AI expert. Now tell me 2 jokes about AI.", "role": "user"}]
)
print(reply)
# Each generate_reply call sees only the messages passed to it
reply = agent.generate_reply(
    messages=[{"content": "Repeat the joke.", "role": "user"}]
)
print(reply)
Output
Certainly! Could you please tell me which joke you'd like me to repeat?
Setting up the Conversation
Next, we set up a conversation between two agents, Sunil and Harshit, in which the memory of their interactions is retained.
Harshit and Sunil are AI-driven agents designed for engaging, humorous dialogues focused on social media reports. Harshit, a social media expert and office comedian, uses light, humour-filled language to keep conversations lively. Sunil, as head of the content department and Harshit’s senior, shares this comedic trait, adding structured humour by starting each joke from the previous punchline. Both agents use pre-configured LLM settings and operate autonomously (human_input_mode="NEVER"). This dynamic simulates workplace banter, blending professional discussions with entertainment, and is ideal for training, team simulations, or content generation. The continuous, comedic flow mimics real office interactions, enhancing engagement and relatability.
A ConversableAgent is typically an artificial intelligence agent capable of engaging in conversations based on predefined system messages and configurations. These agents use natural language processing (NLP) capabilities provided by large language models (LLMs) to respond intelligently according to their system message instructions.
# Adjacent string literals are concatenated, so each fragment ends with a space
Harshit = ConversableAgent(
    name="Harshit",
    system_message=
    "Your name is Harshit and you are a social media expert and do stand-up comedy in office. "
    "Also this is an office comedy. "
    "This conversation is about social media reports. "
    "Keep the language light and humour high.",
    llm_config=llm_config,
    human_input_mode="NEVER",
)
Sunil = ConversableAgent(
    name="Sunil",
    system_message=
    "Your name is Sunil and you are head of content department in Analytics Vidhya, Harshit is your junior and you also do stand-up comedy in office. "
    "Start the next joke from the punchline of the previous joke. "
    "Also this is an office comedy and Harshit is Sunil's junior. "
    "This must be funny and not so lengthy. "
    "This conversation is about social media reports.",
    llm_config=llm_config,
    human_input_mode="NEVER",
)
Two agents, Harshit and Sunil, are defined by their unique attributes, personalities, and backgrounds. Based on their roles, they are instructed to have humorous interactions.
chat_result = Sunil.initiate_chat(
    recipient=Harshit,
    message="I'm Sunil. Harshit, let's keep the jokes rolling.",
    max_turns=3,
)
Sunil starts a conversation with Harshit with an initial message and a limit of 3 conversational turns.
import pprint
pprint.pprint(chat_result.chat_history)
Output
({'content': "I'm Sunil. Harshit, let's keep the jokes rolling.",'role': 'assistant'},
{'content': "Sure, Sunil! Let's talk about social media reports—basically "
'where numbers and hashtags collide in a dance-off. You know, '
'those analytics graphs are like the weather in North India; they '
'change every five minutes, and somehow they always predict doom. '
"But don't worry, you're not going to need an umbrella, just a "
'strong stomach!',
'role': 'user'},
{'content': "That's true, Harshit! Those graphs change more often than I "
'change my favorite Mughal Darbar biryani place. Speaking of '
'change, did you hear why the social media influencer went broke? '
"Because they took too many selfies and couldn't afford to pay "
'attention! But honestly, our reports are a bit like that '
'influencer—always needing a new filter to look good.',
'role': 'assistant'},
{'content': "Haha, that's spot on, Sunil! Our social media reports have more "
'filters than my "best selfie of 2023" folder—and somehow, they '
'still look like they woke up on the wrong side of the algorithm! '
"It's amazing how on instagram we strive to make our lives look "
'perfect, while in our reports, we strive to make the numbers '
"look believable. It's like magic, but with less prestige and "
'more caffeine!',
'role': 'user'},
{'content': 'Absolutely, Harshit! Our reports are like those reality TV '
'shows—the drama is real, but the numbers, maybe not so much. And '
"trust me, the only time I'll ever willingly pull an all-nighter "
'for a report is if it promises a plot twist, like turning red '
"numbers to black! Speaking of which, why don't our reports ever "
"go on silent mode? They're always sending alerts at odd hours "
"like they're auditioning for a horror movie!",
'role': 'assistant'},
{'content': 'Haha, Sunil, I completely agree! Our reports could definitely '
'headline a suspense thriller: "The Metrics That Never Sleep." '
'Just when you think you can relax, bam! An alert jumps out like '
'a cheap jump scare, reminding you that your engagement rate is '
"working harder than you are! And let's not even get started on "
"the notifications. They're like that one friend who keeps "
'showing up unannounced with extra enthusiasm and zero regard for '
'your personal space—or your night’s sleep!',
'role': 'user'})
For Chat Termination
This code is part of a setup for defining chatbot agents, Harshit and Sunil, who act as stand-up comedians. The goal is to customize their behaviour, specifically how they handle conversation termination. By specifying termination messages, the bots can end their interactions naturally, following predefined cues like “I gotta go.”
This helps in:
- Enhanced User Experience: Users get a more intuitive and human-like interaction, with a clear and relatable way to conclude conversations.
- Maintained Flow and Humor: Since these agents are stand-up comedians, managing their exit lines with playful phrases fits their roles and enhances immersion.
Harshit = ConversableAgent(
    name="Harshit",
    system_message=
    "Your name is Harshit and you are a stand-up comedian. "
    "When you're ready to end the conversation, say 'I gotta go'.",
    llm_config=llm_config,
    human_input_mode="NEVER",
    # End the chat when an incoming message contains the cue phrase
    is_termination_msg=lambda msg: "I gotta go" in msg["content"],
)
Sunil = ConversableAgent(
    name="Sunil",
    system_message=
    "Your name is Sunil and you are a stand-up comedian. "
    "When you're ready to end the conversation, say 'I gotta go'.",
    llm_config=llm_config,
    human_input_mode="NEVER",
    is_termination_msg=lambda msg: "I gotta go" in msg["content"] or "Goodbye" in msg["content"],
)
chat_result = Sunil.initiate_chat(
    recipient=Harshit,
    message="I'm Sunil. Harshit, let's keep the jokes rolling."
)
Output
({'content': "I'm Sunil. Harshit, let's keep the jokes rolling.",'role': 'assistant'},
{'content': "Hey, Sunil! Great to have you here. Alright, let's get this joke "
"train on track. Why don't scientists trust atoms? Because they "
"make up everything! Keep ‘em coming! What's on your mind?",
'role': 'user'},
{'content': 'Hey, great to be here! That joke really has some chemistry, '
"doesn't it? Speaking of science, did you hear about the "
"mathematician who's afraid of negative numbers? He'll stop at "
"nothing to avoid them! So, what's new with you?",
'role': 'assistant'},
{'content': "Nice! That's a mathematically perfect joke! As for me, I've been "
'working on my coffee habit—which is just code for my endless '
'pursuit of the perfect punchline. You know, caffeine might not '
"solve any of my problems, but it's worth a shot! What's new in "
'your world, Sunil?',
'role': 'user'},
{'content': "Sounds like you're brewing up some comedy gold there! As for me, "
"I've been trying to get in shape, but it's tough. My idea of "
'exercise is a cross between a lunge and a crunch—I call it '
'lunch! Any big plans for the day?',
'role': 'assistant'},
{'content': "Haha, lunch is the most rewarding workout! As for me, I'm hoping "
'to finalize my plans for a "Netflix Marathon," making sure the '
"couch doesn't run away without me. And maybe come up with a few "
"jokes that'll make even my socks roll down with laughter. How "
'about you? Any other adventures, or is lunch the pinnacle of '
"today's activities?",
'role': 'user'},
{'content': 'A "Netflix Marathon" sounds like my kind of event! Just remember '
'to stretch between episodes—don’t want to pull a lazy muscle. As '
'for me, I’m on a quest to find the perfect punchline myself. You '
'know, one of those rare jokes that leave the audience breathless '
'and begging for more… kind of like my cooking! Anyway, I gotta '
'go, but this was a blast. Keep those socks in check!',
'role': 'assistant'},
{'content': "What's last joke we talked about?", 'role': 'user'},
{'content': 'We last talked about the idea of cooking that leaves people '
"breathless—not because it's amazing, but because it might just "
"be that bad! It's kind of like when you open the oven and "
'everyone nearby takes a big step back. Thanks for the laughs, '
'and keep that comedy coming!',
'role': 'assistant'},
{'content': 'Haha, sounds like your cooking and the fire alarm could be best '
"friends! Thanks for the laughs too, Sunil. It's been a real "
'treat chatting with you. Take care, and I hope your search for '
'that perfect punchline (and maybe recipe) goes well. I gotta go, '
"but let's catch up again soon!",
'role': 'user'})
Output Analysis
- The conversation between Sunil and Harshit displays a lighthearted and humorous exchange, maintaining their defined personas (e.g., social media expertise and office comedy).
- The chat history records messages back and forth between the agents, showcasing how they build on each other’s content, respond to prompts, and maintain a coherent flow.
Key Points
- Agent Customization: Each agent has a defined name, role, and system messages, enabling tailored interactions.
- Joke Chaining: Sunil’s system message ensures each joke builds upon the previous punchline.
- Termination Handling: Both agents can recognise phrases that indicate the end of the conversation.
- Humour and Light Language: The system is designed to create an engaging and witty exchange, emphasising humour and relatability.
This setup can be leveraged to create automated, character-based dialogue simulations suitable for various applications, such as interactive storytelling, chatbots, or training simulations.
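One detail of the termination handling is worth a closer look: the lambdas used above match the cue phrase case-sensitively and assume the message content is always a string. A slightly more defensive predicate could look like the sketch below. The `make_termination_check` helper is a hypothetical name introduced here; only the resulting function's shape (a message dict in, a boolean out) follows the `is_termination_msg` usage shown earlier.

```python
def make_termination_check(*phrases: str):
    """Build an is_termination_msg-style predicate (helper name is hypothetical)."""
    lowered = [phrase.lower() for phrase in phrases]

    def is_termination_msg(msg: dict) -> bool:
        # Treat missing or None content as an empty string.
        content = (msg.get("content") or "").lower()
        return any(phrase in content for phrase in lowered)

    return is_termination_msg


check = make_termination_check("I gotta go", "Goodbye")

print(check({"content": "Anyway, I gotta go, but this was a blast.", "role": "assistant"}))
print(check({"content": "Tell me another joke", "role": "user"}))
```

Such a function could be passed wherever the lambdas are used in the agent definitions, assuming the same message-dict format.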
Let’s see how you can build a Multi-Agent System from Scratch.
Agentic AI Multi-Agent Pattern from Scratch
Firstly, kudos to Michaelis Trofficus for making life easier by showing how we can build all the Agentic Design Patterns from scratch. In the above section, I have used the AutoGen framework, but now, let’s see how building this from scratch works.
Note: Michaelis adapted ideas from Airflow’s design approach, using “>>” and “<<” symbols to indicate dependencies between agents. In this simplified micro-CrewAI model, the agents function like Airflow Tasks, and the Crew acts as an Airflow DAG.
Also, he has been working on a minimalist version of CrewAI and has drawn inspiration from two key concepts: Crew and Agent.
By working on a minimalist version, Michaelis is likely aiming to create a simpler, more streamlined take on CrewAI, focusing on essential features and avoiding complex, extraneous elements. This would make the system easier to use and adapt while retaining the core collaboration and task delegation capabilities inspired by the Crew (team coordination) and Agent (individual autonomy) models. Before digging into the hands-on, let’s understand these:
What is a Crew?
Here’s the GitHub Repo: GitHub Crew
The Crew class is designed to represent a group of agents working together within a coordinated environment. It offers a framework to manage and execute agents in a structured way, ensuring that dependencies between them are respected.
1. Core Concept of Crew
- The Crew class acts as a manager for a collection of agents, providing the means to handle them as a cohesive unit within a context.
- The structure ensures that agents are run in an order that respects their dependencies, preventing conflicts and enabling smooth execution.
2. Key Attributes and Methods in Crew Class
- current_crew (Class Attribute): Tracks the currently active Crew instance. This is essential for associating agents with the correct Crew context when they are created or registered.
- __init__ Method: Initializes the Crew instance and creates an empty list, agents, to store agents that belong to the crew.
- Context Manager Methods (__enter__ and __exit__):
- __enter__: When a Crew instance is used in a with statement, this method sets it as the active crew.
- __exit__: Clears the active crew context when exiting the with block.
- add_agent Method: Adds a new agent to the agents list.
- register_agent (Static Method): Associates an agent with the active Crew by adding it to the agents list if current_crew is not None.
- topological_sort Method:
- Purpose: Sorts the agents in topological order based on their dependencies, so that each agent runs only after the agents it depends on, and detects circular references.
- Process:
- Uses an in-degree dictionary to track dependencies for each agent.
- Adds agents with no dependencies to a queue and processes them to build a sorted list.
- Raises an error if a circular dependency is detected (when the length of the sorted list doesn’t match the total number of agents).
- plot Method: Visualizes the agents and their dependencies as a Directed Acyclic Graph (DAG) using Graphviz.
- run Method:
- Functionality: Runs all agents in the order determined by topological_sort.
- Execution: Calls each agent’s run method and uses fancy_print for better output formatting.
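The registration mechanism described above (a class-level current_crew set by the context manager, and agents registering themselves on creation) can be sketched as follows. `MiniCrew` and `RegisteredAgent` are simplified stand-ins written for this article, not the classes from the repo, which carry more state and behaviour.

```python
class MiniCrew:
    """Simplified stand-in for the Crew class described above."""
    current_crew = None  # class attribute: the crew active inside a `with` block

    def __init__(self):
        self.agents = []

    def __enter__(self):
        MiniCrew.current_crew = self
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        MiniCrew.current_crew = None

    def add_agent(self, agent):
        self.agents.append(agent)

    @staticmethod
    def register_agent(agent):
        # Only register when some crew is currently active.
        if MiniCrew.current_crew is not None:
            MiniCrew.current_crew.add_agent(agent)


class RegisteredAgent:
    def __init__(self, name):
        self.name = name
        MiniCrew.register_agent(self)  # self-register with the active crew


with MiniCrew() as crew:
    RegisteredAgent("poet")
    RegisteredAgent("translator")

orphan = RegisteredAgent("orphan")  # created outside any crew, so not registered
names = [agent.name for agent in crew.agents]
print(names)
```

This is why agents defined inside the with block need no explicit reference to the crew: creation alone is enough to attach them.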
3. How It Works
- Context Management: The Crew class uses context management (__enter__ and __exit__) to create a scope where all agents are associated with a specific crew. This makes it easier to manage the lifecycle and interactions of agents within a defined context.
- Topological Sorting: The topological sort ensures that agents are executed in a sequence where dependencies are resolved. This is critical in scenarios where agents rely on each other’s outputs or states.
- Graph Visualization: The plot method provides a clear, visual representation of the dependency structure, aiding in understanding the execution flow.
The Crew class is a comprehensive solution for managing interdependent agents, providing context management and dependency resolution through topological sorting, visualization, and an execution mechanism—all essential for workflows that involve coordinated agent-based operations.
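The sorting process described for topological_sort (in-degree dictionary, queue of dependency-free agents, length check for cycles) matches Kahn's algorithm. Below is a minimal sketch of that algorithm over plain strings. The function signature and the dictionary-based graph are assumptions for illustration; the repo's method works on agent objects instead.

```python
from collections import deque


def topological_sort(dependents: dict[str, list[str]]) -> list[str]:
    """Kahn's algorithm. dependents[a] lists the nodes that must run after a.

    Every node must appear as a key, even if it has no dependents.
    """
    in_degree = {node: 0 for node in dependents}
    for targets in dependents.values():
        for target in targets:
            in_degree[target] += 1

    # Start with nodes that depend on nothing.
    queue = deque(node for node, degree in in_degree.items() if degree == 0)
    ordered = []
    while queue:
        node = queue.popleft()
        ordered.append(node)
        for target in dependents[node]:
            in_degree[target] -= 1
            if in_degree[target] == 0:
                queue.append(target)

    # Nodes in a cycle never reach in-degree 0, so they are missing here.
    if len(ordered) != len(dependents):
        raise ValueError("Circular dependency detected")
    return ordered


order = topological_sort({"poet": ["translator"], "translator": ["writer"], "writer": []})
print(order)
```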
What is an Agent?
Here’s the GitHub Repo: GitHub Agent
An Agent in the context of this code is an abstraction representing an AI unit capable of collaborating within a multi-agent system to complete tasks. The design incorporates features for inter-agent dependency management, task execution, and context sharing among agents. The key components of the Agent class are:
- Attributes:
- Name, backstory, task description, and task expected output: These define the identity and the specific task details of the agent.
- ReactAgent: A built-in instance used to generate responses, indicating that Agent builds on the ReAct (Reasoning and Acting) pattern.
- Dependencies and Dependents: Lists tracking the other agents that the current agent depends on and the agents that depend on it.
- Context: A string attribute accumulating context or results shared from other agents, used to influence its output.
- Initialization (__init__ method):
- Sets up the agent’s core attributes and registers the agent to a session (termed as “Crew” in this context) if one exists.
- Associates the agent with tools and a specific language model (defaulting to “llama-3.1-70b-versatile”).
- Dependency Management:
- The agent uses custom operators (>> and <<) to visually express and establish dependencies between agents, inspired by data pipeline frameworks like Apache Airflow.
- add_dependency and add_dependent methods handle the management of agent relationships programmatically.
- Functionality:
- receive_context: Receives output from the agents this agent depends on and adds it to the context, which enriches the agent’s task execution.
- create_prompt: Constructs a comprehensive prompt based on the agent’s task, context, and expected output to guide the response generation.
- run: Executes the task by using the prompt generated, runs the ReactAgent, and then propagates the result to all dependents.
- Collaborative Mechanism:
- Agents form a multi-agent system capable of working collaboratively, sharing context and outputs, where each agent can trigger subsequent agents based on dependencies.
- The Crew abstraction acts as a coordination system to register and manage these agents, forming a network of task-oriented entities.
Overall, an Agent is essentially a modular, self-sufficient AI unit that can coordinate and communicate with others to solve complex tasks collaboratively. It acts as a node in a broader AI-driven workflow, capable of handling tasks autonomously and contributing to the collective output of the multi-agent system.
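To show how the >> and << operators and the context propagation fit together, here is a small LLM-free sketch. `PipelineAgent` is a hypothetical class written for this article: its task is a plain function standing in for the ReactAgent call, and the method bodies are one plausible reading of the description above rather than the repo's code.

```python
class PipelineAgent:
    """Hypothetical, LLM-free stand-in for the Agent class described above."""

    def __init__(self, name, task):
        self.name = name
        self.task = task  # plain function standing in for the LLM call
        self.dependencies = []  # agents that must run before this one
        self.dependents = []  # agents that run after this one
        self.context = ""

    def __rshift__(self, other):  # self >> other: other depends on self
        self.dependents.append(other)
        other.dependencies.append(self)
        return other  # returning `other` allows chaining a >> b >> c

    def __lshift__(self, other):  # self << other: self depends on other
        other >> self
        return other

    def receive_context(self, data):
        self.context += data

    def run(self):
        output = self.task(self.context)
        for dependent in self.dependents:  # pass the result downstream
            dependent.receive_context(output)
        return output


poet = PipelineAgent("poet", lambda ctx: "roses are red")
translator = PipelineAgent("translator", lambda ctx: ctx.upper())
writer = PipelineAgent("writer", lambda ctx: f"saved: {ctx}")

poet >> translator >> writer
results = [agent.run() for agent in (poet, translator, writer)]
print(results)
```

Here the agents are run by hand in dependency order; in the framework that ordering is the job of the Crew's topological sort.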
What is a Tool?
Here’s the GitHub Repo: GitHub Tool
A Tool is a class that serves as a wrapper for a function, capturing details about the function’s signature and providing a way to execute the function. Essentially, a Tool object makes it possible to manage functions more uniformly, including validating input arguments and presenting metadata about the function.
How It Works
- Creating a Tool: You can use the @tool decorator to wrap any function. This will create an instance of Tool that contains metadata about the function and provides methods for running it.
- Executing a Tool: The run method on a Tool object allows the wrapped function to be executed with keyword arguments.
- Input Validation: The validate_arguments function helps ensure that inputs are of the correct type, making the execution of the Tool more robust and predictable.
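A stripped-down version of this idea can be written with the standard inspect module. The sketch below is an assumption-laden illustration: the attribute names, the coercion rule in validate_arguments (written here as a method for brevity), and the `add` example function are all made up for this article, and the repo's Tool class may store and validate things differently.

```python
import inspect


class Tool:
    """Minimal function wrapper; the real class in the repo may differ."""

    def __init__(self, fn):
        self.fn = fn
        self.name = fn.__name__
        self.description = inspect.getdoc(fn) or ""
        # Map each parameter name to its annotated type, if it has one.
        self.params = {
            name: param.annotation
            for name, param in inspect.signature(fn).parameters.items()
            if param.annotation is not inspect.Parameter.empty
        }

    def validate_arguments(self, kwargs):
        # Coerce values to the annotated type, e.g. "3" -> 3 for an int parameter.
        return {
            key: (
                value
                if key not in self.params or isinstance(value, self.params[key])
                else self.params[key](value)
            )
            for key, value in kwargs.items()
        }

    def run(self, **kwargs):
        return self.fn(**self.validate_arguments(kwargs))


def tool(fn):
    """Decorator that turns a function into a Tool instance."""
    return Tool(fn)


@tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


result = add.run(a="3", b=4)
print(add.name, add.description, result)
```

Because an LLM typically produces tool arguments as text, coercing them to the annotated types before the call is what makes the wrapped function safe to invoke.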
Let’s get started!
The author implemented the Agent class. Imagine you are developing an Agentic AI multi-agent framework, so it makes sense to encapsulate the agent functionality within a dedicated class. To achieve this, you can simply import the Agent class from the multi-agent pattern module and leverage it to build the agents effectively. Let’s walk through the implementation to illustrate this process in detail.
Here’s the Agent.py file.
Implementation
Refer to this Repo for full code: multiagent_pattern
from agentic_patterns.multiagent_pattern.agent import Agent
from agentic_patterns.tool_pattern.tool import tool
from agentic_patterns.multiagent_pattern.crew import Crew
- Agent: Class used to create instances of agents that have specific roles and tasks.
- tool: Decorator to expose functions as tools that agents can use.
- Crew: Manages multiple agents and controls the order in which they execute their tasks.
# Define a function as a tool that agents can use
@tool
def write_str_to_txt(string_data: str, txt_filename: str):
    """
    Writes a string to a txt file.
    This function takes a string and writes it to a text file. If the file already exists,
    it will be overwritten with the new data.
    Args:
        string_data (str): The string containing the data to be written to the file.
        txt_filename (str): Name of the text file to which the data should be written.
    """
    # Write the string data to the file
    with open(txt_filename, mode="w", encoding='utf-8') as file:
        file.write(string_data)
    print(f'Data successfully written to {txt_filename}')
- @tool Decorator: Marks write_str_to_txt as a tool that can be used by agents.
- Function Purpose: Takes a string and writes it to a specified text file. If the file exists, it will be overwritten.
- Arguments:
  - string_data: The content to write to the file.
  - txt_filename: The name of the output file.
# Create a crew of agents to execute a sequence of tasks
with Crew() as crew:
    # Define the first agent: a poet who writes poems
    agent_1 = Agent(
        name="Poet Agent",
        backstory="You are a well-known poet, who enjoys creating high quality poetry.",
        task_description="Write a poem about the meaning of life",
        task_expected_output="Just output the poem, without any title or introductory sentences",
    )
    # Define the second agent: a translator for Spanish
    agent_2 = Agent(
        name="Poem Translator Agent",
        backstory="You are an expert translator especially skilled in Spanish",
        task_description="Translate a poem into Spanish",
        task_expected_output="Just output the translated poem and nothing else",
    )
    # Define the third agent: a writer that saves content to a text file
    agent_3 = Agent(
        name="Writer Agent",
        backstory="You are an expert transcriber, that loves writing poems into txt files",
        task_description="You'll receive a Spanish poem in your context. You need to write the poem into './poem.txt' file.",
        task_expected_output="A txt file containing the Spanish poem received from the context",
        tools=write_str_to_txt,  # Allows this agent to use the tool defined earlier
    )
    # Define the workflow order for agents
    agent_1 >> agent_2 >> agent_3
# Run the crew of agents, executing their tasks in the specified order
crew.run()
- with Crew() as crew: Initiates a context for defining and running the agents.
- agent_1:
  - Name: “Poet Agent”
  - Backstory: Positions it as a skilled poet.
  - Task Description: Writes a poem focused on the meaning of life.
  - Expected Output: Outputs just the poem, with no additional text.
- agent_2:
  - Name: “Poem Translator Agent”
  - Backstory: Establishes it as a Spanish language expert.
  - Task Description: Translates a poem into Spanish.
  - Expected Output: Only the translated poem.
- agent_3:
  - Name: “Writer Agent”
  - Backstory: Describes it as a transcription specialist.
  - Task Description: Writes the Spanish poem to a text file named ./poem.txt.
  - Tools: Has access to the write_str_to_txt tool for saving the poem.
- Workflow (agent_1 >> agent_2 >> agent_3): Establishes the order in which the agents complete their tasks: first, the poem is created by agent_1, then translated by agent_2, and finally saved to a file by agent_3.
- crew.run(): Triggers the execution of tasks in the specified sequence.
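If you are wondering how the agent_1 >> agent_2 >> agent_3 syntax can define an execution order, one plausible mechanism is Python's __rshift__ operator overload combined with a topological sort. The sketch below illustrates that idea only. SketchAgent and execution_order are hypothetical names, no LLM is called, and the repository's real Agent and Crew classes may be implemented differently.

```python
from collections import deque


class SketchAgent:
    """Hypothetical agent that only records its dependencies (no LLM calls)."""

    def __init__(self, name: str):
        self.name = name
        self.dependencies = []  # agents that must run before this one
        self.dependents = []    # agents that run after this one

    def __rshift__(self, other: "SketchAgent") -> "SketchAgent":
        # `a >> b` means "b depends on a"; returning `other` allows chaining.
        other.dependencies.append(self)
        self.dependents.append(other)
        return other


def execution_order(agents):
    """Return agent names in dependency order using Kahn's topological sort."""
    in_degree = {agent.name: len(agent.dependencies) for agent in agents}
    queue = deque(agent for agent in agents if in_degree[agent.name] == 0)
    order = []
    while queue:
        current = queue.popleft()
        order.append(current.name)
        for dependent in current.dependents:
            in_degree[dependent.name] -= 1
            if in_degree[dependent.name] == 0:
                queue.append(dependent)
    if len(order) != len(agents):
        raise ValueError("Circular dependency between agents")
    return order


poet = SketchAgent("Poet Agent")
translator = SketchAgent("Poem Translator Agent")
writer = SketchAgent("Writer Agent")

# Same chaining style as in the crew example above.
poet >> translator >> writer

# The agents are passed in shuffled order on purpose; the sort restores the workflow order.
order = execution_order([writer, poet, translator])
print(order)
```

Because each >> returns its right-hand operand, the chain is evaluated left to right as two separate dependency links, which is what lets a single line describe the whole pipeline.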
Here’s the crew Plot:
For full code: notebooks/multiagent_pattern.ipynb
MetaGPT is a framework for multi-agent collaboration using large language models (LLMs), designed to replicate human-like workflows through Standardized Operating Procedures (SOPs). This approach enhances problem-solving by structuring LLM interactions to reduce logic inconsistencies and hallucinations. MetaGPT breaks down complex tasks, assigns specialized roles, and ensures quality through defined outputs. Its authors report that it outperforms systems such as AutoGPT and LangChain on code generation benchmarks, presenting it as a robust and efficient meta-programming solution for software engineering.
Structured Methodologies and SOP-Driven Workflows
MetaGPT represents a breakthrough in meta-programming by incorporating structured methodologies that mimic standard operating procedures (SOPs). This innovative framework, built on GPT models, requires agents to produce detailed and structured outputs such as requirement documents, design artifacts, and technical specifications. These outputs ensure clarity in communication and minimize errors during collaboration, effectively enhancing the accuracy and consistency of generated code. The SOP-driven workflow in MetaGPT organizes agents to function cohesively, akin to a streamlined team in a software development firm, where strict standards govern handovers and reduce unnecessary exchanges between agents.
Role Differentiation and Task Management
By defining specialized roles such as Product Manager, Architect, Engineer, Project Manager, and QA Engineer, MetaGPT orchestrates complex tasks into manageable, specific actions. This role differentiation facilitates the efficient execution of projects, with each agent contributing its expertise and maintaining structured communication. Integrating these practices enables a more seamless and effective collaboration process, limiting issues like redundant messaging or miscommunications that could hinder progress.
Communication Protocol and Feedback System
MetaGPT also stands out with a communication protocol that allows agents to exchange targeted information and access shared resources through structured interfaces and publish-subscribe mechanisms. Another distinctive feature is the executable feedback system, which does not just review the generated code but actually runs it and refines it based on the result, significantly improving the quality and reliability of the generated outputs.
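The publish-subscribe idea can be pictured with a small sketch. This is not MetaGPT's API: Message, MessagePool, and the topic names "prd" and "system_design" are hypothetical names chosen for illustration. The point is only that an agent publishes a document once, and only the agents subscribed to that kind of document receive it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    topic: str    # the kind of document being shared
    content: str


class MessagePool:
    """Hypothetical shared pool: agents publish once, subscribers receive by topic."""

    def __init__(self):
        self.subscriptions = defaultdict(set)  # topic -> agent names
        self.inboxes = defaultdict(list)       # agent name -> messages

    def subscribe(self, agent_name: str, topic: str) -> None:
        self.subscriptions[topic].add(agent_name)

    def publish(self, message: Message) -> None:
        # Deliver only to agents that subscribed to this topic.
        for agent_name in self.subscriptions[message.topic]:
            self.inboxes[agent_name].append(message)


pool = MessagePool()
pool.subscribe("Architect", "prd")
pool.subscribe("Engineer", "system_design")

pool.publish(Message("Product Manager", "prd", "Goals and user stories for a 2048 game"))
pool.publish(Message("Architect", "system_design", "File list and call flow"))

architect_inbox = [m.content for m in pool.inboxes["Architect"]]
engineer_inbox = [m.content for m in pool.inboxes["Engineer"]]
print(architect_inbox, engineer_inbox)
```

Compared with every agent messaging every other agent, this keeps each agent's context limited to the documents it actually needs.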
Application of Human-Centric Practices
The application of human-centric practices such as SOPs reinforces the robustness of the system, making it a powerful tool for constructing LLM-based multi-agent architectures. This pioneering use of meta-programming within a collaborative framework paves the way for more regulated and human-like interactions among artificial agents, positioning MetaGPT as a forward-thinking approach in the field of multi-agent system design.
The provided diagram illustrates how MetaGPT, a GPT-based meta-programming framework, manages the software development process by implementing Standard Operating Procedures (SOPs). Here’s a breakdown of the diagram:
- Human Input: The process begins with a user providing a project requirement, in this case, the creation of a 2048 sliding tile number puzzle game.
- Product Manager (PM):
  - The Product Manager conducts a thorough analysis and formulates a detailed Product Requirement Document (PRD).
  - The PRD includes Product Goals, User Stories, a Competitive Analysis, and a Requirement Analysis.
  - This analysis breaks down the user requirements into manageable parts and defines the main goals, user needs, and design considerations for the project.
- Architect:
  - The Architect receives the PRD and translates it into a system design.
  - This design includes a program call flow, a file list, and a high-level plan for structuring the software components.
  - The Architect determines how the components will interact and which tools and frameworks (e.g., Pygame for game development with Python) will be used.
- Project Manager (PM):
  - The Project Manager then creates a task list based on the Architect’s system design and distributes the work to the respective agents.
  - This ensures that tasks are clearly defined and aligned with the project requirements.
- Engineer:
  - The Engineer works on implementing the designated code and functionalities based on the detailed plans.
  - The code snippet shown highlights the development of the core game logic, which includes classes and functions necessary for the 2048 game.
- QA Engineer:
  - The QA Engineer reviews and tests the code for quality assurance.
  - This step ensures that the game meets the predefined requirements and maintains high standards of functionality and reliability.
- End Product:
  - The diagram includes a visual representation of the final output, which shows how users interact with the developed game.
The workflow, as depicted, emphasizes the sequential flow of information and tasks from one role to another, demonstrating how MetaGPT uses defined SOPs to streamline the development process. This structured approach minimizes miscommunications and maximizes productivity by enforcing clear roles, responsibilities, and standard communication practices among agents.
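The sequential handover described above can be sketched as a simple assembly line in which each role consumes the previous role's document and produces its own. This is an illustration under assumptions, not MetaGPT code: Artifact, SOP, fake_role_output, and run_sop are hypothetical names, the "Test Report" artifact for the QA Engineer is assumed, and the LLM call is replaced by a placeholder string.

```python
from dataclasses import dataclass


@dataclass
class Artifact:
    """A structured handover document passed from one role to the next."""
    kind: str
    author: str
    body: str


# The SOP expressed as data: (role, kind of artifact that role must hand over).
SOP = [
    ("Product Manager", "PRD"),
    ("Architect", "System Design"),
    ("Project Manager", "Task List"),
    ("Engineer", "Code"),
    ("QA Engineer", "Test Report"),  # assumed artifact name
]


def fake_role_output(role: str, kind: str, upstream: Artifact) -> str:
    """Placeholder for an LLM call; it only records what the role consumed."""
    return f"{kind} written by {role} from the {upstream.kind}"


def run_sop(requirement: str) -> list:
    """Pass one artifact down the line, each role consuming the previous one."""
    current = Artifact("Requirement", "Human", requirement)
    produced = []
    for role, kind in SOP:
        current = Artifact(kind, role, fake_role_output(role, kind, current))
        produced.append(current)
    return produced


artifacts = run_sop("Build a 2048 sliding tile puzzle game")
for artifact in artifacts:
    print(f"{artifact.author:16} -> {artifact.kind}: {artifact.body}")
```

Keeping the procedure as a plain list makes the order of handovers explicit and easy to change, which is the main appeal of an SOP-driven design.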
Multi-agent systems based on large language models (LLMs) face significant challenges when handling complex tasks. While they can perform simple dialogue tasks effectively, issues arise with more complicated scenarios due to inherent limitations in logical consistency. These issues are often exacerbated by cascading hallucinations, where errors compound as LLMs are naively chained together, resulting in flawed or incorrect outcomes.
MetaGPT Addresses these Challenges through Several Key Innovations
- Meta-Programming Framework: MetaGPT offers a unique meta-programming approach that integrates structured human-like workflows into multi-agent interactions. This structured framework ensures that agents adhere to systematic methods akin to those humans use when solving complex problems.
- Standardized Operating Procedures (SOPs): By encoding SOPs into the prompt sequences, MetaGPT aligns the workflows of multi-agent systems with well-defined procedures. This results in smoother collaboration among agents and minimizes logical inconsistencies, as these SOPs guide agents through a structured process.
- Error Reduction through Verification: Agents within the MetaGPT framework are designed to emulate human-like domain expertise, enabling them to verify intermediate results and check the correctness of their outputs. This verification step is crucial for reducing errors that can arise from typical LLM-based system failures.
- Assembly Line Paradigm: MetaGPT introduces an assembly line-like approach to task management, where various agents are assigned specific roles. This structured distribution of roles ensures that complex tasks are broken down into manageable subtasks, facilitating coordinated efforts among multiple agents and improving overall task execution.
- Enhanced Performance on Benchmarks: In tests involving collaborative software engineering benchmarks, MetaGPT has shown the ability to produce more coherent and reliable outputs compared to traditional chat-based multi-agent systems. This demonstrates the effectiveness of its assembly line structure and role-specific task division in achieving better task outcomes.
Multi-agent systems benefit from a framework like MetaGPT to manage the intricacies of complex tasks through structured, human-like workflows that reduce errors and logical inconsistencies. By employing SOPs, role assignments, and intermediate result verification, MetaGPT helps agents work collaboratively and efficiently, leading to better performance and more coherent task completion.
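To see why verifying intermediate results matters, consider a toy version of an execute-and-refine loop. It is a sketch under loud assumptions: execute, fake_refine, and feedback_loop are hypothetical names, the "refine" step is a hard-coded string replacement standing in for an LLM repair call, and a real system would run untrusted code in a sandbox rather than with exec.

```python
import traceback


def execute(source: str):
    """Run the candidate code and return (succeeded, error_text)."""
    try:
        # Illustration only: real systems should sandbox generated code.
        exec(compile(source, "<candidate>", "exec"), {})
        return True, ""
    except Exception:
        return False, traceback.format_exc(limit=1)


def fake_refine(source: str, error: str) -> str:
    """Placeholder for an LLM repair step: patches one known bug."""
    if "ZeroDivisionError" in error:
        return source.replace("total / 0", "total / len(values)")
    return source


def feedback_loop(source: str, max_rounds: int = 3):
    """Execute the code and, on failure, feed the error back into the refine step."""
    for attempt in range(1, max_rounds + 1):
        succeeded, error = execute(source)
        if succeeded:
            return source, attempt
        source = fake_refine(source, error)
    raise RuntimeError("Code still failing after max_rounds")


buggy = "values = [2, 4, 6]\ntotal = sum(values)\nmean = total / 0\nassert mean == 4\n"
fixed, attempts = feedback_loop(buggy)
print(f"Passed on attempt {attempts}")
```

The error text from the failed run is the feedback signal: instead of passing a broken result to the next agent, the loop stops the error at the step where it occurred.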
<h2 class="wp-block-heading" id="h-what-are-the-benefits-of-agentic-ai-multi-agent-pattern”>What are the Benefits of the Agentic AI Multi-Agent Pattern?
Here are the benefits of the Multi-Agent Pattern:
- Enhanced Performance through Collaboration: Deploying multiple AI agents working together often yields superior results compared to a single agent. Collaborative efforts among agents can lead to improved outcomes, as evidenced by studies demonstrating better performance in multi-agent setups.
- Improved Focus and Comprehension: Large language models (LLMs) capable of processing extensive input may still struggle to understand complex or lengthy information. By assigning specific roles to different agents, each can concentrate on a particular task, enhancing overall comprehension and effectiveness.
- Optimized Subtasks for Efficiency: Breaking down complex projects into smaller, manageable subtasks allows each agent to specialize and optimize its assigned role. This targeted approach ensures that each component of the task is handled with greater precision and efficiency.
- Structured Framework for Complex Tasks: The multi-agent pattern provides a systematic way to decompose intricate tasks, similar to how developers use processes or threads in programming. This structure simplifies the management and execution of complex projects.
- Familiar Management Analogy: Managing AI agents mirrors the way managers oversee teams in organizations. This familiar concept helps developers intuitively assign roles and responsibilities to agents, leveraging existing understanding of team dynamics.
- Flexible and Dynamic Workflows: Each agent operates with its own workflow and memory system, allowing for dynamic interaction and collaboration with other agents. This flexibility enables agents to engage in planning, tool use, and adapt to changing requirements, resulting in efficient and complex workflows.
- Reduced Risk in Experimentation: Mismanaging human teams can have significant consequences, but experimenting with AI agents carries much less risk. This allows for trial and error in optimizing agent roles and interactions without severe repercussions.
- Efficient Resource Utilization: Assigning specific tasks to dedicated agents ensures that computational resources are used effectively. This focused allocation prevents overloading a single agent and promotes balanced workload distribution.
- Scalability and Adaptability: The multi-agent approach allows for easy scaling of tasks by adding or adjusting agents as needed. This adaptability is crucial for handling projects of varying sizes and complexities.
- Enhanced Problem-Solving Capabilities: Collaborative interactions among agents can lead to innovative solutions and improved problem-solving. The combined expertise and perspectives of multiple agents can uncover approaches that a single agent might miss.
- Improved Task Prioritization: By specifying the importance of each agent’s subtask, developers can ensure that critical aspects of a project receive appropriate attention. This prioritisation enhances the quality and relevance of each agent’s outputs.
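The processes-and-threads analogy from the list above can be made literal. In this sketch, independent subtasks are handed to role-specific agents that run in parallel threads. Everything here is hypothetical: make_agent, the role names, and the subtasks are invented for illustration, and each agent simply returns a string where a real one would call an LLM.

```python
from concurrent.futures import ThreadPoolExecutor


def make_agent(role: str):
    """Return a hypothetical agent: a function that handles one subtask."""
    def agent(subtask: str) -> str:
        # A real agent would call an LLM here.
        return f"[{role}] finished: {subtask}"
    return agent


subtasks = {
    "Researcher": "collect background material",
    "Writer": "draft the summary",
    "Reviewer": "prepare a review checklist",
}

# Independent subtasks can run side by side, one worker per agent.
with ThreadPoolExecutor(max_workers=len(subtasks)) as executor:
    futures = {role: executor.submit(make_agent(role), task) for role, task in subtasks.items()}
    results = {role: future.result() for role, future in futures.items()}

for line in results.values():
    print(line)
```

This only works for subtasks that do not depend on each other; dependent steps still need an explicit order between agents.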
The Agentic AI Multi-Agent Pattern offers a robust framework for improving complex task performance, efficiency, and scalability. By emulating familiar management structures and leveraging the strengths of specialised agents, this approach enhances AI systems’ capabilities while minimising risks associated with mismanagement.
Conclusion
The Agentic AI Multi-Agent Pattern serves as an advanced architecture within AI design, embodying a collaborative framework where specialised agents work collectively to complete complex tasks. Building upon foundational patterns such as Reflection, Tool Use, and Planning, the Agentic AI Multi-Agent Pattern divides large projects into manageable subtasks, allowing agents with unique roles to contribute their expertise. This modular approach promotes coordinated problem-solving, autonomy, and scalability, facilitating efficient workflows akin to team dynamics in real-world management.
The Multi-Agent Pattern’s benefits include enhanced focus, optimised task execution, dynamic adaptability, and improved problem-solving capabilities. By emulating human team management and fostering agent autonomy, this pattern paves the way for more sophisticated, reliable, and efficient AI applications across various industries, from software engineering to content creation and beyond.
I hope you found this series on Agentic AI Design Patterns beneficial in learning how agents work. If you have any questions or suggestions, let me know in the comments!
References
- “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models,” Wei et al. (2022)
- “HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face,” Shen et al. (2023)
- “Understanding the planning of LLM agents: A survey,” Huang et al. (2024)
- MichaelisTrofficus: For building the Agentic AI Multi-Agent Pattern from Scratch
Frequently Asked Questions
Ans. The four design patterns are the Reflection Pattern, Tool Use Pattern, Planning Pattern, and Multi-Agent Pattern. Each pattern provides a framework for developing AI systems that can exhibit human-like agentic behaviour.
Ans. The Agentic Multi-Agent Pattern divides complex tasks into subtasks, assigning them to different specialized agents that collaborate. Each agent focuses on a specific role (e.g., coding, project management), promoting efficiency and expertise.
Ans. The benefits include enhanced collaborative problem-solving, focused task execution, scalability, and structured workflows that mimic human team management. This leads to better performance and optimized task completion.
Ans. Frameworks like AutoGen facilitate the creation of multi-agent solutions by enabling customizable, conversation-centric interactions. They allow agents to collaborate, adapt to feedback, and automate complex task execution.
Ans. MetaGPT incorporates structured Standard Operating Procedures (SOPs) to manage complex tasks efficiently. It reduces errors and logical inconsistencies by assigning specific roles and using a verification step, resulting in coherent and reliable outputs.