Introduction
There has been a massive surge in applications using AI coding agents. With the increasing quality of LLMs and the decreasing cost of inference, it is only getting easier to build capable AI agents. On top of this, the tooling ecosystem is evolving rapidly, which makes complex AI coding agents easier to build. The LangChain framework has been a leader on this front, offering the tools and techniques needed to create production-ready AI applications.
But so far, it has lacked one thing: multi-agent collaboration with cyclicity. This is crucial for solving complex problems, where the problem can be divided and delegated to specialized agents. This is where LangGraph comes into the picture, a part of the LangChain framework designed to accommodate multi-actor stateful collaboration among AI coding agents. In this article, we will discuss LangGraph and its basic building blocks while we build an agent with it.
Learning Objectives
- Understand what LangGraph is.
- Explore the basics of LangGraph for building stateful Agents.
- Explore Together AI to access open-access models like DeepSeekCoder.
- Build an AI coding agent using LangGraph to write unit tests.
This article was published as a part of the Data Science Blogathon.
What is LangGraph?
LangGraph is an extension of the LangChain ecosystem. While LangChain allows building AI coding agents that can use multiple tools to execute tasks, it cannot coordinate multiple chains or actors across the steps. This is crucial behavior for creating agents that accomplish complex tasks. LangGraph was conceived with this in mind. It treats agent workflows as a cyclic graph structure, where each node represents a function or a LangChain Runnable object, and edges are the connections between nodes.
LangGraph’s main features include:
- Nodes: Any function or LangChain Runnable object, like a tool.
- Edges: Define the direction of execution between nodes.
- Stateful Graphs: The primary type of graph. It is designed to manage and update state objects as it processes data through its nodes.
LangGraph uses these building blocks to run LLM calls in a cycle while persisting state between steps, which is crucial for agentic behavior. The architecture derives inspiration from Pregel and Apache Beam.
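To make the node, edge, and state vocabulary concrete, here is a toy sketch in plain Python. It does not use LangGraph and is not how LangGraph is implemented; the names `run_graph`, `increment`, and `route` are made up for illustration only.

```python
from typing import Callable, Dict

State = Dict[str, int]

# A node is just a function that takes the state and returns the updated state.
def increment(state: State) -> State:
    return {**state, "count": state["count"] + 1}

# A routing function inspects the state and names the next node to run.
def route(state: State) -> str:
    return "increment" if state["count"] < 3 else "END"

def run_graph(nodes: Dict[str, Callable[[State], State]],
              router: Callable[[State], str],
              entry: str,
              state: State) -> State:
    current = entry
    while current != "END":
        state = nodes[current](state)  # run the current node
        current = router(state)        # follow the edge chosen by the router
    return state

final_state = run_graph({"increment": increment}, route, "increment", {"count": 0})
print(final_state)
```

The router sends execution back to the same node until the state says otherwise, which is the kind of cycle a plain chain cannot express.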
In this article, we will build an agent that writes Pytest unit tests for a Python class with methods. The workflow is: discover the class and its methods, write a test for each method in a loop, and then write the tests to a file.
We will discuss the concepts in detail as we build our AI coding agent for writing simple unit tests. So, let’s get to the coding part.
But before that, let’s set up our development environment.
Install Dependencies
First things first. As with any Python project, create a virtual environment and activate it.
python -m venv auto-unit-tests-writer
cd auto-unit-tests-writer
source bin/activate
Now, install the dependencies. The leading ! is only needed inside a notebook; drop it when running the command in a terminal.
!pip install langgraph langchain langchain_openai colorama
Import all the libraries and their classes.
from typing import TypedDict, List
import colorama
import os
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage
from langchain_core.messages import HumanMessage
from langchain_core.runnables import RunnableConfig
from langgraph.graph import StateGraph, END
from langgraph.pregel import GraphRecursionError
We will also want to create the directories and files for test cases. You can manually create files or use Python for that.
# Define the paths.
search_path = os.path.join(os.getcwd(), "app")
code_file = os.path.join(search_path, "src/crud.py")
test_file = os.path.join(search_path, "test/test_crud.py")
# Create the folders and files if necessary.
if not os.path.exists(search_path):
    os.mkdir(search_path)
    os.mkdir(os.path.join(search_path, "src"))
    os.mkdir(os.path.join(search_path, "test"))
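Note that the check above only creates `src` and `test` when `app` itself is missing, so a half-created layout is left as it is. One possible alternative is `os.makedirs` with `exist_ok=True`, which can be called repeatedly. The sketch below runs in a temporary directory so it does not touch your project; `ensure_layout` is a hypothetical helper name, not part of the original code.

```python
import os
import tempfile

def ensure_layout(root: str) -> tuple:
    """Create root/src and root/test if missing; return the two file paths."""
    for sub in ("src", "test"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    return (os.path.join(root, "src", "crud.py"),
            os.path.join(root, "test", "test_crud.py"))

with tempfile.TemporaryDirectory() as tmp:
    app_root = os.path.join(tmp, "app")
    code_path, test_path = ensure_layout(app_root)
    ensure_layout(app_root)  # calling it a second time is harmless
    layout_ok = (os.path.isdir(os.path.dirname(code_path))
                 and os.path.isdir(os.path.dirname(test_path)))
print(layout_ok)
```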
Now, update the crud.py file with code for an in-memory CRUD app. This is the code we will write unit tests for. You can use your own Python program instead. We will write the program below to our crud.py file.
# crud.py
code = """class Item:
    def __init__(self, id, name, description=None):
        self.id = id
        self.name = name
        self.description = description
    def __repr__(self):
        return f"Item(id={self.id}, name={self.name}, description={self.description})"
class CRUDApp:
    def __init__(self):
        self.items = []
    def create_item(self, id, name, description=None):
        item = Item(id, name, description)
        self.items.append(item)
        return item
    def read_item(self, id):
        for item in self.items:
            if item.id == id:
                return item
        return None
    def update_item(self, id, name=None, description=None):
        for item in self.items:
            if item.id == id:
                if name:
                    item.name = name
                if description:
                    item.description = description
                return item
        return None
    def delete_item(self, id):
        for index, item in enumerate(self.items):
            if item.id == id:
                return self.items.pop(index)
        return None
    def list_items(self):
        return self.items"""
with open(code_file, 'w') as f:
    f.write(code)
Set up LLM
Now, we will specify the LLM for this project. Which model to use depends on the task and the resources available. You can use proprietary models like GPT-4, Gemini Ultra, or GPT-3.5, or open-access models like Mixtral and Llama-2. Since this task involves writing code, a fine-tuned coding model like DeepSeekCoder-33B or Llama-2 coder is a good fit. There are multiple platforms for LLM inference, like Anyscale, Abacus, and Together. We will use Together AI to run DeepSeekCoder. So, get an API key from Together before going ahead.
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(base_url="https://api.together.xyz/v1",
                 api_key="your-key",
                 model="deepseek-ai/deepseek-coder-33b-instruct")
As the Together API is compatible with the OpenAI SDK, we can use LangChain’s OpenAI integration to communicate with models hosted on Together by changing the base_url parameter to “https://api.together.xyz/v1”. In api_key, pass your Together API key, and in model, pass the name of a model available on Together.
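Rather than pasting the key into the source file, you may prefer to read it from the environment. The sketch below only builds the keyword arguments and does not call any API. The variable name `TOGETHER_API_KEY` and the helper `together_client_kwargs` are assumptions made for this example, not something Together or LangChain requires.

```python
import os

def together_client_kwargs(env=os.environ) -> dict:
    """Build keyword arguments for ChatOpenAI without hard-coding the key."""
    api_key = env.get("TOGETHER_API_KEY")  # assumed variable name
    if not api_key:
        raise RuntimeError("Set TOGETHER_API_KEY before running the agent.")
    return {
        "base_url": "https://api.together.xyz/v1",
        "api_key": api_key,
        "model": "deepseek-ai/deepseek-coder-33b-instruct",
    }

# Demonstration with a fake environment mapping instead of the real one.
kwargs = together_client_kwargs({"TOGETHER_API_KEY": "dummy-key"})
print(kwargs["base_url"])
```

With the real environment variable set, the client could then be created as `ChatOpenAI(**together_client_kwargs())`.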
Define Agent State
This is one of the crucial parts of LangGraph. Here, we will define an AgentState, which keeps track of the state of the agents throughout the execution. It is a TypedDict class whose fields hold that state. Let’s define our AgentState:
class AgentState(TypedDict):
    class_source: str
    class_methods: List[str]
    tests_source: str
In the above AgentState class, class_source stores the original Python class, class_methods stores the names of its methods, and tests_source stores the unit test code. We keep these in the AgentState so that every execution step can use them.
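As a quick illustration of what this state looks like at runtime: a TypedDict instance is an ordinary dict, so nodes read and update it with normal subscript syntax. The values below are placeholders for the example.

```python
from typing import List, TypedDict

class AgentState(TypedDict):
    class_source: str
    class_methods: List[str]
    tests_source: str

state: AgentState = {
    "class_source": "class Foo: ...",
    "class_methods": ["bar", "baz"],
    "tests_source": "",
}

# Take the next method off the list and append a placeholder test for it.
next_method = state["class_methods"].pop(0)
state["tests_source"] += f"# test for {next_method}\n"
print(type(state).__name__, state["class_methods"])
```

The type annotations help editors and type checkers; they are not enforced when the code runs.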
Now, define the Graph with the AgentState.
# Create the graph.
workflow = StateGraph(AgentState)
As mentioned earlier, this is a stateful graph, and now we have added our state object.
Define Nodes
Now that we have defined the AgentState, we need to add nodes. So, what exactly are nodes? In LangGraph, nodes are functions or any runnable object, like Langchain tools, that perform a single action. In our case, we can define several nodes, like a function for finding class methods, a function for inferring and updating unit tests to state objects, and a function for writing it to a test file.
We also need a way to extract code from an LLM message. Here’s how.
def extract_code_from_message(message):
    lines = message.split("\n")
    code = ""
    in_code = False
    for line in lines:
        if "```" in line:
            in_code = not in_code
        elif in_code:
            code += line + "\n"
    return code
This snippet assumes the code in the reply is enclosed in triple backticks.
Now, let’s define our nodes.
import_prompt_template = """Here is a path of a file with code: {code_file}.
Here is the path of a file with tests: {test_file}.
Write a proper import statement for the class in the file.
"""
# Discover the class and its methods.
def discover_function(state: AgentState):
    assert os.path.exists(code_file)
    with open(code_file, "r") as f:
        source = f.read()
    state["class_source"] = source
    # Get the methods.
    methods = []
    for line in source.split("\n"):
        if "def " in line:
            methods.append(line.split("def ")[1].split("(")[0])
    state["class_methods"] = methods
    # Generate the import statement and start the code.
    import_prompt = import_prompt_template.format(
        code_file=code_file,
        test_file=test_file
    )
    message = llm.invoke([HumanMessage(content=import_prompt)]).content
    code = extract_code_from_message(message)
    state["tests_source"] = code + "\n\n"
    return state
# Add a node for discovery.
workflow.add_node(
    "discover",
    discover_function
)
In the above code snippet, we defined a function for discovering the code. It reads the source file into the AgentState’s class_source element, collects the method names into class_methods, and asks the LLM to write the import statement for the test file. The extracted import code is stored in the AgentState’s tests_source element. At this stage, the LLM only writes import statements for the unit tests.
We also added the first node to the StateGraph object.
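The `"def " in line` check in the discovery function is a simple string heuristic: it also matches a `def ` that appears in a comment or docstring, and it picks up `__init__` and `__repr__`. If that becomes a problem, one alternative (not what the code above does) is the standard-library `ast` module. The helper name `list_methods` and its `skip_dunder` flag are hypothetical.

```python
import ast

def list_methods(source: str, skip_dunder: bool = True) -> list:
    """Return 'ClassName.method' for every method defined in a class body."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            for item in node.body:
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    if skip_dunder and item.name.startswith("__"):
                        continue
                    names.append(f"{node.name}.{item.name}")
    return names

sample = '''
class CRUDApp:
    def __init__(self):
        self.items = []  # def not_a_method(): only a comment

    def create_item(self, id, name):
        pass

    def read_item(self, id):
        pass
'''
methods = list_methods(sample)
print(methods)
```

Because the source is parsed rather than scanned line by line, the `def` inside the comment is ignored, and the class name is available to put in the prompt.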
Now, onto the next node.
Also, we can set up some prompt templates that we will need here. These are sample templates you can change as per your needs.
# System message template.
system_message_template = """You are a smart developer. You can do this! You will write unit
tests that have a high quality. Use pytest.
Reply with the source code for the test only.
Do not include the class in your response. I will add the imports myself.
If there is no test to write, reply with "# No test to write" and
nothing more. Do not include the class in your response.
Example:
```
def test_function():
    ...
```
I will give you 200 EUR if you adhere to the instructions and write a high quality test.
Do not write test classes, only methods.
"""
# Write the tests template.
write_test_template = """Here is a class:
'''
{class_source}
'''
Implement a test for the method \"{class_method}\".
"""
Now, define the node.
# This method will write a test.
def write_tests_function(state: AgentState):
    # Get the next method to write a test for.
    class_method = state["class_methods"].pop(0)
    print(f"Writing test for {class_method}.")
    # Get the source code.
    class_source = state["class_source"]
    # Create the prompt.
    write_test_prompt = write_test_template.format(
        class_source=class_source,
        class_method=class_method
    )
    print(colorama.Fore.CYAN + write_test_prompt + colorama.Style.RESET_ALL)
    # Get the test source code.
    system_message = SystemMessage(system_message_template)
    human_message = HumanMessage(write_test_prompt)
    test_source = llm.invoke([system_message, human_message]).content
    test_source = extract_code_from_message(test_source)
    print(colorama.Fore.GREEN + test_source + colorama.Style.RESET_ALL)
    state["tests_source"] += test_source + "\n\n"
    return state
# Add the node.
workflow.add_node(
    "write_tests",
    write_tests_function
)
Here, each call makes the LLM write a test for one method and appends the result to the AgentState’s tests_source element. We then add the node to the workflow StateGraph object.
Edges
Now that we have two nodes, we will define edges between them to specify the direction of execution. LangGraph provides two main types of edges.
- Conditional Edge: The flow of execution depends on the agent’s response. This is crucial for adding cyclicity to the workflows. Based on some condition, the agent can decide whether to return to a previous node, repeat the current one, or move to the next node.
- Normal Edge: This is the normal case, where a node is always called after the previous one has been invoked.
We do not need a condition to connect discover and write_tests, so we will use a normal edge. Also, define an entry point that specifies where the execution should start.
# Define the entry point. This is where the flow will start.
workflow.set_entry_point("discover")
# Always go from discover to write_tests.
workflow.add_edge("discover", "write_tests")
The execution starts with discovering the methods and goes to the function of writing tests. We need another node to write the unit test codes to the test file.
# Write the file.
def write_file(state: AgentState):
    with open(test_file, "w") as f:
        f.write(state["tests_source"])
    return state
# Add a node to write the file.
workflow.add_node(
    "write_file",
    write_file)
As this is our last node, we will define an edge between write_tests and write_file. This is how we can do this.
# Find out if we are done.
def should_continue(state: AgentState):
    if len(state["class_methods"]) == 0:
        return "end"
    else:
        return "continue"
# Add the conditional edge.
workflow.add_conditional_edges(
    "write_tests",
    should_continue,
    {
        "continue": "write_tests",
        "end": "write_file"
    }
)
The add_conditional_edges method takes the name of the source node (write_tests), a should_continue function that decides which step to take based on the remaining class_methods entries, and a mapping whose keys are the strings returned by should_continue and whose values are the names of the nodes to run next.
The edge starts at write_tests and, based on the output of should_continue, follows one of the options in the mapping. For example, if state["class_methods"] is not empty, we have not written tests for all the methods, so we repeat the write_tests node; when we are done writing the tests, write_file is executed.
When the tests for all the methods have been inferred from LLM, the tests are written to the test file.
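To see the loop without calling an LLM, the routing can be traced in plain Python. This is only a simulation of the control flow: `write_tests_stub` is a hypothetical stand-in for write_tests_function, and the `while` loop plays the role that the compiled graph plays in the real workflow.

```python
def write_tests_stub(state: dict) -> dict:
    """Stand-in for write_tests_function: no LLM call, just a placeholder test."""
    method = state["class_methods"].pop(0)
    state["tests_source"] += f"def test_{method}():\n    ...\n\n"
    return state

def should_continue(state: dict) -> str:
    return "end" if len(state["class_methods"]) == 0 else "continue"

# Same shape as the mapping passed to add_conditional_edges.
routes = {"continue": "write_tests", "end": "write_file"}

state = {
    "class_methods": ["create_item", "read_item", "delete_item"],
    "tests_source": "",
}
visited = []
node = "write_tests"
while node == "write_tests":
    state = write_tests_stub(state)
    visited.append(node)
    node = routes[should_continue(state)]
visited.append(node)
print(visited)
```

The node runs once per method and then hands over to write_file. Note that the check happens after the node runs, so if class_methods were empty on entry, the `pop(0)` call would raise an IndexError.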
Now, add the final edge to the workflow object for the closure.
# Always go from write_file to end.
workflow.add_edge("write_file", END)
Execute the Workflow
The last thing that remains is to compile the workflow and run it.
# Create the app and run it
app = workflow.compile()
inputs = {}
config = RunnableConfig(recursion_limit=100)
try:
    result = app.invoke(inputs, config)
    print(result)
except GraphRecursionError:
    print("Graph recursion limit reached.")
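This will invoke the app. The recursion limit caps the number of steps the graph may execute in a single run, which guards against a loop that never terminates. If the limit is reached before the workflow finishes, a GraphRecursionError is raised, which we catch here.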
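As a rough analogy for what a step limit does, consider the toy loop below. It is not LangGraph’s implementation: `StepLimitError` and `run_with_limit` are invented names that stand in for GraphRecursionError and the graph runner.

```python
class StepLimitError(RuntimeError):
    """Toy stand-in for GraphRecursionError."""

def run_with_limit(step, state, is_done, limit: int):
    """Apply `step` until `is_done(state)`, raising if `limit` steps are used up."""
    for _ in range(limit):
        if is_done(state):
            return state
        state = step(state)
    if is_done(state):
        return state
    raise StepLimitError(f"not finished after {limit} steps")

# Finishes: three steps are needed and the limit is ten.
finished = run_with_limit(lambda n: n + 1, 0, lambda n: n >= 3, limit=10)
print(finished)

# Does not finish: the stop condition can never be met.
try:
    run_with_limit(lambda n: n + 1, 0, lambda n: n < 0, limit=5)
    hit_limit = False
except StepLimitError:
    hit_limit = True
print(hit_limit)
```

For this workflow, a class with many methods needs a correspondingly higher limit, since every pass through write_tests uses up a step.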
You can see the logs on the terminal or in the notebook.
A lot of the heavy lifting is done by the underlying model. This was a demo application with the DeepSeek Coder model; for better performance, you can use GPT-4, Claude Opus, Claude Haiku, etc.
You can also use Langchain tools for web surfing, stock price analysis, etc.
LangChain vs LangGraph
Now, the question is when to use LangChain vs LangGraph.
If the goal is to create a multi-agent system with coordination among them, LangGraph is the way to go. However, if you want to create DAGs or chains to complete tasks, the LangChain Expression Language is best suited.
Why use LangGraph?
LangGraph is a potent framework that can improve many existing solutions.
- Improve RAG pipelines: LangGraph can augment the RAG with its cyclic graph structure. We can introduce a feedback loop to evaluate the quality of the retrieved object and, if needed, can improve the query and repeat the process.
- Multi-Agent Workflows: LangGraph is designed to support multi-agent workflows. This is crucial for solving complex tasks divided into smaller sub-tasks. Different agents with a shared state and different LLMs and tools can collaborate to solve a single task.
- Human-in-the-loop: LangGraph has built-in support for Human-in-the-loop workflow. This means a human can review the states before moving to the next node.
- Planning Agent: LangGraph is well suited to build planning agents, where an LLM planner plans and decomposes a user request, an executor invokes tools and functions, and the LLM synthesizes answers based on previous outputs.
- Multi-modal Agents: LangGraph can build multi-modal agents, like vision-enabled web navigators.
Real-life Use Cases
There are numerous fields where complex AI coding agents can be helpful.
- Personal Agents: Imagine having your own Jarvis-like assistant on your electronic devices, ready to help with tasks at your command, whether it is through text, voice, or even a gesture. That’s one of the most exciting uses of AI agents!
- AI Instructors: Chatbots are great, but they have their limits. AI agents equipped with the right tools can go beyond basic conversations. Virtual AI instructors who can adapt their teaching methods based on user feedback can be game-changing.
- Software UX: The user experience of software can be improved with AI agents. Instead of manually navigating applications, agents can accomplish tasks with voice or gesture commands.
- Spatial Computing: As AR/VR technology grows in popularity, the demand for AI agents will grow. The agents can process surrounding information and execute tasks on demand. This may be one of the best use cases of AI agents in the near future.
- LLM OS: AI-first operating systems where agents are first-class citizens. Agents will be responsible for everything from mundane to complex tasks.
Conclusion
LangGraph is an efficient framework for building cyclic stateful multi-actor agent systems. It fills in the gap in the original LangChain framework. As it is an extension of LangChain, we can benefit from all the good things of the LangChain ecosystem. As the quality and capability of LLMs grow, it will be much easier to create agent systems for automating complex workflows. So, here are the key takeaways from the article.
Key Takeaways
- LangGraph is an extension of LangChain, which allows us to build cyclic, stateful, multi-actor agent systems.
- It implements a graph structure with nodes and edges. The nodes are functions or tools, and the edges are the connections between nodes.
- Edges are of two types: conditional and normal. Conditional edges have conditions while going from one to another, which is important for adding cyclicity to the workflow.
- LangGraph is preferred for building cyclic multi-actor agents, while LangChain is better at creating chains or directed acyclic systems.
Frequently Asked Questions
Ans. LangGraph is an open-source library for building stateful cyclic multi-actor agent systems. It is built on top of the LangChain ecosystem.
Ans. LangGraph is preferred for building cyclic multi-actor agents, while LangChain is better at creating chains or directed acyclic systems.
Ans. AI agents are software programs that interact with their environment, make decisions, and act to achieve an end goal.
Ans. This depends on your use cases and budget. GPT-4 is the most capable but expensive. For coding, DeepSeekCoder-33b is a great cheaper option.
Ans. Chains are a sequence of hard-coded actions to follow, while agents use LLMs and other tools (including chains) to reason and act according to the information.