I've been exploring Hugging Face's SmolAgents to create AI agents in a few lines of code, and it has worked perfectly for me. From building a research agent to Agentic RAG, the experience has been seamless. SmolAgents provides a lightweight, efficient way to create AI agents for tasks such as research assistance, question answering, and more, and the simplicity of the framework lets developers focus on the logic and functionality of their agents without getting bogged down in complex configuration. However, debugging multi-agent runs is challenging: workflows are unpredictable, logs are extensive, and most errors are "dumb LLM" mistakes that the model self-corrects in subsequent steps. Finding effective ways to validate and inspect these executions remains a key challenge. This is where OpenTelemetry comes in handy. Let's see how it works!
Why is it difficult to debug agent runs?
Here's why it's hard to debug agent runs:
- Unpredictability: AI agents are designed to be flexible and creative, which means they don't always follow a fixed path. This makes it hard to predict exactly what they will do, and therefore hard to debug when something goes wrong.
- Complexity: AI agents typically perform many steps in a single run, and each step can generate many logs (messages or data about what is happening). The volume can quickly overwhelm you when you try to figure out what went wrong.
- Errors are usually minor: Many errors in an agent run are small mistakes (such as the LLM writing bad code or making a wrong decision) that the agent corrects itself in the next step. These errors are rarely critical, but they still make it difficult to keep track of what is happening.
What is the importance of logging in agent execution?
Logging means recording what happens during the execution of an agent. This is important because:
- Debugging: If something goes wrong, you can check the logs to find out what happened.
- Monitoring: In production (when your agent is being used by real users), you need to keep an eye on its performance. Logs help you do that.
- Improvement: By reviewing logs, you can identify patterns or recurring issues and improve your agent over time.
What is OpenTelemetry?
OpenTelemetry is a standard for instrumentation, meaning it provides tools to automatically record (or "log") what happens in your software. In this case, it is used to log agent executions.
How does it work?
- You add some instrumentation code to your agent. This code does not change how the agent works; it simply records what is happening (a minimal sketch follows this list).
- When your agent runs, OpenTelemetry automatically logs all steps, errors, and other important details.
- These logs are sent to a platform (such as a dashboard or monitoring tool) where you can review them later.
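To make this concrete, here is a minimal, standalone sketch of manual OpenTelemetry tracing in Python. It is a hypothetical example (the tracer name, span name, and attribute are invented for illustration), separate from the smolagents setup shown later:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Set up a provider that prints every finished span to the console
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("demo")

# Each unit of work becomes a span that records timing, attributes, and errors
with tracer.start_as_current_span("agent-step") as span:
    span.set_attribute("step.number", 1)
    # ... the agent's actual work would happen here ...

Instrumentation libraries, like the one used below for smolagents, do exactly this for you: they wrap each agent step in a span automatically, so you never have to write this code by hand.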
Why is this useful?
- Ease of use: No need to manually add logging code everywhere. OpenTelemetry does it for you.
- Standardization: OpenTelemetry is a widely used standard, so it works with many tools and platforms.
- Clarity: Logs are structured and organized, making it easy to understand what happened during an agent's run.
Logging agent executions is essential because AI agents are complex and unpredictable. Using OpenTelemetry makes it easy to automatically record and monitor what is happening, so you can debug issues, improve performance, and ensure everything runs smoothly in production.
How to use OpenTelemetry?
This script sets up a Python environment with the required libraries and configures OpenTelemetry for tracing. Here is a step-by-step explanation: I installed the dependencies, imported the necessary modules, and configured OpenTelemetry. (The ! prefix on the install commands below means they run in a notebook cell; drop it if you are working in a terminal.)
Install dependencies
!pip install smolagents
!pip install arize-phoenix opentelemetry-sdk opentelemetry-exporter-otlp openinference-instrumentation-smolagents
- smolagents: Hugging Face's library for creating lightweight AI agents.
- arize-phoenix: A tool for monitoring and debugging machine learning models.
- opentelemetry-sdk: The OpenTelemetry SDK for instrumenting, generating, and exporting telemetry data (traces, metrics, logs).
- opentelemetry-exporter-otlp: An exporter that sends telemetry data in OTLP (OpenTelemetry Protocol) format.
- openinference-instrumentation-smolagents: A library that instruments smolagents to automatically generate OpenTelemetry traces.
Import required modules
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from openinference.instrumentation.smolagents import SmolagentsInstrumentor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor
- trace: The OpenTelemetry tracing API.
- TracerProvider: The core component for creating and managing traces.
- BatchSpanProcessor: Processes spans in batches for efficient export.
- SmolagentsInstrumentor: Automatically instruments smolagents to generate traces.
- OTLPSpanExporter: Exports traces using the OTLP protocol over HTTP.
- ConsoleSpanExporter: Exports traces to the console (for debugging).
- SimpleSpanProcessor: Processes spans one at a time (useful for debugging or low-volume monitoring).
Set up OpenTelemetry tracing
endpoint = "http://0.0.0.0:6006/v1/traces"
trace_provider = TracerProvider()
trace_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))
- endpoint: The URL where traces will be sent (in this case, http://0.0.0.0:6006/v1/traces); see the note after this list on starting a collector there.
- trace_provider: Creates a new TracerProvider instance.
- add_span_processor: Adds a span processor to the provider. Here, SimpleSpanProcessor sends traces to the specified endpoint via OTLPSpanExporter.
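Note that something must actually be listening at that endpoint to collect the traces. Phoenix (installed earlier) serves as both the collector and the dashboard; at the time of writing, its documentation suggests starting it in a separate terminal before running the agent:

python -m phoenix.server.main serve

This starts Phoenix on port 6006, matching the endpoint configured above.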
Instrument smolagents
SmolagentsInstrumentor().instrument(tracer_provider=trace_provider)
This line instruments the smolagents library so that it automatically generates traces using the configured trace_provider. In short, the setup:
- Installs the necessary Python libraries.
- Configures OpenTelemetry to collect smolagents traces.
- Sends traces to the specified endpoint (http://0.0.0.0:6006/v1/traces) using the OTLP protocol.
- Optionally adds a ConsoleSpanExporter to print traces to the terminal for debugging, as sketched below.
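For example, here is a minimal sketch of that optional debugging step, reusing the imports shown earlier (an addition of mine, not part of the original setup):

# Also print each finished span to the terminal for local debugging
trace_provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))

Similarly, for production you could swap SimpleSpanProcessor for the BatchSpanProcessor imported above, which exports spans in batches instead of one at a time:

trace_provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter(endpoint)))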
You will find all the details at http://0.0.0.0:6006/v1/traces, where you can inspect the execution of your agent.
Run the agent
from smolagents import (
    CodeAgent,
    ToolCallingAgent,
    ManagedAgent,
    DuckDuckGoSearchTool,
    VisitWebpageTool,
    HfApiModel,
)

model = HfApiModel()

# Web-search agent: calls tools to search the web and read pages
agent = ToolCallingAgent(
    tools=[DuckDuckGoSearchTool(), VisitWebpageTool()],
    model=model,
)

# Wrap it so a manager agent can delegate tasks to it
managed_agent = ManagedAgent(
    agent=agent,
    name="managed_agent",
    description="This is an agent that can do web search.",
)

# Manager agent that orchestrates the managed web-search agent
manager_agent = CodeAgent(
    tools=[],
    model=model,
    managed_agents=[managed_agent],
)

manager_agent.run(
    "If the US keeps its 2024 growth rate, how many years will it take for the GDP to double?"
)
Once the run completes, open the Phoenix dashboard to see what the logs look like: each step, tool call, and model message appears as a structured span in the trace.
Conclusion
In conclusion, debugging AI agent runs can be complex due to their unpredictable workflows, extensive logs, and minor self-correcting errors. These challenges highlight the critical role of effective monitoring tools like OpenTelemetry, which provide the visibility and structure needed to streamline debugging, improve performance, and keep agents running smoothly. Try it for yourself and discover how OpenTelemetry can simplify the debugging and development of your AI agents, making smooth and reliable operation easy to achieve.
Explore the Agentic AI Pioneer program to deepen your understanding of agentic AI and unlock its full potential. Join us on this journey to discover innovative ideas and applications!