Introduction
Imagine having a personal assistant that not only understands your requests but also knows exactly how to execute them – whether it's performing a quick calculation or looking up the latest stock market news. In this article, we delve into the fascinating world of AI agents and explore how you can create your own using the LlamaIndex framework. We'll walk you step by step through building these intelligent agents, highlighting the power of LLMs' function calling capabilities and demonstrating how they can make decisions and carry out tasks with impressive efficiency. Whether you're new to AI or a seasoned developer, this guide will show you how to unlock the full potential of AI agents in just a few lines of code.
Learning outcomes
- Understand the basics of AI agents and their problem-solving capabilities.
- Learn how to implement AI agents using the LlamaIndex framework.
- Explore function calling features in LLMs for efficient task execution.
- Learn how to integrate web search tools into your AI agents.
- Get hands-on experience building and customizing AI agents with Python.
This article was published as part of the Data Science Blogathon.
What are AI agents?
AI agents are like digital assistants on steroids. They don't just respond to your commands, they understand them, analyze them, and make decisions about how best to execute them. Whether it's answering questions, performing calculations, or looking up the latest news, AI agents are designed to handle complex tasks with minimal human intervention. These agents can process natural language queries, identify key details, and use their skills to provide the most accurate and useful answers.
Why use AI agents?
The rise of AI agents is transforming the way we interact with technology. They can automate repetitive tasks, improve decision-making, and deliver personalized experiences, making them invaluable across a variety of industries. Whether you work in finance, healthcare, or e-commerce, AI agents can streamline operations, improve customer service, and provide detailed insights by handling tasks that would otherwise require significant manual effort.
What is LlamaIndex?
LlamaIndex is a cutting-edge framework designed to simplify the process of building AI agents using large language models (LLMs). It leverages the power of LLMs like OpenAI models, allowing developers to build intelligent agents with minimal coding. With LlamaIndex, you can plug in custom Python functions and the framework will automatically integrate them with the LLM, allowing your AI agent to perform a wide range of tasks.
Main features of LlamaIndex
- Function calling: LlamaIndex allows AI agents to invoke specific functions based on user queries. This feature is essential for creating agents that can handle multiple tasks.
- Tool integration: The framework supports the integration of various tools, including web search, data analysis, and more, allowing your agent to perform complex operations.
- Ease of use: LlamaIndex is designed to be easy to use, making it accessible to both beginners and experienced developers.
- Customization: With support for custom functions and advanced features like Pydantic models, LlamaIndex provides the flexibility needed for specialized applications.
Steps to implement AI agents using LlamaIndex
Let us now look at how we can implement AI agents using LlamaIndex.
Here we will use OpenAI's GPT-4o as our LLM, and web queries will be performed using Bing Search. LlamaIndex already has an integration with the Bing Search tool, which can be installed with this command:
!pip install llama-index-tools-bing-search
Step 1: Get the API key
First, you need to create a Bing Search API key, which can be obtained by creating a Bing resource from the following link. For experimentation, Bing also offers a free tier with 3 calls per second and 1,000 calls per month.
Step 2: Install the necessary libraries
Install the required Python libraries using the following commands:
%%capture
!pip install llama_index llama-index-core llama-index-llms-openai
!pip install llama-index-tools-bing-search
Step 3: Set environment variables
Next, set your API keys as environment variables so that LlamaIndex can access them at runtime.
import os

os.environ["OPENAI_API_KEY"] = "sk-proj-"
os.environ["BING_API_KEY"] = ""
Step 4: Initialize the LLM
Initialize the LLM (in this case, OpenAI's GPT-4o) and run a simple test to confirm that it works.
from llama_index.llms.openai import OpenAI
llm = OpenAI(model="gpt-4o")
llm.complete("1+1=")
Step 5: Create two different functions
Create two functions that your AI agent will use. The first function performs a simple sum, while the second retrieves the latest stock market news using Bing Search.
from llama_index.tools.bing_search import BingSearchToolSpec

def addition_tool(a: int, b: int) -> int:
    """Returns sum of inputs"""
    return a + b

def web_search_tool(query: str) -> str:
    """A web query tool to retrieve latest stock news"""
    bing_tool = BingSearchToolSpec(api_key=os.getenv('BING_API_KEY'))
    response = bing_tool.bing_news_search(query=query)
    return response
For better function definitions, we can also make use of Pydantic models. For simplicity, however, we will rely here on the LLM's ability to extract arguments from the user's query.
Step 6: Create a Function Tool Object from User-Defined Functions
from llama_index.core.tools import FunctionTool
add_tool = FunctionTool.from_defaults(fn=addition_tool)
search_tool = FunctionTool.from_defaults(fn=web_search_tool)
A FunctionTool allows users to easily convert any user-defined function into a tool object.
Here, the function name becomes the name of the tool and the docstring is treated as its description, but both can also be overridden as shown below.
tool = FunctionTool.from_defaults(fn=addition_tool, name="...", description="...")
Step 7: Call the predict_and_call method with the user's query
query = "what is the current market price of apple"

response = llm.predict_and_call(
    tools=[add_tool, search_tool],
    user_msg=query,
    verbose=True,
)
Here we call the predict_and_call method of the LLM with the user query and the tools we defined above. The tools argument can take more than one function by putting all the functions inside a list. The method examines the user query and decides which tool from the list is most suitable for the given task.
Output example
=== Calling Function ===
Calling function: web_search_tool with args: {"query": "current market price of Apple stock"}
=== Function Output ===
(('Warren Buffett Just Sold a Huge Chunk of Apple Stock. Should You Do the Same?', ..........
Step 8: Put it all together
import os

from llama_index.llms.openai import OpenAI
from llama_index.tools.bing_search import BingSearchToolSpec
from llama_index.core.tools import FunctionTool

llm = OpenAI(model="gpt-4o")

def addition_tool(a: int, b: int) -> int:
    """Returns sum of inputs"""
    return a + b

def web_search_tool(query: str) -> str:
    """A web query tool to retrieve latest stock news"""
    bing_tool = BingSearchToolSpec(api_key=os.getenv('BING_API_KEY'))
    response = bing_tool.bing_news_search(query=query)
    return response

add_tool = FunctionTool.from_defaults(fn=addition_tool)
search_tool = FunctionTool.from_defaults(fn=web_search_tool)

query = "what is the current market price of apple"

response = llm.predict_and_call(
    tools=[add_tool, search_tool],
    user_msg=query,
    verbose=True,
)
Advanced customization
For those looking to push the boundaries of what AI agents can do, advanced customization offers the tools and techniques to refine and expand their capabilities, allowing your agent to handle more complex tasks and deliver even more accurate results.
Improving function definitions
To improve how the AI agent interprets and uses functions, you can incorporate Pydantic models. This adds type checking and validation, ensuring that the agent processes inputs correctly.
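As a minimal sketch using the Pydantic library directly (the model name and field descriptions below are illustrative, not part of the LlamaIndex API), a schema for the addition tool could look like this:

```python
from pydantic import BaseModel, Field

class AdditionArgs(BaseModel):
    """Illustrative schema describing the inputs of addition_tool."""
    a: int = Field(description="First number to add")
    b: int = Field(description="Second number to add")

# Arguments extracted by the LLM can be validated before the tool runs;
# wrong types raise a ValidationError instead of failing silently.
args = AdditionArgs(a=3, b=4)
print(args.a + args.b)  # prints 7
```

A schema like this can then be attached to a tool via the `fn_schema` parameter of `FunctionTool.from_defaults`, so the LLM sees typed, described parameters rather than inferring them from the signature alone.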
Handling complex queries
For more complex user queries, consider building additional tools or refining existing ones to handle multiple tasks or multi-step requests. This could involve adding error handling, logging, or custom logic to manage how the agent responds to different scenarios.
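As one possible sketch of such error handling (the `run_search` helper below is a hypothetical stand-in for the real `bing_tool.bing_news_search` call, and here it simply simulates a failure), a tool can log the error and return a fallback message instead of crashing:

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("agent_tools")

def run_search(query: str) -> str:
    # Hypothetical stand-in for bing_tool.bing_news_search(query=query);
    # this version simulates a network failure for demonstration.
    raise ConnectionError("network down")

def safe_web_search_tool(query: str) -> str:
    """A web query tool that logs failures and returns a fallback message."""
    try:
        return run_search(query)
    except Exception as exc:
        logger.error("Search failed for %r: %s", query, exc)
        return f"Search unavailable for '{query}'; please try again later."

print(safe_web_search_tool("apple stock"))
# prints: Search unavailable for 'apple stock'; please try again later.
```

Returning an error string (rather than letting the exception propagate) lets the LLM relay the failure to the user in its final answer instead of the whole agent call aborting.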
Conclusion
AI agents can process user inputs, reason about the best approach, access relevant knowledge, and execute actions to provide accurate and useful answers. They can extract parameters specified in the user query and pass them to the corresponding function to carry out the task. With LLM frameworks like LlamaIndex, LangChain, etc., one can easily implement agents with a few lines of code and also customize things like function definitions using Pydantic models.
Key points
- Agents can take multiple independent functions and determine which function to execute based on the user's query.
- With function calling, the LLM decides the best function to complete the task based on the function name and description.
- The function name and description can be overridden by explicitly specifying the name and description parameters when creating the tool object.
- LlamaIndex incorporates tools and techniques to implement AI agents in just a few lines of code.
- It is also worth noting that function calling agents can only be implemented using LLMs that support function calling.
Frequently Asked Questions
Q. What is an AI agent?
A. An AI agent is a digital assistant that processes user queries, determines the best approach, and executes tasks to provide accurate answers.
Q. What is LlamaIndex?
A. LlamaIndex is a popular framework that allows easy deployment of AI agents using LLMs, such as OpenAI models.
Q. What is function calling?
A. Function calling allows the AI agent to select the most suitable function based on the user query, making the process more efficient.
Q. How can I integrate web search into an AI agent?
A. You can integrate web search using tools like BingSearchToolSpec, which retrieves data in real time based on queries.
The media displayed in this article is not the property of Analytics Vidhya and is used at the discretion of the author.