Image generated with Ideogram.ai
You may have heard the term vector database before. Some people understand it and some don't. Don't worry if you are not familiar with it, as vector databases have only become a prominent topic in recent years.
Vector databases have gained popularity thanks to the introduction of generative AI to the public, especially LLMs.
Many LLM products, such as GPT-4 and Gemini, help our work by generating text from our input. Vector databases actually play a role in these LLM products.
But how does a vector database work? And what is its relevance to LLMs?
Those are the questions we will answer in this article. Let's explore them together.
A vector database is a specialized database designed to store, index, and query vector data. It is often optimized for high-dimensional vectors, as these are typically the output of machine learning models, especially LLMs.
In the context of a vector database, a vector is a mathematical representation of the data. Each vector consists of a series of numerical values that represent the data's position in a high-dimensional space. Vectors are often used with LLMs to represent text data, since vectors are easier to process than raw text.
In the LLM space, a model can take text input and transform it into a high-dimensional vector representing the semantic and syntactic features of the text. This process is what we call embedding. In simpler terms, embedding is the process of transforming text into vectors of numerical data.
Embedding generally relies on a neural network, called an embedding model, to represent the text in the embedding space.
Let's use an example text: “I love data science.” Representing it with the OpenAI text-embedding-3-small model results in a vector with 1536 dimensions.
(0.024739108979701996, -0.04105354845523834, 0.006121257785707712, -0.02210472710430622, 0.029098540544509888,...)
Each number inside the vector is a coordinate within the model's embedding space. Together, they form a single representation of the meaning of the sentence as produced by the model.
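For illustration, here is a minimal sketch of how such an embedding could be generated with the OpenAI Python SDK. This assumes the openai package is installed and an API key is available; it is separate from the Weaviate setup below.

from openai import OpenAI

# Assumes the OPENAI_API_KEY environment variable is set
openai_client = OpenAI()

response = openai_client.embeddings.create(
    model="text-embedding-3-small",
    input="I love data science."
)

vector = response.data[0].embedding
print(len(vector))  # 1536 dimensions
print(vector[:5])   # the first few coordinates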
The vector database is then responsible for storing the output of the embedding model. The user can then query, index, and retrieve the vectors as needed.
That's enough introduction; let's get into more technical practice. We will try to set up and store vectors with an open-source vector database called Weaviate.
Weaviate is an open-source, scalable vector database that serves as a framework to store our vectors. We can run Weaviate in an instance such as Docker or use Weaviate Cloud Services (WCS).
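If you prefer to self-host, Weaviate also publishes a Docker image. A minimal single-node run could look like the following (the image name and ports follow Weaviate's documentation; adjust the version tag to your needs):
docker run -p 8080:8080 -p 50051:50051 semitechnologies/weaviate:latest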
To start using Weaviate, we need to install the packages using the following code:
pip install weaviate-client
To make things easier, we will use a WCS sandbox cluster as our vector database. Weaviate provides a free 14-day cluster that we can use to store our vectors without registering any payment method. To do that, you first need to register on the WCS console.
Once inside the WCS platform, select Create a cluster and enter a name for your sandbox. The user interface should look like the image below.
Image by author
Don't forget to enable authentication as we also want to access this cluster via the WCS API key. Once the cluster is ready, find the API key and the cluster URL, which we will use to access the vector database.
Once everything is ready, we will simulate storing our first vector in the vector database.
For the vector database storage example, we will use a book collection example dataset from Kaggle. We will use only the first 100 rows and three selected columns (title, description, intro).
import pandas as pd

# Load the first 100 rows and three selected columns
data = pd.read_csv('commonlit_texts.csv', nrows=100, usecols=['title', 'description', 'intro'])
Let's put our data aside and connect to our vector database. First, we need to configure a remote connection using the API key and URL of your cluster.
import weaviate

cluster_url = "Your Cluster URL"
wcs_api_key = "Your WCS API Key"
openai_api_key = "Your OpenAI API Key"

# Connect to the WCS cluster; the OpenAI key is passed as a header
# so Weaviate can call the OpenAI embedding model on our behalf
client = weaviate.connect_to_wcs(
    cluster_url=cluster_url,
    auth_credentials=weaviate.auth.AuthApiKey(wcs_api_key),
    headers={
        "X-OpenAI-Api-Key": openai_api_key
    }
)
Once you have set your client variable, we will connect to Weaviate Cloud Services and create a class to store the vectors. A class in Weaviate is a collection of data, analogous to a table in a relational database.
import weaviate.classes as wvc

client.connect()
book_collection = client.collections.create(
    name="BookCollection",
    # Use the OpenAI embedding model to vectorize the text data
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(),
    # Use the OpenAI generative module for generative search later on
    generative_config=wvc.config.Configure.Generative.openai()
)
In the code above, we connect to the Weaviate cluster and create a BookCollection class. The class object uses the OpenAI text2vec embedding model to vectorize the text data, along with the OpenAI generative module.
Let's try to store the text data in the vector database. To do that, you can use the following code.
# Convert the DataFrame into a list of dictionaries and insert them in bulk
sent_to_vdb = data.to_dict(orient="records")
book_collection.data.insert_many(sent_to_vdb)
Image by author
We have just successfully stored our dataset in the vector database! How easy is that?
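If you want to sanity-check the insert, the v4 client's aggregate API can count the stored objects; a minimal sketch:

# Count the objects stored in the collection
agg = book_collection.aggregate.over_all(total_count=True)
print(agg.total_count)  # should print 100 for our dataset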
Now, you might be curious about the use cases for vector databases with LLM. That is what we are going to discuss next.
There are several use cases in which LLMs can be combined with a vector database. Let's explore them together.
Semantic Search
Semantic search is a process of searching data using the meaning of the query to retrieve relevant results instead of relying solely on traditional keyword-based search.
The process involves embedding the query with the embedding model and performing a similarity search against the embedded documents stored in the vector database.
Let's try using Weaviate to perform a semantic search based on a specific query.
book_collection = client.collections.get("BookCollection")
client.connect()
response = book_collection.query.near_text(
    query="childhood story",
    limit=2
)
In the code above, we use Weaviate to perform a semantic search and find the top two books most closely related to the query "childhood story". Semantic search uses the OpenAI embedding model we set up earlier.
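The response holds the matched objects; we can print their properties with a simple loop. Optionally, we can also request the vector distance via the v4 client's MetadataQuery (shown here as a sketch):

# Print the properties of each matched object
for o in response.objects:
    print(o.properties)

# Optionally request the distance between the query and each match
response = book_collection.query.near_text(
    query="childhood story",
    limit=2,
    return_metadata=wvc.query.MetadataQuery(distance=True)
)
for o in response.objects:
    print(o.metadata.distance, o.properties["title"])

The result is shown below.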
{'title': 'Act Your Age', 'description': 'A young girl is told over and over again to act her age.', 'intro': 'Colleen Archer has written for \nHighlights\n. In this short story, a young girl is told over and over again to act her age.\nAs you read, take notes on what Frances is doing when she is told to act her age. '}
{'title': 'The Anklet', 'description': 'A young woman must deal with unkind and spiteful treatment from her two older sisters.', 'intro': "Neil Philip is a writer and poet who has retold the best-known stories from \nThe Arabian Nights\n for a modern day audience. \nThe Arabian Nights\n is the English-language nickname frequently given to \nOne Thousand and One Arabian Nights\n, a collection of folk tales written and collected in the Middle East during the Islamic Golden Age of the 8th to 13th centuries. In this tale, a poor young woman must deal with mistreatment by members of her own family.\nAs you read, take notes on the youngest sister's actions and feelings."}
As you can see, the results above contain no direct mention of a childhood story. However, they are still closely related to stories aimed at children.
Generative Search
Generative search can be seen as an extension of semantic search. Generative search, or retrieval-augmented generation (RAG), prompts an LLM with the data that semantic search retrieves from the vector database.
With RAG, the search results are processed by the LLM, so we get them in the form we want instead of as raw data. Let's try a simple RAG implementation with the vector database.
response = book_collection.generate.near_text(
    query="childhood story",
    limit=2,
    grouped_task="Write a short LinkedIn post about these books."
)
print(response.generated)
The result can be seen in the following text.
Excited to share two captivating short stories that explore themes of age and mistreatment. "Act Your Age" by Colleen Archer follows a young girl who is constantly told to act her age, while "The Anklet" by Neil Philip delves into the unkind treatment faced by a young woman from her older sisters. These thought-provoking tales will leave you reflecting on societal expectations and family dynamics. #ShortStories #Literature #BookRecommendations 📚
As you can see, the data content is the same as before, but now it has been processed by the OpenAI LLM into a short LinkedIn post. In this way, RAG is useful when we want a specific output format from the data.
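Besides grouped_task, which produces one output for all retrieved objects together, the v4 client also offers a single_prompt mode that generates one output per retrieved object, with property names templated in braces; a minimal sketch:

response = book_collection.generate.near_text(
    query="childhood story",
    limit=2,
    single_prompt="Write a one-sentence teaser for {title}: {description}"
)

for o in response.objects:
    print(o.generated)  # one generated teaser per retrieved book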
Question Answering with RAG
In our example above, we used a query to get the data we wanted and RAG processed that data into the desired result.
However, we can turn the RAG capability into a tool for answering questions. We can achieve this by combining them with the LangChain framework.
First, let's install the necessary packages.
pip install langchain
pip install langchain_community
pip install langchain_openai
Next, let's try to import the packages and initialize the variables we need to make QA with RAG work.
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import Weaviate
import weaviate
from langchain_openai import OpenAIEmbeddings
from langchain_openai.llms.base import OpenAI
llm = OpenAI(openai_api_key=openai_api_key, model_name="gpt-3.5-turbo-instruct", temperature=1)
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)

# LangChain's Weaviate vector store expects the v3-style Weaviate client
client = weaviate.Client(
    url=cluster_url, auth_client_secret=weaviate.AuthApiKey(wcs_api_key)
)
In the code above, we configure the LLM for text generation, the embedding model, and the Weaviate client connection.
Next, we set up the Weaviate vector store and turn it into a retriever.
weaviate_vectorstore = Weaviate(client=client, index_name="BookCollection", text_key='intro', by_text=False, embedding=embeddings)
retriever = weaviate_vectorstore.as_retriever()
In the code above, we make the Weaviate BookCollection the retrieval tool for RAG, searching against the 'intro' field when prompted.
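If we want to control how many documents the retriever passes to the LLM, LangChain's as_retriever accepts search keyword arguments:

# Limit the retriever to the top 2 most similar documents
retriever = weaviate_vectorstore.as_retriever(search_kwargs={"k": 2})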
Then, we create a question-answering chain from LangChain with the following code.
qa_chain = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever
)
Everything is ready now. Let's try QA with RAG using the following code example.
response = qa_chain.invoke(
    "Who is the writer who writes about love between two goldfish?")
print(response)
The result is shown in the following text.
{'query': 'Who is the writer who writes about love between two goldfish?', 'result': ' The writer is Grace Chua.'}
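If you also want to inspect which passages the retriever fed to the LLM, RetrievalQA supports a return_source_documents flag; a minimal sketch:

qa_chain = RetrievalQA.from_chain_type(
    llm=llm, chain_type="stuff", retriever=retriever,
    return_source_documents=True
)

response = qa_chain.invoke(
    "Who is the writer who writes about love between two goldfish?")
print(response["source_documents"])  # the retrieved passages behind the answer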
With the vector database as a place to store all our text data, we can implement RAG to perform question answering with LangChain. How cool is that?
A vector database is a specialized storage solution designed to store, index, and query vector data. It is often used to store text data and is implemented in conjunction with large language models (LLMs). This article walked through a hands-on setup of the Weaviate vector database, including example use cases such as semantic search, retrieval-augmented generation (RAG), and question answering with RAG.
Cornellius Yudha Wijaya is an assistant data science manager and data writer. While working full-time at Allianz Indonesia, he loves sharing data and Python tips through social media and print media. Cornellius writes on a variety of artificial intelligence and machine learning topics.