The first step is to collect and load your data — For this example, you will use President Biden’s 2022 State of the Union Address as the additional context. The plain text document is available in the LangChain GitHub repository. To load the data, you can use one of LangChain’s many built-in DocumentLoader
s. A Document
is an object holding text and metadata. To load the text, you will use LangChain’s TextLoader
.
import requests
from langchain.document_loaders import TextLoader

# Download the raw text file and save it locally
url = "https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt"
res = requests.get(url)
with open("state_of_the_union.txt", "w") as f:
    f.write(res.text)

# Load the local file into LangChain Document objects
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()
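If you want to verify what was loaded, an optional check (not part of the original walkthrough) might look like the sketch below; a Document exposes its text via page_content and its source via metadata:
# Optional sanity check: inspect the loaded Document objects
print(len(documents))                    # 1 Document for the single text file
print(documents[0].metadata)             # e.g. {'source': './state_of_the_union.txt'}
print(documents[0].page_content[:200])   # first 200 characters of the speech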
Next, chunk your documents — Because the Document
, in its original state, is too long to fit into the LLM’s context window, it needs to be split into smaller pieces. LangChain comes with many built-in text splitters for this purpose. For this simple example, you can use the CharacterTextSplitter
with a chunk_size
of about 500 and a chunk_overlap
of 50 to preserve the continuity of the text between the chunks.
from langchain.text_splitter import CharacterTextSplitter
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)
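As an optional check, you can inspect how many chunks the splitter produced and preview one of them; the exact count depends on the splitter settings and the source text:
# Optional: how many chunks were produced, and what does one look like?
print(len(chunks))
print(chunks[0].page_content[:200])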
Finally, embed and store the chunks — To enable semantic search across the text chunks, you must generate vector embeddings for each chunk and then store the chunks together with their embeddings. To generate the vector embeddings you can use the OpenAI embedding model, and to store them you can use the Weaviate vector database. By calling .from_documents()
, the vector database is automatically populated with the chunks.
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Weaviate
import weaviate
from weaviate.embedded import EmbeddedOptions

# Start an embedded Weaviate instance (runs locally, no separate server needed)
client = weaviate.Client(
    embedded_options = EmbeddedOptions()
)

# Embed the chunks with OpenAI and store them in the vector database
vectorstore = Weaviate.from_documents(
    client = client,
    documents = chunks,
    embedding = OpenAIEmbeddings(),
    by_text = False
)
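As a quick, optional sanity check on the populated vector database, you can run a similarity search directly against the vector store and confirm that semantically related chunks come back; the query string and the value of k below are illustrative choices, not part of the original example:
# Optional: query the vector store directly to verify semantic search works
results = vectorstore.similarity_search("What did the president say about Justice Breyer?", k=2)
for doc in results:
    print(doc.page_content[:150])
    print("---")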
Step 1: Retrieve
Once the vector database is populated, you can define it as the retriever component, which fetches additional context based on the semantic similarity between the user’s query and the embedded chunks.
retriever = vectorstore.as_retriever()
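If you want to see what the retriever returns, or control how many chunks it fetches, a sketch along these lines can help; the sample query and the k value are illustrative, not from the original example:
# Optional: fetch the most relevant chunks for a sample query
relevant_docs = retriever.get_relevant_documents("What did the president say about Justice Breyer?")
print(len(relevant_docs))

# Optional: a retriever limited to the top 2 chunks (illustrative setting)
top2_retriever = vectorstore.as_retriever(search_kwargs={"k": 2})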
Step 2: Augment
Next, to augment the prompt with the additional context, you must prepare a prompt template. The prompt can be easily customized from a prompt template, as shown below.
from langchain.prompts import ChatPromptTemplate

template = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
"""
prompt = ChatPromptTemplate.from_template(template)
print(prompt)
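To see what the augmented prompt actually looks like before it is sent to the LLM, you can fill the template with placeholder values; the context string below is a made-up snippet, not the actual retrieved text:
# Optional: render the prompt with example values to preview the augmented message
example_messages = prompt.format_messages(
    context="Tonight, I'd like to honor someone who has dedicated his life to serve this country ...",
    question="What did the president say about Justice Breyer?",
)
print(example_messages[0].content)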
Step 3: Generate
Finally, you can build a chain for the RAG pipeline, chaining together the retriever, the prompt template, and the LLM. Once the RAG chain is defined, you can invoke it.
from langchain.chat_models import ChatOpenAI
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser

# Define the LLM used for the generation step
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# Chain the retriever, the prompt template, the LLM, and an output parser
rag_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
query = "What did the president say about Justice Breyer"
rag_chain.invoke(query)
"The president thanked Justice Breyer for his service and acknowledged his dedication to serving the country.
The president also mentioned that he nominated Judge Ketanji Brown Jackson as a successor to continue Justice Breyer's legacy of excellence."
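Because the chain is built from LangChain runnables, it also supports batch invocation if you want to run several queries in one call; this is an optional extension of the example above, and the second query is just an illustration:
# Optional: run multiple queries through the same RAG chain
queries = [
    "What did the president say about Justice Breyer",
    "What did the president say about Ukraine",
]
answers = rag_chain.batch(queries)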
You can see the resulting RAG pipeline for this specific example illustrated below:
This article covered the concept of RAG, which was introduced in the 2020 paper Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (1). After covering some theory behind the concept, including the motivation and the problem it solves, this article demonstrated its implementation in Python. This article implemented a RAG pipeline using an OpenAI LLM in combination with a Weaviate vector database and an OpenAI embedding model. LangChain was used for orchestration.