I have been reading a lot about RAG and AI agents, and with the launch of new models such as DeepSeek V3 and DeepSeek R1, the possibility of building efficient RAG systems has improved significantly, offering better retrieval accuracy, improved reasoning capabilities, and more scalable architectures for real-world applications. The integration of more sophisticated retrieval mechanisms, improved fine-tuning options, and multimodal capabilities is changing the way AI agents interact with data. This raises questions about whether traditional RAG approaches are still the best way forward, or whether newer architectures can provide more efficient and contextually aware solutions.
Retrieval-Augmented Generation (RAG) systems have revolutionized the way AI models interact with data by combining retrieval-based and generative approaches to produce more accurate, context-aware responses. With the advent of DeepSeek R1, an open-source model known for its efficiency and cost-effectiveness, building an effective RAG system has become more accessible and practical. In this article, we will build a RAG system using DeepSeek R1.
What is DeepSeek R1?
<a href="https://github.com/deepseek-ai/DeepSeek-R1" target="_blank" rel="noreferrer noopener nofollow">DeepSeek R1</a> is an open-source AI model that aims to provide high-quality reasoning and retrieval capabilities at a fraction of the cost of proprietary models such as OpenAI's offerings. It has an MIT license, which makes it commercially viable and suitable for a wide range of applications. In addition, this powerful model lets you see its chain of thought, whereas OpenAI's o1 and o1-mini do not show any reasoning tokens.
Learn how DeepSeek R1 is challenging OpenAI's o1 model: DeepSeek R1 vs OpenAI o1: Which is Faster, Cheaper and Smarter?
Benefits of using DeepSeek R1 for a RAG system
Building a Retrieval-Augmented Generation (RAG) system with DeepSeek-R1 offers several notable advantages:
1. Advanced reasoning capabilities: DeepSeek-R1 is designed to emulate human reasoning by analyzing and processing information step by step before reaching conclusions. This approach improves the system's ability to handle complex queries, particularly in areas that require logical inference, mathematical reasoning, and coding tasks.
2. Open-source accessibility: Licensed under MIT, DeepSeek-R1 is open source, giving developers unrestricted access to the model. This openness facilitates customization, fine-tuning, and integration into various applications without the limitations often associated with proprietary models.
3. Competitive performance: Benchmark tests indicate that DeepSeek-R1 performs on par with, or even exceeds, leading models such as OpenAI's o1 in tasks involving reasoning, mathematics, and coding. This level of performance ensures that a RAG system built with DeepSeek-R1 can deliver accurate, high-quality responses across diverse and challenging queries.
4. Transparency in the thought process: DeepSeek-R1 uses a "chain of thought" methodology, making its reasoning steps visible during inference. This transparency not only aids debugging and refining the system, but also builds user trust by providing clear insight into how conclusions are reached.
5. Cost-effectiveness: The open-source nature of DeepSeek-R1 eliminates licensing fees, and its efficient architecture reduces computational resource requirements. These factors contribute to a more cost-effective solution for organizations seeking to implement sophisticated RAG systems without incurring significant expenses.
Integrating DeepSeek-R1 into a RAG system provides a powerful combination of advanced reasoning, transparency, performance, and cost-effectiveness, making it a compelling option for developers and organizations aiming to improve their AI capabilities.
Steps to build a RAG system using DeepSeek R1
The script implements a Retrieval-Augmented Generation (RAG) pipeline that:
- Loads and processes a PDF document, splitting it into pages and extracting text.
- Stores vectorized representations of the text in a vector database (ChromaDB).
- Retrieves relevant content using a similarity search when a query is issued.
- Uses an LLM (the DeepSeek model) to generate responses based on the retrieved text.
Install prerequisites
First, install Ollama by running:
curl -fsSL https://ollama.com/install.sh | sh
After this, pull DeepSeek R1 (1.5B) using:
ollama pull deepseek-r1:1.5b
This will take a moment to download:
ollama pull deepseek-r1:1.5b
pulling manifest
pulling aabd4debf0c8... 100% ▕████████████████▏ 1.1 GB
pulling 369ca498f347... 100% ▕████████████████▏ 387 B
pulling 6e4c38e1172f... 100% ▕████████████████▏ 1.1 KB
pulling f4d24e9138dd... 100% ▕████████████████▏ 148 B
pulling a85fe2a2e58e... 100% ▕████████████████▏ 487 B
verifying sha256 digest
writing manifest
success
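To confirm the model is available locally (an optional check), you can list the installed models with Ollama's standard CLI:
ollama list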
After doing this, open your Jupyter notebook and start with the coding part:
1. Install dependencies
Before executing, the script installs the required Python libraries:
- langchain → A framework for building applications using large language models (LLMs).
- langchain-openai → Provides integration with OpenAI services.
- langchain-community → Adds support for various document loaders and utilities.
- langchain-chroma → Enables integration with ChromaDB, a vector database.
2. Enter the OpenAI API key
To access the OpenAI embedding model, the script prompts the user to securely enter their API key using getpass(). This prevents exposing credentials in plain text.
3. Configure environment variables
The script stores the API key as an environment variable. This allows other parts of the code to access OpenAI services without hard-coding credentials, which improves security.
4. Initialize OpenAI embeddings
The script initializes an OpenAI embedding model called "text-embedding-3-small". This model converts text into vector embeddings, which are high-dimensional numerical representations of the text's meaning. These embeddings are later used to compare and retrieve similar content.
5. Load and split a PDF document
A PDF file (AgenticAI.pdf) is loaded and split into pages. The text is extracted from each page, producing smaller, more manageable chunks of text instead of processing the entire document as a single unit.
6. Create and store a vector database
- The text extracted from the PDF is converted into vector embeddings.
- These embeddings are stored in ChromaDB, a high-performance vector database.
- The database is configured to use cosine similarity (illustrated in the short sketch below), which ensures that text with a high degree of semantic similarity is retrieved efficiently.
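To build intuition for what cosine similarity measures, here is a minimal, self-contained sketch (using numpy; illustrative only, not part of the tutorial's pipeline) that computes it for two toy vectors:

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity = dot product divided by the product of magnitudes;
    # values close to 1.0 indicate vectors pointing in a similar direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v1 = np.array([0.2, 0.7, 0.1])
v2 = np.array([0.25, 0.6, 0.15])
print(cosine_similarity(v1, v2))  # close to 1.0 -> semantically similar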
7. Retrieve similar texts using a similarity threshold
A retriever is created using ChromaDB, which:
- Searches for the top 3 most similar documents based on a given query.
- Filters the results with a similarity threshold of 0.3 (that is, documents must have at least 30% similarity to be considered relevant).
8. Query for similar documents
Two test queries are used:
"What is the old capital of India?"
- No results were found, indicating that the stored documents do not contain relevant information.
"What is Agentic ai?"
- Relevant text is successfully retrieved, demonstrating that the system can fetch meaningful context.
9. Build a RAG chain (Retrieval-Augmented Generation)
The script sets up a pipeline which ensures that:
- Text retrieval happens before generating an answer.
- The model's response is strictly based on the retrieved content, preventing hallucinations.
- A prompt template is used to instruct the model to generate structured responses.
10. Load a connection to an LLM (DeepSeek model)
Instead of OpenAI's GPT, the script loads DeepSeek-R1 (1.5B parameters), a powerful LLM optimized for retrieval-based tasks.
11. Create a RAG-based chain
LangChain's RetrievalQA module is used to:
- Fetch relevant content from the vector database.
- Format a structured response using the prompt template.
- Generate a concise answer with the DeepSeek model.
12. Test the RAG chain
The script executes a test query: "Tell the Leaders' Perspectives on Agentic AI"
The system retrieves relevant information from the database, and the LLM generates a fact-based response strictly grounded in the retrieved context.
Code to build a RAG system using DeepSeek R1
Here is the code:
Install OpenAI and LangChain dependencies
!pip install langchain==0.3.11
!pip install langchain-openai==0.2.12
!pip install langchain-community==0.3.11
!pip install langchain-chroma==0.1.4
Enter the OpenAI API key
from getpass import getpass
OPENAI_KEY = getpass('Enter Open ai API Key: ')
Setting environment variables
import os
os.environ['OPENAI_API_KEY'] = OPENAI_KEY
OpenAI embedding models
from langchain_openai import OpenAIEmbeddings
openai_embed_model = OpenAIEmbeddings(model="text-embedding-3-small")
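As a quick optional sanity check (not required for the rest of the tutorial), you can embed a sample string and inspect the vector's dimensionality; text-embedding-3-small produces 1536-dimensional vectors by default:

sample_vector = openai_embed_model.embed_query("What is Agentic AI?")
print(len(sample_vector))  # 1536 for text-embedding-3-small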
Create a vector DB and persist it on disk
from langchain_community.document_loaders import PyPDFLoader
loader = PyPDFLoader('AgenticAI.pdf')
pages = loader.load_and_split()
texts = [doc.page_content for doc in pages]
from langchain_chroma import Chroma
chroma_db = Chroma.from_texts(
texts=texts,
collection_name="db_docs",
collection_metadata={"hnsw:space": "cosine"}, # Set distance function to cosine
embedding=openai_embed_model
)
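Note that the snippet above keeps the index only for the current session. To actually persist the database to disk, as the section title suggests, langchain-chroma accepts a persist_directory argument; a minimal variation (the directory name here is an arbitrary choice):

chroma_db = Chroma.from_texts(
    texts=texts,
    collection_name="db_docs",
    collection_metadata={"hnsw:space": "cosine"},
    embedding=openai_embed_model,
    persist_directory="./chroma_db"  # any writable path; the index is reloaded from here on the next run
)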
Similarity retrieval with a threshold
similarity_threshold_retriever = chroma_db.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"k": 3, "score_threshold": 0.3}
)
query = "what is the old capital of India?"
top3_docs = similarity_threshold_retriever.invoke(query)
top3_docs
[]
query = "What is Agentic ai?"
top3_docs = similarity_threshold_retriever.invoke(query)
top3_docs

Build a RAG chain
from langchain_core.prompts import ChatPromptTemplate
prompt = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If no context is present or if you don't know the answer, just say that you don't know.
Do not make up the answer unless it is there in the provided context.
Keep the answer concise and to the point with regard to the question.
Question:
{question}
Context:
{context}
Answer:
"""
prompt_template = ChatPromptTemplate.from_template(prompt)
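To see exactly what the model will receive, you can optionally render the template with placeholder values (the strings below are made up for illustration):

print(prompt_template.format(question="What is Agentic AI?",
                             context="Agentic AI refers to autonomous, goal-driven AI systems."))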
Load connection to LLM
from langchain_community.llms import Ollama
deepseek = Ollama(model="deepseek-r1:1.5b")
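Before wiring the model into the chain, a quick optional smoke test confirms that Ollama is serving the model:

print(deepseek.invoke("Briefly introduce yourself."))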
LangChain syntax for the RAG chain
from langchain.chains import RetrievalQA
rag_chain = RetrievalQA.from_chain_type(llm=deepseek,
                                        chain_type="stuff",
                                        retriever=similarity_threshold_retriever,
                                        chain_type_kwargs={"prompt": prompt_template})
query = "Tell the Leaders’ Perspectives on Agentic ai"
rag_chain.invoke(query)
{'query': 'Tell the Leaders’ Perspectives on Agentic AI',
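The printed dictionary is truncated above; with RetrievalQA, the generated answer is stored under the 'result' key, so you can print it directly:

response = rag_chain.invoke(query)
print(response["result"])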

See our detailed articles on how DeepSeek works and how it compares with similar models:
Conclusion
Building a RAG system using DeepSeek R1 provides a cost-effective and powerful way to improve document retrieval and response generation. With its open-source nature and strong reasoning capabilities, it is a great alternative to proprietary solutions. Companies and developers can take advantage of its flexibility to create AI-driven applications tailored to their needs.
Do you want to create applications using DeepSeek? Check out our free DeepSeek course today!