RAG is an abbreviation for retrieval-augmented generation. Let's break the term down to get a clear picture of what RAG is:
R -> Retrieval
A -> Augmented
G -> Generation
Basically, the LLMs we use today are not up to date. If I ask a question to an LLM, say ChatGPT, it might hallucinate and give us a wrong answer. To overcome this, we augment the LLM with additional data (for example, private data that only a limited group of people can access, not the whole world) and then ask it questions against that data. Grounded in that data, it can give us the relevant information. These are some problems that can occur if we do not use RAG:
- Greater chance of hallucination
- The LLM's knowledge is outdated
- Answers are less factual and less precise
RAG is a hybrid system that combines the strengths of a retrieval-based system with an LLM to generate more accurate, relevant, and informed responses. This method leverages external knowledge sources during the generation process, improving the model's ability to provide up-to-date and contextually appropriate information. The flow works as follows:
- In the first step, the user submits a query.
- The query is then sent to the retriever.
- The retriever searches the knowledge base and fetches the most relevant documents.
- The retrieved documents, along with the original query, are sent to the language model (LLM).
- The generator processes both the query and the relevant documents to generate a response, which is then sent back to the user.
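To make this flow concrete, here is a minimal sketch in Python. The `retrieve` and `generate` functions are hypothetical stand-ins (toy word-overlap scoring and a prompt template), not a real library API:

```python
# Minimal sketch of the retrieve-then-generate flow described above.
# `retrieve` and `generate` are toy placeholders, not a real library API.

def retrieve(query: str, knowledge_base: list[str], k: int = 3) -> list[str]:
    """Return the k documents sharing the most words with the query (toy scoring)."""
    scores = [(len(set(query.lower().split()) & set(doc.lower().split())), doc)
              for doc in knowledge_base]
    return [doc for _, doc in sorted(scores, reverse=True)[:k]]

def generate(query: str, context: list[str]) -> str:
    """Stand-in for an LLM call: in a real system this prompt goes to the model."""
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}\nAnswer:"
    return prompt  # an actual LLM would return the generated answer instead

knowledge_base = ["RAG combines retrieval with generation.",
                  "BM25 is a sparse retrieval method.",
                  "FAISS enables fast dense vector search."]
print(generate("What is RAG?", retrieve("What is RAG?", knowledge_base)))
```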
Now that you're interested in learning RAG from basic to advanced, let me share a roadmap to learn RAG in just 5 days. Yes, you read that right: in just 5 days you can learn the RAG system. Let's go straight to the roadmap:
Day 1: Build a base for RAG
The main objective of day 1 is to understand RAG at a high level and explore its key components. Below is the breakdown of the day 1 topics.
RAG Overview:
- Understand what RAG does, why it matters, and where it fits in contemporary NLP.
- The main idea is that retrieval-augmented generation improves generative models by incorporating external information.
Key components:
- Learn about retrieval and generation separately.
- Discuss architectures for both retrieval (e.g., Dense Passage Retrieval (DPR), BM25) and generation (e.g., GPT, BART, T5); a quick retrieval example follows this list.
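As a first look at the retrieval half, here is a small sparse-retrieval sketch assuming the rank_bm25 package (pip install rank-bm25); the corpus and query are toy examples:

```python
# Sparse retrieval with BM25 via the rank_bm25 package.
from rank_bm25 import BM25Okapi

corpus = [
    "RAG augments generative models with retrieved documents.",
    "BM25 ranks documents by term overlap with the query.",
    "Dense retrievers embed text into vectors.",
]
tokenized_corpus = [doc.lower().split() for doc in corpus]
bm25 = BM25Okapi(tokenized_corpus)

query = "how does bm25 rank documents".split()
print(bm25.get_top_n(query, corpus, n=1))  # best-matching document
```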
Day 2: Building your own retrieval system
The main goal of day 2 is to implement a working retrieval system (even a basic one). The topics for day 2 are broken down below.
Going deeper into retrieval models:
- Learn more about dense retrieval vs. sparse retrieval:
- Dense: DPR, ColBERT.
- Sparse: BM25, TF-IDF.
- Discover the advantages and disadvantages of each method.
Implementing retrieval:
- Use libraries such as Elasticsearch for sparse retrieval or FAISS for dense retrieval to perform basic retrieval tasks.
- Follow Hugging Face's DPR tutorial to understand how to retrieve relevant documents from a knowledge base; a dense-retrieval sketch follows this list.
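Here is one way a basic dense-retrieval setup might look, pairing sentence-transformers embeddings with a FAISS index. The all-MiniLM-L6-v2 encoder is a lightweight stand-in for a full DPR setup:

```python
# Dense retrieval sketch: sentence-transformers for embeddings + FAISS for search.
import faiss
from sentence_transformers import SentenceTransformer

docs = ["RAG grounds generation in retrieved documents.",
        "FAISS performs fast nearest-neighbor search over vectors.",
        "Sparse methods like BM25 match on exact terms."]

model = SentenceTransformer("all-MiniLM-L6-v2")
emb = model.encode(docs, convert_to_numpy=True)
faiss.normalize_L2(emb)                    # normalize so inner product = cosine
index = faiss.IndexFlatIP(emb.shape[1])    # exact inner-product index
index.add(emb)

q = model.encode(["what does faiss do"], convert_to_numpy=True)
faiss.normalize_L2(q)
scores, ids = index.search(q, 2)           # top-2 nearest documents
print([docs[i] for i in ids[0]])
```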
Knowledge bases:
- Understand how knowledge bases are structured.
- Learn how to prepare data for retrieval tasks, such as corpus preprocessing and document indexing (a simple chunking sketch follows).
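As an illustration of corpus preprocessing, here is a simple word-based chunking sketch; the chunk size and overlap values are arbitrary examples, not recommendations:

```python
# Split raw documents into overlapping chunks so each indexed unit
# fits a retriever/encoder context window.

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks with some overlap between neighbors."""
    words = text.split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
    return chunks

document = "..."  # your raw corpus text goes here
index_units = chunk_text(document)
```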
Day 3: Fine-tune a generative model and see the results
The goal of day 3 is to fine-tune a generative model, observe the results, and understand the role of retrieval in augmenting generation. Below is the breakdown of the topics for day 3.
Going deeper into generative models:
- Examine pre-trained models such as T5, GPT-2, and BART.
- Learn the fine-tuning process for generation tasks, such as question answering or summarization.
Practice with generative models:
- Use Hugging Face's transformers library to fine-tune a model on a small dataset.
- Try generating answers to questions with the generative model, as in the sketch below.
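For a quick test of generation, here is a sketch using Hugging Face's transformers with the small flan-t5-small checkpoint as a placeholder; you would swap in your own fine-tuned model:

```python
# Generate an answer with a seq2seq model. flan-t5-small is just a small
# model that runs on CPU; replace it with your fine-tuned checkpoint.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "google/flan-t5-small"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

inputs = tokenizer("Answer the question: What does RAG stand for?",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```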
Exploring the interaction between retrieval and generation:
- Examine how retrieved documents are fed into the generative model.
- Recognize how retrieval improves the accuracy and quality of the generated responses; a simple prompt-building sketch follows.
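One common pattern for feeding retrieved documents to the generator is to concatenate them into the prompt ahead of the question. The template below is illustrative; real systems vary the format and truncate to fit the context window:

```python
# Build a generation prompt that injects retrieved documents as context.

def build_prompt(query: str, retrieved_docs: list[str]) -> str:
    context = "\n\n".join(f"[{i+1}] {d}" for i, d in enumerate(retrieved_docs))
    return (f"Use the context to answer the question.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")

print(build_prompt("What is DPR?", ["DPR is a dense passage retriever."]))
```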
Day 4: Implement a working RAG system
Now we are getting closer to the goal. The main goal of this day is to implement a working RAG system on a simple dataset and get familiar with parameter tuning. The topics for day 4 are broken down below.
Combining retrieval and generation:
- Combine the retrieval and generation components into a single system.
- Implement the interaction between the retrieval results and the generative model.
Using the LlamaIndex RAG pipeline:
- Check out the official documentation or a tutorial to learn how the RAG pipeline works.
- Set up and run an example using LlamaIndex's RAG pipeline, as sketched below.
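A minimal LlamaIndex example, following the library's starter pattern, might look like this. It assumes your documents sit in a local ./data folder and that an LLM backend is configured (by default LlamaIndex calls OpenAI, so OPENAI_API_KEY must be set):

```python
# Minimal LlamaIndex RAG pipeline: load docs, index them, query them.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)  # embeds and indexes the docs

query_engine = index.as_query_engine()              # retrieval + generation in one
response = query_engine.query("What does this corpus say about RAG?")
print(response)
```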
Practical experimentation:
- Start experimenting with different parameters, such as the number of documents retrieved, beam search strategies for generation, and temperature scaling.
- Try running the model on simple knowledge-intensive tasks; the snippet below varies the retrieval depth.
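Continuing with the LlamaIndex setup, here is a sketch of one such experiment: varying how many documents are retrieved per query via similarity_top_k. Other knobs, such as temperature or beam width, depend on the LLM backend you use:

```python
# Vary the number of retrieved documents per query and compare answers.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("data").load_data())
for top_k in (2, 5, 10):
    engine = index.as_query_engine(similarity_top_k=top_k)
    print(f"top_k={top_k}:", engine.query("What does this corpus cover?"))
```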
Day 5: Build and tune a more robust RAG system
The goal of this final day is to create a more robust RAG model by fine-tuning it and gain insight into the different types of RAG models you can explore. Below is the breakdown of the topics for day 5.
- Advanced fine-tuning: Examine how to optimize the retrieval and generation components for specific tasks in a given domain.
- Scaling: Use larger datasets and more complex knowledge bases to grow your RAG system.
- Performance optimization: Learn how to reduce memory consumption and speed up retrieval (e.g., using FAISS with GPU support).
- Evaluation: Gain the skills to evaluate RAG models on knowledge-intensive tasks, using metrics such as BLEU and ROUGE to score generated answers; see the sketch below.
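As a sketch of the evaluation step, assuming Hugging Face's evaluate package (pip install evaluate rouge_score), with toy strings in place of a real held-out QA set:

```python
# Score generated answers against references with ROUGE and BLEU.
import evaluate

predictions = ["RAG stands for retrieval-augmented generation."]
references = ["RAG means retrieval-augmented generation."]

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions,
                   references=[[r] for r in references]))
```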
Final note
If you follow this roadmap, you should be able to learn the RAG system in 5 days, depending on your learning pace. I hope you like this roadmap. I usually share generative AI content in the form of a carousel, or rather a bite-sized informative post. You can check out more carousels on my LinkedIn profile.
If you're looking to build your RAG from scratch, tune into our FREE course on how to build a RAG system using LlamaIndex!