The basic Retrieval-Augmented Generation (RAG) pipeline uses an encoder model to find documents that are similar to a given query.
This is also called semantic search because the encoder transforms the text into a high-dimensional vector representation (called an embedding) in which semantically similar texts are close to each other.
Before we had large language models (LLMs) to create these vector embeddings, BM25 was a very popular search algorithm. BM25 focuses on important keywords and looks for exact matches in the available documents. This approach is called keyword search.
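To make the keyword-matching idea concrete, here is a minimal sketch of BM25 scoring in plain Python. It assumes the commonly used Okapi BM25 formula with a smoothed IDF term, typical default parameters (`k1=1.5`, `b=0.75`), and a naive lowercase whitespace tokenizer. The function names `tokenize` and `bm25_scores` and the example documents are made up for illustration and are not taken from any library.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized_docs = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized_docs)
    avg_len = sum(len(doc) for doc in tokenized_docs) / n_docs

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for doc in tokenized_docs:
        doc_freq.update(set(doc))

    scores = []
    for doc in tokenized_docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in term_freq:
                continue  # only exact keyword matches contribute
            # Smoothed inverse document frequency (always positive).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term frequency, saturated by k1 and normalized by document length via b.
            tf = term_freq[term]
            denom = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / denom
        scores.append(score)
    return scores


# Toy corpus, invented for this example.
docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "stock markets fell sharply today",
]
scores = bm25_scores("cat mat", docs)
best = max(range(len(docs)), key=lambda i: scores[i])
print(docs[best])
```

Note that the second document scores zero even though it mentions "cats": without stemming, "cat" and "cats" are different keywords. This exact-match behavior is the limitation that semantic search addresses.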
If you want to take your RAG pipeline to the next level, you might want to try hybrid search. Hybrid search combines the benefits of keyword search and semantic search to improve search quality.
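One simple way to combine the two signals is a weighted sum of normalized scores. The sketch below assumes min-max normalization and a weighting parameter `alpha`. This is only one possible fusion strategy (rank-based methods such as reciprocal rank fusion are another) and not necessarily the one used later in this article. The helper names and the score values are hypothetical.

```python
def min_max_normalize(scores):
    # Rescale scores to [0, 1] so both score types are comparable.
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_scores(keyword_scores, semantic_scores, alpha=0.5):
    """alpha=1.0 is pure semantic search, alpha=0.0 is pure keyword search."""
    kw = min_max_normalize(keyword_scores)
    sem = min_max_normalize(semantic_scores)
    return [alpha * s + (1 - alpha) * k for k, s in zip(kw, sem)]


def rank(scores):
    # Document indices sorted from best to worst score.
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Made-up scores for three documents (illustration only).
keyword_scores = [2.1, 0.0, 0.4]       # e.g. BM25 scores
semantic_scores = [0.62, 0.71, 0.15]   # e.g. cosine similarities

combined = hybrid_scores(keyword_scores, semantic_scores, alpha=0.5)
ranking = rank(combined)
print(ranking)
```

With these toy numbers, the document that does well on both signals ends up first, even though it is not the top result for semantic search alone. Changing `alpha` shifts the balance between the two retrievers.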
In this article, we will cover the theory and implement all three search approaches in Python.
Table of Contents
· RAG Retrieval
∘ Keyword search with BM25
∘ Semantic search with dense embeddings
∘ Semantic search or hybrid search?
∘ Hybrid search
∘ Putting it all together
·…