Retrieval-augmented generation (RAG) has revolutionized open-domain question answering, allowing systems to produce human-like answers to a wide range of queries. At the heart of RAG is a retrieval module that scans a vast corpus to find relevant context passages, which are then passed to a neural generative module (often a pre-trained language model such as GPT-3) that formulates the final answer.
While this approach has been very effective, it is not without limitations.
One of its most critical components, vector search over embedded passages, has inherent weaknesses that can hinder the system’s ability to reason in a nuanced way. This is particularly evident when a question requires complex, multi-hop reasoning across multiple documents.
Vector search refers to the search for information using vector representations of data. It involves two key steps:
1. Encode Data into Vectors
First, the data to be searched is encoded into numerical vector representations. For text data such as passages or documents, this is done using embedding models such as BERT or RoBERTa. These models convert text into dense vectors of continuous numbers that represent semantic meaning. Images, audio, and other formats can also be encoded into vectors using appropriate deep learning models.
2. Vector Similarity Search
Once the data is encoded into vectors, searching means finding the stored vectors that are most similar to the vector representation of the search query. This relies on a similarity or distance metric, such as cosine similarity, to quantify how close two vectors are and to rank the results. The vectors with the smallest distance (highest similarity) are returned as the most relevant search results.
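The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production retriever: the encoding step is stubbed out with hand-written three-dimensional toy vectors (a real system would obtain them from an embedding model), and the names `TOY_INDEX`, `cosine_similarity`, and `vector_search` are hypothetical, chosen only for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for zero vectors, which have no direction
    return dot / (norm_a * norm_b)

def vector_search(query_vec, index, top_k=2):
    """Return the top_k (passage_id, score) pairs, most similar first."""
    scored = [(cosine_similarity(query_vec, vec), pid) for pid, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(pid, score) for score, pid in scored[:top_k]]

# Step 1 (stubbed): toy vectors standing in for the output of an embedding model.
# These numbers are assumptions made up for illustration.
TOY_INDEX = {
    "passage_a": [0.9, 0.1, 0.0],
    "passage_b": [0.1, 0.9, 0.1],
    "passage_c": [0.8, 0.3, 0.1],
}

# Step 2: embed the query (also stubbed) and rank passages by similarity.
query_vec = [1.0, 0.2, 0.0]
results = vector_search(query_vec, TOY_INDEX, top_k=2)

for passage_id, score in results:
    print(f"{passage_id}: {score:.3f}")
```

This sketch compares the query against every stored vector, which is fine for a handful of passages. Large corpora typically rely on approximate nearest-neighbor indexes instead of an exhaustive scan.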
The key advantage of vector search is the ability to search for semantic similarities, not just literal keyword matches. Vector representations capture conceptual meaning, allowing the identification of more relevant but linguistically distinct results. This allows for higher search quality compared to traditional keyword matching.
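To make the contrast with keyword matching concrete, here is a small sketch under stated assumptions. The two-dimensional vectors are hand-assigned so that the medical texts point in a similar direction; they are not produced by any real embedding model, and the function names are hypothetical.

```python
import math

def keyword_overlap(query, passage):
    """Count distinct lowercase words shared by the query and the passage."""
    return len(set(query.lower().split()) & set(passage.lower().split()))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hand-assigned toy vectors (an assumption): the first axis loosely stands
# for "medicine", the second for "automobiles".
PASSAGES = {
    "The physician prescribed medication for the patient": [0.9, 0.1],
    "The car needs a new engine": [0.1, 0.9],
}

QUERY_TEXT = "doctor gives drugs"
QUERY_VEC = [0.8, 0.2]  # toy vector for the query, also an assumption

# Literal matching finds nothing: the query shares no words with either passage.
keyword_scores = {text: keyword_overlap(QUERY_TEXT, text) for text in PASSAGES}

# Vector similarity still ranks the medical passage first.
best_by_vector = max(PASSAGES, key=lambda text: cosine_similarity(QUERY_VEC, PASSAGES[text]))

print(keyword_scores)
print(best_by_vector)
```

The keyword scorer returns zero for both passages because "doctor" and "physician" are different strings, while the vector comparison can still prefer the medical passage as long as the embedding places related concepts close together.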
However, transforming data into vectors and searching in a high-dimensional semantic space also has limitations. Balancing the advantages and disadvantages of vector search is an active area of research.
In this article, we’ll look at the limitations of vector search and explore why it has difficulty…