Essential metrics and methods to improve performance in retrieval, generation, and end-to-end pipelines.
Introduction
Among the applications of generative AI, retrieval-augmented generation (RAG) has become one of the most widely discussed. Unlike traditional search engines, which rely on keyword search to find information relevant to a given query, RAG goes a step further: it generates a complete answer to the question from the retrieved content.
The following figure illustrates RAG: documents of interest are encoded with an embedding model, then indexed and stored in a vector store. When a query is submitted, it is typically embedded in the same way, followed by two steps: (1) a retrieval step that searches for similar documents, and (2) a generation step that uses the retrieved content to synthesize a response.
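The flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production setup: the `embed` function is a toy bag-of-words stand-in for a real embedding model, `VectorStore` is a hypothetical in-memory list rather than an actual vector database, and `generate` only assembles the prompt that a language model would receive instead of calling one.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): term counts instead of a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorStore:
    """Hypothetical in-memory store: keeps (document, embedding) pairs."""

    def __init__(self):
        self.items = []

    def add(self, doc):
        # Indexing: encode the document once and store it with its vector.
        self.items.append((doc, embed(doc)))

    def search(self, query, k=2):
        # Retrieval step: embed the query the same way, rank by similarity.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]


def generate(query, context):
    """Generation step (placeholder): build the prompt an LLM would be given."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer the question using only this context:\n{joined}\nQuestion: {query}"


store = VectorStore()
for doc in [
    "RAG combines retrieval with text generation.",
    "Vector stores index document embeddings.",
    "Bananas are rich in potassium.",
]:
    store.add(doc)

query = "How does RAG use retrieval?"
retrieved = store.search(query, k=2)
prompt = generate(query, retrieved)
print(prompt)
```

Splitting the pipeline into `search` and `generate` mirrors the two steps in the figure, which also makes it possible to evaluate retrieval and generation separately or together.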