Exploring robust RAG development with LlamaPacks, Lighthouz AI, and Llama Guard
Since its launch in late November 2023, LlamaPacks has curated over 50 packs to help jump-start RAG pipeline development. Among them, many advanced retrieval packs have emerged. In this article, let's delve into seven advanced retrieval packs; see the diagram below.
We will carry out two steps:
- Given a use case, we will generate benchmarks using Lighthouz AutoBench and evaluate the packs with Lighthouz Eval Studio to determine which one best fits our use case.
- Once the winning pack is identified, we will add Llama Guard to the RAG pipeline, modify its custom taxonomy, re-evaluate it with Eval Studio, and watch how the evaluation score changes for categories such as prompt injection.
First, let's look at these seven advanced retrieval LlamaPacks to see how they work under the hood.
Hybrid fusion
This pack combines a vector retriever and a BM25 (Best Match 25) retriever via fusion. BM25 is a ranking function that estimates the relevance of documents to a given search query from term statistics, so that documents can be ranked in order of most likely relevance to the user's needs.
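To make the BM25 idea concrete, here is a minimal sketch of the standard Okapi BM25 scoring formula in plain Python. It is not the implementation the pack uses; the function name `bm25_scores`, the whitespace tokenization, the toy corpus, and the defaults `k1=1.5` and `b=0.75` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query (Okapi BM25)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for d in docs for term in set(d))

    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in query:
            if term not in term_freq:
                continue
            # Smoothed IDF; the "1 +" inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            # Term frequency saturates via k1 and is normalized by document length via b.
            norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avg_len))
            score += idf * norm
        scores.append(score)
    return scores


# Hypothetical toy corpus, tokenized by whitespace.
corpus = [
    "llama guard filters unsafe prompts".split(),
    "bm25 ranks documents by term relevance".split(),
    "vector retrievers rank documents by embedding similarity".split(),
]
scores = bm25_scores("bm25 relevance".split(), corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
```

Only the second document contains the query terms, so it is the only one with a non-zero score. This also shows why BM25 pairs well with a vector retriever: it rewards exact term matches, which embedding similarity can miss.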
Hybrid Fusion merges the results of the vector retriever and the BM25 retriever out of the box; you can plug in other retrievers by customizing the pack.
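As an illustration of what merging two ranked result lists can look like, the sketch below implements reciprocal rank fusion (RRF), a common fusion technique. Whether the pack uses exactly this formula internally is an assumption here; the function name, the constant `k=60`, and the document IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60, top_n=None):
    """Merge several ranked lists of document IDs into one ranking."""
    fused = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); higher ranks contribute more.
            fused[doc_id] += 1.0 / (k + rank)
    ordered = sorted(fused, key=lambda d: fused[d], reverse=True)
    return ordered[:top_n] if top_n is not None else ordered


# Hypothetical results from the two retrievers, best match first.
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused_hits = reciprocal_rank_fusion([vector_hits, bm25_hits], top_n=2)
# fused_hits == ["doc_b", "doc_a"]
```

Because RRF works on ranks rather than raw scores, it sidesteps the problem that BM25 scores and embedding similarities live on different scales. Documents that both retrievers rank highly rise to the top.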
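# imports for llama_index < 0.10; on newer versions, import from llama_index.core instead
from llama_index import SimpleDirectoryReader
from llama_index.llama_pack import download_llama_pack
from llama_index.node_parser import SimpleNodeParser

# load the documents and split them into nodes
# (RAG_DIRECTORY is the path to the source documents, assumed to be defined elsewhere)
documents = SimpleDirectoryReader(RAG_DIRECTORY).load_data()
node_parser = SimpleNodeParser.from_defaults()
nodes = node_parser.get_nodes_from_documents(documents)

# download and install dependencies
HybridFusionRetrieverPack = download_llama_pack(
    "HybridFusionRetrieverPack", "./hybrid_fusion_pack"
)

# create the pack
hybrid_fusion_pack = HybridFusionRetrieverPack(
    nodes,
    chunk_size=256,
    vector_similarity_top_k=2,
    bm25_similarity_top_k=2,
)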