In this tutorial, we will build a PDF question-answering chatbot tailored to medical and health-related content. We will take advantage of the flexible document-orchestration capabilities of LangChain together with the BioMistral LLM to process PDF documents into manageable text chunks. We will then encode these chunks with Hugging Face embeddings, capturing deep semantic relationships, and store them in a Chroma vector database for highly efficient retrieval. Finally, using a Retrieval-Augmented Generation (RAG) setup, we will integrate the retrieved context directly into our chatbot's responses, ensuring clear and well-grounded answers for users. This approach lets us rapidly sift through large volumes of medical PDFs, providing context-rich, accurate, and easy-to-understand insights.
Tools configuration
!pip install langchain sentence-transformers chromadb llama-cpp-python langchain_community pypdf
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain.text_splitter import CharacterTextSplitter,RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS, Chroma
from langchain_community.llms import LlamaCpp
from langchain.chains import RetrievalQA, LLMChain
import pathlib
import textwrap
from IPython.display import display
from IPython.display import Markdown
def to_markdown(text):
  text = text.replace('•', ' *')
  return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))
from google.colab import drive
drive.mount('/content/drive')
First, we install and configure the Python packages for document processing, embedding generation, local LLMs, and advanced retrieval workflows with LlamaCpp. We leverage langchain_community for PDF loading and text splitting, import RetrievalQA and LLMChain for question answering, and include a to_markdown helper plus the Google Drive mount utility.
API key access configuration
from google.colab import userdata
# Or use `os.getenv('HUGGINGFACEHUB_API_TOKEN')` to fetch an environment variable.
import os
from getpass import getpass
HF_API_KEY = userdata.get("HF_API_KEY")
os.environ["HF_API_KEY"] = HF_API_KEY
Here, we fetch the Hugging Face API key and set it as an environment variable in Google Colab. You can also rely on the HUGGINGFACEHUB_API_TOKEN environment variable to avoid exposing sensitive credentials directly in your code.
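As an optional variation (a sketch, not strictly required for the rest of this tutorial since the embedding model is downloaded and run locally), you can read the token from the standard HUGGINGFACEHUB_API_TOKEN environment variable, or prompt for it interactively, so the key never appears in the notebook:
# Illustrative alternative: pull the token from an environment variable, or ask for it securely.
hf_token = os.getenv("HUGGINGFACEHUB_API_TOKEN") or getpass("Enter your Hugging Face token: ")
os.environ["HUGGINGFACEHUB_API_TOKEN"] = hf_token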
Loading and extracting PDFs from a directory
loader = PyPDFDirectoryLoader('/content/drive/My Drive/Data')
docs = loader.load()
We use PyPDFDirectoryLoader to scan the specified folder for PDFs, extract their text into a list of documents, and lay the groundwork for tasks such as question answering, summarization, or keyword extraction.
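As a quick, optional sanity check (illustrative only), you can confirm how many page-level documents were loaded and peek at one of them; PyPDFDirectoryLoader returns one Document per PDF page, with its text in page_content and the source file and page number in metadata:
print(len(docs))                   # number of pages loaded across all PDFs
print(docs[0].metadata)            # e.g. {'source': '/content/drive/My Drive/Data/...', 'page': 0}
print(docs[0].page_content[:300])  # first 300 characters of the first page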
Splitting the loaded documents into manageable chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = text_splitter.split_documents(docs)
In this code snippet, RecursiveCharacterTextSplitter is applied to break each document in docs into smaller, more manageable segments of roughly 300 characters with a 50-character overlap, so context is preserved across chunk boundaries.
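Optionally, you can verify how the splitter behaved before indexing; this is only an illustrative check:
print(f"{len(docs)} pages -> {len(chunks)} chunks")  # each chunk is at most ~300 characters
print(chunks[0].page_content)                        # inspect one chunk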
Initializing Hugging Face embeddings
embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-base-en-v1.5")
Using HuggingFaceEmbeddings, we create an embedding object backed by the BAAI/bge-base-en-v1.5 model, which converts text into numerical vectors so that semantically similar passages end up close together in vector space.
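To see what the embedding object actually produces, here is a small illustrative check; bge-base-en-v1.5 outputs 768-dimensional vectors:
# Embed a sample question and inspect the resulting vector.
sample_vector = embeddings.embed_query("What are the symptoms of heart disease?")
print(len(sample_vector))  # 768 dimensions for bge-base-en-v1.5
print(sample_vector[:5])   # first few components of the vector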
Build a vector store and execute a search for similarity
vectorstore = Chroma.from_documents(chunks, embeddings)
query = "who is at risk of heart disease"
search = vectorstore.similarity_search(query)
to_markdown(search[0].page_content)
First, we build a Chroma vector store (Chroma.from_documents) from the text chunks and the embedding model defined above. Next, we issue the query "who is at risk of heart disease" and perform a similarity search against the stored embeddings. The top result (search[0].page_content) is then rendered as Markdown for a cleaner display.
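If you also want to gauge how close each match is, Chroma can return a distance score alongside every document (lower means more similar); this optional snippet prints the three nearest chunks with their scores:
# Optional: similarity search with scores (lower distance = closer match in Chroma).
results_with_scores = vectorstore.similarity_search_with_score(query, k=3)
for doc, score in results_with_scores:
    print(round(score, 3), doc.page_content[:120])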
Create a retriever and obtain relevant documents
retriever = vectorstore.as_retriever(
search_kwargs={'k': 5}
)
retriever.get_relevant_documents(query)
We convert the Chroma vector store into a retriever (vectorstore.as_retriever) that efficiently fetches the top five most relevant documents for a given query.
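As an optional check, you can also inspect where the retrieved chunks come from, since each Document keeps its source file and page number in metadata:
# Optional: show the provenance of the top-5 retrieved chunks.
for doc in retriever.get_relevant_documents(query):
    print(doc.metadata.get("source"), "page", doc.metadata.get("page"))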
Initializing the BioMistral-7B model with LlamaCpp
llm = LlamaCpp(
model_path= "/content/drive/MyDrive/Model/BioMistral-7B.Q4_K_M.gguf",
temperature=0.3,
max_tokens=2048,
top_p=1)
We configure an open-source LLM using LlamaCpp, pointing it at a pre-downloaded GGUF model file. We also set generation parameters such as temperature, max_tokens, and top_p, which control randomness, the maximum number of tokens generated, and the nucleus-sampling strategy, respectively.
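Before wiring the model into the RAG chain, a quick direct call (purely illustrative, and assuming the GGUF file above has already been downloaded to your Drive) confirms that the model loads and generates text:
# Optional sanity check: query the model directly, without any retrieved context.
print(llm.invoke("In one sentence, what is hypertension?"))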
Setting up a Retrieval-Augmented Generation (RAG) chain with a custom prompt
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser
from langchain.prompts import ChatPromptTemplate
template = """
<|context|>
You are an AI assistant that follows instructions extremely well.
Please be truthful and give direct answers.
{context}
<|user|>
{query}
<|assistant|>
"""
prompt = ChatPromptTemplate.from_template(template)
rag_chain = (
{'context': retriever, 'query': RunnablePassthrough()}
| prompt
| llm
| StrOutputParser()
)
With the code above, we set up a RAG pipeline using the LangChain framework. It creates a custom prompt with instructions and placeholders, incorporates the retriever to supply context, and leverages the language model to generate answers. The flow is defined as a chain of operations: RunnablePassthrough to pass the query straight through, ChatPromptTemplate for prompt construction, the LLM for response generation, and finally StrOutputParser to produce a clean text string.
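Note that the retriever emits a list of Document objects; if you prefer the prompt's {context} slot to receive plain text rather than the stringified list, one optional refinement (shown here as a sketch) is to join the chunk contents before they reach the prompt:
from langchain.schema.runnable import RunnableLambda

def format_docs(docs):
    # Join the retrieved chunks into a single plain-text context block.
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {'context': retriever | RunnableLambda(format_docs), 'query': RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)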
Invoking the RAG chain to answer a health-related query
response = rag_chain.invoke("Why should I care about my heart health?")
to_markdown(response)
Now, we call the RAG chain built above with a user query. The query is passed to the retriever, which fetches the relevant context from the document collection and feeds it into the LLM to generate a concise, accurate response.
In conclusion, by integrating BioMistral via LlamaCpp and taking advantage of LangChain's flexibility, we are able to build a medical RAG chatbot with context awareness. From chunk-based indexing to seamless retrieval-augmented generation, it streamlines the process of mining large volumes of PDF data for relevant insights. Users receive clear and easily readable answers because the final responses are formatted in Markdown. This design can be extended or adapted to other domains, ensuring scalability and precision in knowledge retrieval across diverse documents.
Use the Colab notebook here.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an artificial intelligence media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a broad audience. The platform boasts over 2 million monthly views, illustrating its popularity among readers.