Last month, ai Bloks announced the open source release of its development framework, llmware, for creating enterprise-grade LLM-based workflow applications. Today, ai Bloks takes another big step on the path to delivering a next-generation RAG framework with the launch of the DRAGON (Delivering RAG on…) series of 7B-parameter LLMs, designed for enterprise RAG workflows, with the specific goal of answering fact-based questions over complex business and legal documents.
As more companies look to implement scalable RAG systems using their own private information, several needs are increasingly recognized:
- A unified framework that integrates LLM models with a set of surrounding workflow capabilities (e.g. document parsing, embedding, prompt management, source verification, audit trail);
- Specialized, smaller, high-quality LLMs that have been optimized for fact-based question answering and business workflows; and
- Open source, cost-effective, private implementation with flexibility and customization options.
To satisfy these needs, LLMWare is releasing seven DRAGON models in open source in its Hugging Face repository, all of which have been extensively fine-tuned for RAG and built on leading base models, with solid production-grade readiness for enterprise RAG workflows.
All of the DRAGON models have been evaluated using the llmware rag-instruct-benchmark, with full test results and methodology provided alongside the models in the repository. Each of the DRAGON models achieves an accuracy of between 90 and 100 on a diverse set of 100 core test questions, with a strong foundation for preventing hallucinations and for identifying when a question cannot be answered from a passage (for example, a “not found” classification).
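Downstream, that “not found” behavior can be treated as a first-class signal rather than a failure. The sketch below is a minimal, illustrative guard in plain Python; the helper name and the exact “Not Found” string are assumptions, not part of the release, so check the model card for the canonical marker before relying on it:

```python
# Sketch: route a RAG model's answer based on a "not found" classification.
# NOT_FOUND_MARKERS and classify_answer are illustrative assumptions; the
# exact string a given model emits should be confirmed from its model card.

NOT_FOUND_MARKERS = ("not found", "no answer found")

def classify_answer(raw_answer: str) -> dict:
    """Label a model answer as grounded or unanswerable-from-passage."""
    normalized = raw_answer.strip().lower().rstrip(".")
    if any(marker in normalized for marker in NOT_FOUND_MARKERS):
        # The model declined to answer from the passage -- surface that
        # explicitly instead of passing a hallucinated answer downstream.
        return {"status": "not_found", "answer": None}
    return {"status": "answered", "answer": raw_answer.strip()}

print(classify_answer("Not Found."))
print(classify_answer("The total invoice amount is $25,000."))
```

Separating “the passage does not contain the answer” from “the model answered” is what makes the fact-checking and audit-trail steps mentioned above tractable in an enterprise workflow.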
The DRAGON model family joins two other LLMWare RAG model collections: BLING and Industry-BERT. The BLING models are smaller RAG-specialized LLMs (1B–3B) that do not require a GPU and can run on a developer’s laptop. Since the training methodology is very similar, the intention is that a developer can start with a local BLING model, running it on a laptop, and then seamlessly drop in a DRAGON model for higher production performance. All DRAGON models have been designed for private deployment on a single enterprise-grade GPU server, so that enterprises can deploy an end-to-end RAG system securely and privately within their own security zone.
This set of open source RAG-specialized models, combined with LLMWare’s core development framework and out-of-the-box integration with open source private-cloud instances of Milvus and MongoDB, provides an end-to-end solution for RAG. With a few lines of code, a developer can automate the ingestion and parsing of thousands of documents, attach embedding vectors, run generative inference against state-of-the-art LLMs, and perform fact-checking and source verification, all in a private cloud, and in some cases even from a single developer’s laptop.
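As a rough illustration of the inference step in such a pipeline, the sketch below assembles retrieved passages and a question into a single instruct-style prompt. The `<human>:`/`<bot>:` wrapper and the `llmware/dragon-mistral-7b-v0` model name are assumptions based on the published model cards, and the ingestion/retrieval stages are stood in for by a plain list of passages rather than real llmware API calls:

```python
# Sketch: build the instruct-style prompt for a DRAGON-style RAG inference
# call. The "<human>: ... <bot>:" wrapper is an assumption taken from the
# model cards; verify against the actual card before use.

def build_rag_prompt(passages: list[str], question: str) -> str:
    """Concatenate retrieved passages with the user question."""
    context = "\n".join(passages)
    return f"<human>: {context}\n{question}\n<bot>:"

# In a real private deployment, the prompt would be fed to a locally hosted
# model, e.g. (model name assumed, not executed here):
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("llmware/dragon-mistral-7b-v0")
#   model = AutoModelForCausalLM.from_pretrained("llmware/dragon-mistral-7b-v0")

prompt = build_rag_prompt(
    ["The lease term is 36 months, beginning January 1, 2024."],
    "What is the lease term?",
)
print(prompt)
```

Because BLING and DRAGON models share this training style, swapping the laptop-scale model for the production 7B model is, in this sketch, just a change of model name.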
According to Darren Oberst, CEO of ai Bloks, “We believe LLMs enable a new automation workflow in the enterprise, and our vision for LLMWare is to bring together the specialized models, data flow, and all enabling components into a unified open source framework to enable enterprises to quickly customize and deploy LLM-based automation at scale.”
For more information, see the llmware GitHub repository at www.github.com/llmware-ai/llmware.git.
To access the models directly, see the llmware organization page on Hugging Face at www.huggingface.co/llmware.
Thanks to ai Bloks for the educational/thought leadership article. ai Bloks has supported us in this content/article.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an AI media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is technically sound and easily understandable to a wide audience. The platform has more than 2 million monthly visits, illustrating its popularity among readers.