LLMWare.ai, a pioneer in the implementation and fine-tuning of small language models (SLMs), today announced the release of Model Depot on Hugging Face, one of the largest collections of SLMs optimized for Intel PCs. With over 100 models covering use cases such as chat, coding, math, function calling, and embeddings, Model Depot aims to provide the open source AI community with an unprecedented collection of the latest SLMs optimized for Intel-based PCs in Intel's OpenVINO and ONNX formats.
Using the Model Depot collection together with LLMWare's open source library, which provides a complete toolset for end-to-end development of AI-enabled workflows, developers can create retrieval-augmented generation (RAG) and agent-based workflows using SLMs in OpenVINO format for users of Intel hardware. OpenVINO is an open source library for optimizing and deploying inference of deep learning models, including large and small language models. Specifically designed to reduce resource demands for efficient deployment on a variety of platforms, including on-device and AI-enabled PCs, OpenVINO supports model inference on Intel CPUs, GPUs, and NPUs.
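The RAG workflow described above can be sketched in plain Python. Everything below (function names, the naive keyword-overlap retriever, the stub generator) is a hypothetical illustration of the pattern, not the llmware or OpenVINO API; in a real deployment the `generate` callable would wrap an OpenVINO-format SLM and `retrieve` would query a proper vector index.

```python
# Minimal RAG loop sketch: retrieve context, build a prompt, generate an answer.
# All names here are illustrative, not the llmware or OpenVINO API.

def retrieve(question, passages, top_k=2):
    """Naive keyword-overlap retrieval standing in for a real vector index."""
    q_words = set(question.lower().split())
    def score(passage):
        return len(q_words & set(passage.lower().split()))
    return sorted(passages, key=score, reverse=True)[:top_k]

def build_prompt(question, context):
    """Assemble a grounded prompt from the retrieved passages."""
    return ("Answer using only the context below.\n"
            "Context:\n" + "\n".join(context) +
            f"\nQuestion: {question}\nAnswer:")

def rag_answer(question, passages, generate):
    """Full loop: retrieve -> prompt -> generate (generate wraps a local SLM)."""
    context = retrieve(question, passages)
    return generate(build_prompt(question, context))

# Demo with a stub generator standing in for a local SLM:
passages = [
    "OpenVINO optimizes inference on Intel CPUs, GPUs, and NPUs.",
    "ONNX is an open format for AI models.",
]
answer = rag_answer("What optimizes inference on Intel CPUs?",
                    passages,
                    generate=lambda prompt: "OpenVINO (stub answer)")
print(answer)  # OpenVINO (stub answer)
```

Swapping the stub `generate` for a real local model call is the only change needed to run this pattern on-device.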
Similarly, ONNX provides an open source format for AI models, both deep learning and traditional machine learning, with a current focus on the capabilities needed for inference. ONNX is supported by many frameworks, tools, and hardware platforms, and aims to enable interoperability between them.
In a recent technical report, LLMWare found that deploying 4-bit quantized small language models (1B–9B parameters) in the OpenVINO format maximizes model inference performance on Intel AI PCs. When tested on a Dell laptop with an Intel Core Ultra 9 (Meteor Lake) processor using the 1.1B-parameter BLING-Tiny-Llama model, the quantized OpenVINO format delivered inference speeds up to 7.6x faster than PyTorch and up to 7.5x faster than GGUF.
The comparison consistently uses LLMWare's 21-question RAG test; the processing time reported is the total execution time across all 21 questions.
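In the spirit of that methodology, a total-wall-time benchmark over a fixed question set can be sketched as follows; the function names and the stub model are hypothetical, and real runs would substitute callables wrapping PyTorch, GGUF, and OpenVINO builds of the same model.

```python
import time

def run_benchmark(model_fn, questions):
    """Run every question through model_fn; return (answers, total seconds)."""
    start = time.perf_counter()
    answers = [model_fn(q) for q in questions]
    return answers, time.perf_counter() - start

def speedup(baseline_seconds, optimized_seconds):
    """E.g. PyTorch total time / OpenVINO total time -> 'x times faster'."""
    return baseline_seconds / optimized_seconds

# Demo with a trivial stub model over a 21-question set:
questions = [f"question {i}" for i in range(21)]
answers, elapsed = run_benchmark(lambda q: q.upper(), questions)
print(len(answers), "questions in", round(elapsed, 4), "seconds")
```

Comparing the `elapsed` totals of two backends with `speedup` yields figures of the "7.6x faster" form quoted above.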
Detailed information about LLMWare's testing methodology can be found in the whitepaper.
The goal of LLMWare is to provide a powerful abstraction layer for working with various inference engines. By supporting OpenVINO, ONNX, and llama.cpp on one platform, developers can match the highest-performing model format to the specific hardware of their intended users. With Model Depot, developers targeting Intel PCs can access SLMs that are specifically optimized for inference on Intel hardware.
Model Depot provides OpenVINO and ONNX support for today's most popular SLMs, including Microsoft Phi-3, Mistral, Llama, Yi, and Qwen, as well as LLMWare's specialized function-calling SLIM models designed for multi-step workflows and its DRAGON and BLING model families specialized for RAG. With these SLMs, LLMWare lets developers easily and seamlessly create workflows that improve productivity and maximize the local capabilities of AI-enabled PCs.
Equipped with powerful integrated GPUs and NPUs that provide the hardware capability to run AI applications on the device, AI PCs enable businesses to deploy many lightweight AI applications locally without exposing sensitive data to, or requiring data copies in, external systems. This unlocks the twin benefits of increased security and significant cost savings.
LLMWare also recently announced its strategic collaboration with Intel with the limited release of Model HQ for private preview. Designed specifically for AI PCs with Intel Core Ultra processors, Model HQ provides a turnkey no-code kit to run, build, and deploy AI-enabled applications, with an integrated UI/UX and low-code agent workflows for easy application creation and deployment. With built-in chatbot and document search and analysis features, the app works out of the box, with the ability to launch custom workflows right on the device. Model HQ also ships with many enterprise-ready safety features, such as Model Vault for model safety checks, Model Safety Monitor for toxicity and bias detection, a hallucination detector, AI explainability data, an audit and compliance toolkit, privacy filters, and much more.
“At LLMWare, we strongly believe in lowering the center of gravity of AI to enable local, private, decentralized, and self-hosted deployment, with high-quality models and optimized data pipelines for secure, controlled, and cost-optimized deployments of lightweight RAG, Agent, and Chat applications customized for companies of all sizes. We are excited to open source the Model Depot collection to expand access to OpenVINO- and ONNX-packaged models to support the launch of AI-enabled PCs in the coming months,” said Darren Oberst, CTO of LLMWare.
“The rise of generative AI unlocks new application experiences that were not available with previous generations of data processing algorithms. The unique combination of a powerful AI PC platform and optimization software like OpenVINO is a way to get the best performance for local and private LLM deployment without thinking about the optimization details. LLMWare's platform goes a step further by enabling the use of software building blocks and pre-trained models to implement data processing within the final application and reduce time to market. The combination of the OpenVINO platform and LLMWare truly unlocks the best-performing generative AI capabilities at the application edge,” said Yury Gorbachev, Intel Fellow and OpenVINO Architect at Intel.
Please visit LLMWare's GitHub and Hugging Face pages for its comprehensive open source library and collection of small language models, as well as llmware.ai for the latest white papers and blogs.
Thanks to AI Block for the educational/thought leadership article. AI Block has supported us in this content.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an AI media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable to a wide audience. The platform has more than 2 million monthly visits, illustrating its popularity among readers.