KIVI – A Plug-and-Play 2-bit KV Cache Quantization Algorithm without the need for any tuning
by Technical Terrence Team, 04/16/2024
Large language models (LLMs) are incredibly useful for tasks like generating text or answering questions. However, they face a big ...
Meet LLM-Blender: A Novel Joint Framework for Consistently Superior Performance by Leveraging the Various Strengths of Multiple Open Source Large Language Models (LLMs) 07/20/2023