Despite their remarkable achievements, modern large language models (LLMs) incur exorbitant computational and memory footprints. Recently, several works have shown significant success in compressing (pruning and quantizing) LLMs without any training or data, achieving 50-60% sparsity and reducing the bit width to 3 or 4 bits per weight, with negligible perplexity degradation over the uncompressed baseline. As recent research efforts focus on developing increasingly sophisticated compression methods, our work takes a step back and re-evaluates the effectiveness of existing SoTA compression methods, which rely on a fairly simple and widely questioned metric: perplexity (even for dense LLMs). We introduce the Knowledge-Intensive Compressed LLM Benchmark (LLM-KICK), a collection of carefully curated tasks that redefines the evaluation protocol for compressed LLMs, which often remain closely aligned with their dense counterparts in perplexity even though perplexity fails to capture subtle changes in their true capabilities. LLM-KICK reveals many favorable merits and unfortunate plights of current SoTA compression methods: all pruning methods suffer significant performance degradation, sometimes at trivial sparsity ratios (e.g., 25-30%), and fail under N:M sparsity on knowledge-intensive tasks; current quantization methods are more successful than pruning; yet pruned LLMs, even at 50% sparsity or more, remain robust in-context retrieval and summarization systems; among others. LLM-KICK is designed to holistically assess compressed LLMs' abilities in language understanding, reasoning, generation, in-context retrieval, in-context summarization, and more. We hope our study can foster the development of better LLM compression methods.
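For reference, the perplexity referred to above is, under its standard definition (our notation; the abstract itself does not spell it out), the exponentiated average negative log-likelihood of a token sequence under the model, an aggregate statistic that can stay nearly unchanged while task-level capabilities shift:

$$
\mathrm{PPL}(x_{1:T}) \;=\; \exp\!\left(-\frac{1}{T}\sum_{t=1}^{T}\log p_\theta\!\left(x_t \mid x_{<t}\right)\right)
$$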