Despite their remarkable achievements, modern large language models (LLMs) incur exorbitant computational and memory footprints. Recently, several works have shown significant success in compressing (pruning and quantizing) LLMs without any training or data, achieving 50-60% sparsity and reducing the bit width to 3 or 4 bits per weight, with negligible perplexity degradation over the uncompressed baseline. As recent research efforts focus on developing increasingly sophisticated compression methods, our work takes a step back and re-evaluates the effectiveness of existing SoTA compression methods, which rely on a fairly simple and widely questioned metric: perplexity (even for dense LLMs). We introduce the Knowledge-Intensive Compressed LLM Benchmark (LLM-KICK), a collection of carefully curated tasks that redefines the evaluation protocol for compressed LLMs, which often remain closely aligned with their dense counterparts in perplexity even though perplexity fails to capture subtle changes in their true capabilities. LLM-KICK reveals many favorable merits and unfortunate plights of current SoTA compression methods: all pruning methods suffer significant performance degradation, sometimes at trivial sparsity ratios (e.g., 25-30%), and fail under N:M sparsity on knowledge-intensive tasks; current quantization methods are more successful than pruning; yet pruned LLMs, even at 50% sparsity or more, remain robust in-context retrieval and summarization systems; among others. LLM-KICK is designed to holistically assess compressed LLMs' abilities in language understanding, reasoning, generation, in-context retrieval, in-context summarization, and more. We hope our study can foster the development of better LLM compression methods.
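For reference, the perplexity referred to above is, under its standard definition (our notation; the abstract itself does not spell it out), the exponentiated average negative log-likelihood of a token sequence under the model, an aggregate statistic that can stay nearly unchanged while task-level capabilities shift:

$$
\mathrm{PPL}(x_{1:T}) \;=\; \exp\!\left(-\frac{1}{T}\sum_{t=1}^{T}\log p_\theta\!\left(x_t \mid x_{<t}\right)\right)
$$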