Large language models (LLMs) are periodically updated to improve performance, usually through changes to data or architecture. Within the upgrade process, developers typically prioritize improving overall performance metrics and pay less attention to maintaining backward compatibility. Instance-level performance degradation (instance regression) from one model version to the next can interfere with a user's mental model of a particular language model's capabilities. Requiring users to adapt their mental model with each update can lead to dissatisfaction, especially when the new model degrades on a known use case relative to a previous version (model update regression). We find that when pre-trained LLM base models are updated, optimized user-facing downstream task adapters experience negative changes: instances that were previously predicted correctly are now predicted incorrectly. We observe model update regression between different model versions on a diverse set of tasks and models, even when the subsequent task training procedure remains identical. We argue for the importance of maintaining model update compatibility during upgrades and present evaluation metrics designed specifically for generative tasks, while remaining applicable to discriminative tasks. We propose a training strategy to minimize the extent of instance regression during model updates, based on training a compatibility adapter that can enhance task-tuned language models. We show that negative changes are reduced by up to 40%, for example, when updating Llama 1 to Llama 2 with our proposed method.
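To make the notion of instance-level regression concrete, the sketch below computes a negative flip rate: the fraction of instances that the previous model version predicted correctly but the updated version predicts incorrectly. This is a common formulation from the backward-compatibility literature and is shown only as an illustrative assumption of how such a quantity can be computed for exact-match outputs; it is not necessarily the exact metric proposed in this work. The function and variable names (`negative_flip_rate`, `old_preds`, `new_preds`, `targets`) are hypothetical.

```python
from typing import Sequence


def negative_flip_rate(
    old_preds: Sequence[str],
    new_preds: Sequence[str],
    targets: Sequence[str],
) -> float:
    """Fraction of instances the old model got right but the new model gets wrong.

    Uses exact-match correctness; for free-form generation a task-specific
    correctness check (e.g. normalized answer match) would replace `==`.
    """
    assert len(old_preds) == len(new_preds) == len(targets)
    flips = sum(
        1
        for old, new, gold in zip(old_preds, new_preds, targets)
        if old == gold and new != gold  # previously correct, now incorrect
    )
    return flips / len(targets)


# Example: one of three instances regresses after the update -> rate = 1/3
print(negative_flip_rate(["a", "b", "c"], ["a", "x", "c"], ["a", "b", "c"]))
```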