Recent advances in diffusion models have significantly improved tasks such as image, video, and 3D generation, with pre-trained models such as Stable Diffusion being instrumental. However, adapting these models to new tasks efficiently remains a challenge. Existing fine-tuning approaches (additive, reparameterized, and selective) suffer from drawbacks such as additional inference latency, overfitting, or complex parameter selection. One proposed solution is to exploit "temporarily ineffective" parameters (those with minimal current impact on the model but the capacity to learn new information) by re-activating them, improving the model's generative capabilities without the drawbacks of existing methods.
Researchers from Shanghai Jiao Tong University and Youtu Lab, Tencent, propose SaRA, a fine-tuning method for pre-trained diffusion models. Inspired by model pruning, SaRA reuses "temporarily ineffective" parameters with small absolute values, optimizing them through sparse matrices while preserving the model's prior knowledge. The method employs a nuclear-norm-based low-rank training scheme and a progressive parameter tuning strategy to avoid overfitting. SaRA's memory-efficient non-structural backpropagation reduces memory costs by 40% compared to LoRA. Experiments on Stable Diffusion models show SaRA's superior performance on multiple tasks, requiring only a single line of code modification to implement.
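To make the core idea concrete, the sketch below (not the authors' code; the threshold value and function names are illustrative assumptions) shows one way to mark low-magnitude parameters as trainable via sparse masks and restrict gradient updates to them:

```python
# Hedged sketch: fine-tune only "temporarily ineffective" (small-magnitude) parameters.
import torch
import torch.nn as nn

def build_sparse_masks(model: nn.Module, threshold: float = 1e-3):
    """Mark entries whose absolute value falls below `threshold` as trainable."""
    return {name: (p.detach().abs() < threshold) for name, p in model.named_parameters()}

def apply_sparse_grads(model: nn.Module, masks: dict):
    """Zero the gradients of all entries outside the sparse set,
    so only the selected low-magnitude parameters are updated."""
    for name, p in model.named_parameters():
        if p.grad is not None:
            p.grad.mul_(masks[name].to(p.grad.dtype))

# Typical loop: build masks once, then after loss.backward() call
# apply_sparse_grads(model, masks) before optimizer.step().
```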
Diffusion models such as Stable Diffusion excel at image generation tasks but are limited by their large parameter counts, which make full fine-tuning expensive. Methods such as ControlNet, LoRA, and DreamBooth address this by adding external networks or fine-tuning a small set of weights to enable controlled generation or adaptation to new tasks. Parameter-efficient fine-tuning approaches such as Additive Fine-Tuning (AFT) and Reparameterized Fine-Tuning (RFT) introduce adapters or low-rank matrices, while Selective Fine-Tuning (SFT) updates only a chosen subset of existing parameters. SaRA improves on these methods by reusing ineffective parameters, preserving the model architecture, reducing memory costs, and improving fine-tuning efficiency without adding inference latency.
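For contrast with SaRA's approach, here is a minimal LoRA-style reparameterization sketch (a generic illustration of RFT, not any specific library's API): the pre-trained weight stays frozen and a low-rank update B·A is learned beside it.

```python
# Illustrative LoRA-style wrapper around a frozen nn.Linear layer.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                      # freeze pre-trained weight
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen path plus learned low-rank correction (x A^T) B^T.
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scale
```

Note that this adds extra weights per adapted layer, whereas SaRA keeps the original parameter set and simply reuses its low-impact entries.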
In diffusion models, "ineffective" parameters, identified by their small absolute values, have minimal impact on performance when pruned. Experiments on Stable Diffusion models (v1.4, v1.5, v2.0, v3.0) showed that setting parameters below a certain threshold to zero sometimes even improves generative quality. This ineffectiveness stems from the randomness of the optimization process rather than the model structure, and fine-tuning can make these parameters effective again. SaRA leverages these temporarily ineffective parameters for fine-tuning, using low-rank constraints and progressive fine-tuning to avoid overfitting and improve efficiency, significantly reducing memory and computational costs compared to existing methods such as LoRA.
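A simple probe of this claim can be run by zeroing small-magnitude weights and inspecting generation quality. The sketch below is an assumption about how such an experiment might be set up, not the paper's exact procedure:

```python
# Hedged sketch of the pruning probe: zero every weight below a magnitude
# threshold and report what fraction of parameters was removed.
import copy
import torch

@torch.no_grad()
def prune_small_weights(model, threshold: float = 1e-3):
    pruned = copy.deepcopy(model)
    total, zeroed = 0, 0
    for param in pruned.parameters():
        keep = param.abs() >= threshold
        zeroed += (~keep).sum().item()
        total += param.numel()
        param.mul_(keep)                 # zero out the "ineffective" entries
    print(f"zeroed {zeroed / total:.1%} of parameters")
    return pruned
```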
The proposed method was evaluated on tasks such as backbone fine-tuning, image personalization, and video generation using FID, CLIP, and VLHI metrics. It outperformed existing fine-tuning approaches (LoRA, AdaptFormer, LT-SFT) on all datasets, showing stronger task-specific learning and better preservation of prior knowledge. Generated images and videos showed better consistency and fewer artifacts, and the method reduced memory usage and training time by more than 45%. Ablation studies highlighted the importance of progressive parameter tuning and the low-rank constraint, while correlation analysis indicated that SaRA acquires new knowledge more effectively than competing methods, which translates into better task performance.
SaRA is a parameter-efficient fine-tuning method that reuses the lowest-impact parameters of a pre-trained model. Its nuclear-norm-based low-rank loss avoids overfitting, while progressive parameter tuning improves fine-tuning efficiency. Its unstructured backpropagation reduces memory costs and can benefit other selective fine-tuning methods. SaRA significantly improves generative capabilities on tasks such as domain transfer and image editing, outperforming methods such as LoRA, and requires only a one-line code modification for easy integration, demonstrating superior performance on Stable Diffusion 1.5, 2.0, and 3.0 across multiple applications.
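The nuclear-norm component can be pictured as a penalty on the sum of singular values of the learned update matrices, which discourages high-rank changes. The snippet below is a minimal sketch of such a regularizer; the weighting `lam` and the restriction to 2-D updates are assumptions, not the authors' exact formulation.

```python
# Sketch of a nuclear-norm (sum-of-singular-values) penalty on learned updates.
import torch

def nuclear_norm_penalty(delta_weights, lam: float = 1e-4):
    penalty = torch.zeros((), device=delta_weights[0].device)
    for delta in delta_weights:
        if delta.ndim == 2:                          # apply only to matrix-shaped updates
            penalty = penalty + torch.linalg.svdvals(delta).sum()
    return lam * penalty

# Usage idea: total_loss = task_loss + nuclear_norm_penalty(list_of_updates)
```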