This article presents AIM, a collection of vision models pre-trained with an autoregressive objective. These models are inspired by their textual counterparts, i.e., large language models (LLMs), and exhibit similar scaling properties. Specifically, we highlight two key findings: (1) the performance of the visual features scales with both the model capacity and the quantity of data, and (2) the value of the objective function correlates with the performance of the model on downstream tasks. We illustrate the practical implication of these findings by pre-training a 7 billion parameter AIM on 2 billion images, which reaches 84.0% on ImageNet-1k with a frozen trunk. Interestingly, even at this scale, we observe no sign of saturation in performance, suggesting that AIM potentially represents a new frontier for training large-scale vision models. AIM pre-training is similar to LLM pre-training and does not require any image-specific strategy to stabilize training at scale.
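To make the autoregressive objective concrete, the sketch below shows one plausible form of next-patch prediction as a pixel-regression loss. It is illustrative only, not the authors' released code: the `model` argument (assumed to be a causally-masked vision transformer trunk), the raster-scan patch ordering, and the 14-pixel patch size are assumptions for the example.

```python
import torch
import torch.nn.functional as F

def autoregressive_image_loss(model, images, patch_size=14):
    """Next-patch prediction: regress each patch from all earlier patches."""
    B, C, H, W = images.shape
    # Split each image into non-overlapping patches in raster order,
    # yielding a sequence of shape (B, num_patches, C * patch_size**2).
    patches = images.unfold(2, patch_size, patch_size)
    patches = patches.unfold(3, patch_size, patch_size)
    patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(B, -1, C * patch_size**2)

    # Shift by one patch: the model predicts patch t from patches < t.
    inputs, targets = patches[:, :-1], patches[:, 1:]
    preds = model(inputs)  # causal masking assumed inside the trunk
    return F.mse_loss(preds, targets)  # pixel-level l2 regression
```

Under these assumptions, the training loop has the same shape as next-token prediction in an LLM, with image patches standing in for tokens, which is consistent with the article's point that no image-specific stabilization tricks are needed at scale.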