The recent fusion of convolutional and transformer designs has led to steady improvements in model accuracy and efficiency. In this work, we present FastViT, a hybrid vision transformer architecture that achieves a state-of-the-art latency-accuracy trade-off. To this end, we introduce a novel token mixing operator, RepMixer, a core building block of FastViT, which uses structural reparameterization to lower memory access cost by removing skip connections in the network. We further apply train-time overparameterization and large kernel convolutions to boost accuracy, and show empirically that these choices have minimal effect on latency. We show that our model is 3.5 times faster than CMT, a recent state-of-the-art hybrid transformer architecture, 4.9 times faster than EfficientNet, and 1.9 times faster than ConvNeXt on a mobile device at the same accuracy on the ImageNet dataset. At similar latency, our model achieves 4.2% better Top-1 accuracy on ImageNet than MobileOne. Our model consistently outperforms competing architectures across several tasks: image classification, detection, segmentation, and 3D mesh regression, with significant improvements in latency on both a mobile device and a desktop GPU. Furthermore, our model is highly robust to out-of-distribution samples and corruptions, improving over competing robust models.
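As a rough illustration of the reparameterization idea (a minimal sketch, not the paper's implementation), the PyTorch snippet below shows a hypothetical RepMixer-style block: at training time it computes x + DWConv(BN(x)) with a skip connection, and at inference the BatchNorm and the skip connection are folded into a single depthwise convolution, so the block runs as one branch-free operator. The class name RepMixerSketch, the 3x3 depthwise mixer, and the reparameterize() helper are illustrative assumptions.

```python
import torch
import torch.nn as nn


class RepMixerSketch(nn.Module):
    """Hypothetical RepMixer-style token mixer (illustrative, not the paper's code).

    Train time:  y = x + dwconv(bn(x))   (skip connection present)
    Inference:   y = reparam_dwconv(x)   (skip connection folded away)
    """

    def __init__(self, dim, kernel_size=3):
        super().__init__()
        self.dim, self.k = dim, kernel_size
        self.norm = nn.BatchNorm2d(dim)
        # Depthwise conv acts as the spatial token mixer.
        self.mixer = nn.Conv2d(dim, dim, kernel_size,
                               padding=kernel_size // 2,
                               groups=dim, bias=True)
        self.reparam = None  # filled in by reparameterize()

    def forward(self, x):
        if self.reparam is not None:
            # Single branch-free op: fewer memory accesses at inference.
            return self.reparam(x)
        return x + self.mixer(self.norm(x))

    @torch.no_grad()
    def reparameterize(self):
        bn, conv = self.norm, self.mixer
        # Per-channel affine form of BN (eval mode): bn(x) = a * x + b.
        a = bn.weight / (bn.running_var + bn.eps).sqrt()
        b = bn.bias - bn.running_mean * a
        # Fold BN (applied *before* the depthwise conv) into the conv:
        # conv(a*x + b) = (a * w) * x + (b * sum(w) + bias) per channel.
        w = conv.weight * a.reshape(-1, 1, 1, 1)
        bias = conv.bias + b * conv.weight.sum(dim=(1, 2, 3))
        # Fold the skip connection: x + conv'(x) = (conv' + identity)(x),
        # i.e. add 1 at the kernel center of each depthwise filter.
        c = self.k // 2
        w[:, 0, c, c] += 1.0
        self.reparam = nn.Conv2d(self.dim, self.dim, self.k,
                                 padding=c, groups=self.dim, bias=True)
        self.reparam.weight.copy_(w)
        self.reparam.bias.copy_(bias)
```

In this sketch, calling reparameterize() after training (with the module in eval mode, so BatchNorm uses its running statistics) yields the same outputs as the two-branch training graph up to numerical precision, while executing a single fused depthwise convolution with no skip connection.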