This is part 3 of my new multi-part series "Towards Mamba State Space Models for Images, Videos and Time Series."
Mamba, the model that might just replace the powerful Transformer, has come a long way from the initial idea of using state space models (SSMs) in deep learning.
Mamba adds selectivity to state space models, resulting in Transformer-like performance while maintaining the subquadratic work complexity of SSMs. Its hardware-aware selective scan is 40x faster than a standard implementation, and Mamba can achieve 5x higher inference throughput than a Transformer.
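To make "selectivity" concrete before we dive in, here is a preview sketch. A discrete SSM processes a sequence $x_1, x_2, \ldots$ through a linear recurrence over a hidden state $h_t$:

$$
h_t = \bar{A}\,h_{t-1} + \bar{B}\,x_t, \qquad y_t = C\,h_t
$$

Classic SSMs keep the parameters fixed across all time steps. Mamba instead makes $B$, $C$ and the discretization step $\Delta$ functions of the current input,

$$
B_t = s_B(x_t), \qquad C_t = s_C(x_t), \qquad \Delta_t = s_\Delta(x_t),
$$

so the model can decide, token by token, what to write into and read out of its state ($A$ itself stays input-independent, but its discretized form $\bar{A}_t = \exp(\Delta_t A)$ varies through $\Delta_t$). We will derive each of these pieces step by step later in this article.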
Join me in this deep dive into Mamba, where we’ll discover how selectivity addresses the limitations of older SSMs, how Mamba overcomes the new hurdles that come with those changes, and how we can incorporate Mamba into a modern deep learning architecture.