Most state-of-the-art weather and climate models rely heavily on numerical simulations of large systems of differential equations that use the laws of physics to describe different aspects of the atmosphere. As a result, running them is extremely computationally expensive, especially when simulating atmospheric phenomena at fine spatial and temporal resolution. Despite their strong performance, these models are recognized to have several shortcomings and limitations that apply to both short- and long-term time horizons.
The amount of data that can be collected using satellites, radar, and various weather sensors has also increased considerably as a result of recent technological advances. However, there are several restrictions on how current numerical weather and climate models can handle large-scale data. To counter this problem, machine learning (ML) models can offer an alternative trade-off that takes advantage of data and compute scalability. These data-driven methods use deep neural networks to learn a functional mapping that solves a downstream forecasting or projection task. Efforts to extend deep learning systems to short- and medium-range weather forecasting have already shown outstanding success, often matching the most advanced numerical weather models.
However, most ML models lack the generality of numerical models because they are trained for specific spatiotemporal targets using carefully selected climate data sets. To build a more general model for weather and climate science, Microsoft researchers developed ClimaX. ClimaX is a generalizable transformer-based model for weather and climate science that can be trained on heterogeneous data sets spanning different variables, spatiotemporal coverage, and physical groundings. The foundation model can be fine-tuned to address a wide range of weather and climate tasks, allowing it to be computationally efficient while remaining general. The model will be available shortly for use in academia and research.
ClimaX uses the pretraining and fine-tuning paradigm, which has recently gained popularity for training foundation models. For pretraining, rather than limiting themselves to conventional homogeneous weather data sets, the researchers used climate simulation data sets, which are generated from the underlying laws of physics. The benefit of doing so is the abundance of data available from the climate simulations run by numerous groups. The researchers used CMIP6-derived climate data sets for this purpose. The pretrained ClimaX can then be fine-tuned to handle various weather and climate tasks, including those that involve atmospheric variables and spatiotemporal scales not seen during pretraining.
ClimaX is a multidimensional image-to-image translation architecture based on Vision Transformers (ViT). However, ClimaX differs from standard ViT architectures in two important aspects: variable tokenization and variable aggregation. For common image data, ViT tokenization divides the whole input into equal patches and flattens them, so every patch mixes all channels. For weather data, the researchers instead tokenize each variable separately. Since weather data sets can differ in which variables they contain, variable tokenization treats variables as separate modalities, which allows more flexible training even with inconsistent data sets. However, variable tokenization has two drawbacks. It produces sequences whose length grows linearly with the number of input variables, which is computationally inefficient. Also, the input will likely contain tokens from many variables with widely different physical groundings. Therefore, the researchers proposed variable aggregation, a cross-attention operation that outputs an embedding vector of equal size for each spatial location.
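The two ideas can be illustrated with a short NumPy sketch. This is not the ClimaX implementation: the function names, the tensor shapes, the single-head attention, and the use of one shared query vector are assumptions made for illustration, and random arrays stand in for learned parameters. It shows that per-variable tokenization yields V × N tokens, while cross-attention over the variable axis reduces this to N tokens, one per spatial location.

```python
import numpy as np

rng = np.random.default_rng(0)

def tokenize_per_variable(x, patch, weights):
    """Embed each variable's patches separately.

    x:       (V, H, W) array, one 2D field per variable
    weights: (V, patch*patch, D), a separate projection per variable
    returns: (V, N, D) tokens, where N = (H // patch) * (W // patch)
    """
    V, H, W = x.shape
    h, w = H // patch, W // patch
    # Cut every field into non-overlapping patch x patch tiles, then flatten each tile.
    tiles = x.reshape(V, h, patch, w, patch).transpose(0, 1, 3, 2, 4)
    tiles = tiles.reshape(V, h * w, patch * patch)
    # Apply variable v's own projection to variable v's tiles.
    return np.einsum("vnp,vpd->vnd", tiles, weights)

def aggregate_variables(tokens, query, w_k, w_v):
    """Single-head cross-attention over the variable axis: (V, N, D) -> (N, D)."""
    V, N, D = tokens.shape
    keys = tokens @ w_k                      # (V, N, D)
    values = tokens @ w_v                    # (V, N, D)
    # One score per (location, variable) pair, from a shared query vector (assumed).
    scores = np.einsum("d,vnd->nv", query, keys) / np.sqrt(D)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)        # softmax over variables
    # Weighted sum of the V value vectors at each location.
    return np.einsum("nv,vnd->nd", attn, values)

# Hypothetical sizes: 3 variables on an 8 x 16 grid, 4 x 4 patches, embedding width 8.
V, H, W, patch, D = 3, 8, 16, 4, 8
x = rng.standard_normal((V, H, W))
weights = rng.standard_normal((V, patch * patch, D))
query = rng.standard_normal(D)
w_k = rng.standard_normal((D, D))
w_v = rng.standard_normal((D, D))

tokens = tokenize_per_variable(x, patch, weights)
agg = aggregate_variables(tokens, query, w_k, w_v)

print("tokens before aggregation:", tokens.shape[0] * tokens.shape[1])
print("tokens after aggregation: ", agg.shape[0])
```

Because the aggregated sequence length no longer depends on V, the cost of the downstream transformer layers stays fixed when variables are added or dropped, which is the efficiency argument made above.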
Weather forecasting, climate projection, and climate downscaling were among the downstream tasks on which the researchers evaluated ClimaX's performance. Even when pretrained at lower resolutions and with smaller compute budgets, ClimaX performs better than other deep learning baselines.
Microsoft developed ClimaX with the intention of advancing data-driven weather and climate modeling by enabling universal access to cutting-edge machine learning techniques that handle a variety of challenges involving weather and climate variables. The team explained that they view ClimaX as a first step toward tackling many of these types of tasks. More details about this research can be found below.
Check out the Paper and the Microsoft blog. All credit for this research goes to the researchers of this project.
Khushboo Gupta is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Goa. She is passionate about the fields of machine learning, natural language processing, and web development. She likes to learn more about the technical field by participating in various challenges.