The discovery of new materials is crucial to addressing pressing global challenges, such as climate change and next-generation computing. However, existing computational and experimental approaches face significant limitations in efficiently exploring the vast chemical space. While AI has become a powerful tool for materials discovery, the scarcity of publicly available data and open, pre-trained models has become a major bottleneck. Density functional theory (DFT) calculations, essential for studying the stability and properties of materials, are computationally expensive, restricting their usefulness in exploring large materials search spaces.
Researchers at Meta Fundamental AI Research (FAIR) have presented the Open Materials 2024 (OMat24) dataset, which contains over 110 million DFT calculations, making it one of the largest publicly available datasets in this domain. They also introduce EquiformerV2, a state-of-the-art graph neural network (GNN) trained on the OMat24 dataset, which achieves leading results on the Matbench Discovery leaderboard. The dataset includes diverse atomic configurations sampled from both equilibrium and non-equilibrium structures. The accompanying pre-trained models can predict properties such as ground-state stability and formation energies with high accuracy, providing a solid foundation for the broader research community.
The OMat24 dataset comprises more than 118 million atomic structures labeled with energies, forces, and cell stresses. These structures were generated using techniques such as Boltzmann sampling, ab initio molecular dynamics (AIMD), and rattled structure relaxation. The dataset emphasizes non-equilibrium structures, ensuring that models trained on OMat24 are well suited to dynamic and far-from-equilibrium properties. The elemental composition of the dataset spans much of the periodic table, with a focus on bulk inorganic materials. EquiformerV2 models trained on OMat24 and other datasets, such as MPtrj and Alexandria, have demonstrated high efficiency. For example, models trained with additional denoising objectives showed improvements in predictive performance.
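The "rattled" non-equilibrium structures mentioned above can be pictured as equilibrium geometries whose atomic positions are perturbed with random Gaussian displacements before being recomputed with DFT. A minimal, illustrative sketch in plain Python follows; the function name, noise scale, and toy coordinates are assumptions for illustration, not the authors' actual generation pipeline:

```python
import random

def rattle_structure(positions, stdev=0.05, seed=42):
    """Perturb each Cartesian coordinate with Gaussian noise.

    positions: list of (x, y, z) tuples in Angstroms.
    stdev: standard deviation of the displacement (hypothetical value).
    Returns a new list of perturbed positions.
    """
    rng = random.Random(seed)
    return [tuple(c + rng.gauss(0.0, stdev) for c in atom) for atom in positions]

# Toy three-atom fragment (positions in Angstroms, purely illustrative)
equilibrium = [(0.0, 0.0, 0.0), (2.1, 0.0, 0.0), (0.0, 2.1, 0.0)]
rattled = rattle_structure(equilibrium)
```

Each rattled geometry would then be labeled with a fresh DFT single-point calculation, which is how a dataset comes to emphasize off-equilibrium configurations.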
When evaluated on the Matbench Discovery benchmark, the EquiformerV2 model trained on OMat24 achieved an F1 score of 0.916 and a mean absolute error (MAE) of 20 meV/atom, establishing new benchmarks for predicting material stability. These results were significantly better than those of other models in the same category, highlighting the advantage of pre-training on a large, diverse dataset like OMat24. Furthermore, models trained solely on the smaller MPtrj dataset also performed well thanks to effective data-augmentation strategies such as denoising non-equilibrium structures (DeNS). Detailed metrics showed that OMat24 pre-trained models outperform conventional models in accuracy, particularly for non-equilibrium configurations.
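For context on the two headline metrics: F1 summarizes how well a model classifies structures as stable versus unstable, and MAE measures the average error of predicted energies per atom. A self-contained sketch with made-up toy numbers (the labels and energies below are illustrative, not benchmark data):

```python
def f1_score(true_labels, pred_labels):
    """F1 = 2*TP / (2*TP + FP + FN) for binary stability labels (1 = stable)."""
    tp = sum(1 for t, p in zip(true_labels, pred_labels) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(true_labels, pred_labels) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(true_labels, pred_labels) if t == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn)

def mae(true_vals, pred_vals):
    """Mean absolute error, e.g. in meV/atom."""
    return sum(abs(t - p) for t, p in zip(true_vals, pred_vals)) / len(true_vals)

# Toy example: stability labels and energies above hull (meV/atom)
y_true = [1, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1]
e_true = [0.0, -15.0, 40.0, -5.0, 60.0]
e_pred = [5.0, -35.0, 55.0, -10.0, 45.0]

print(f1_score(y_true, y_pred))  # 2*2 / (2*2 + 1 + 1) = 0.666...
print(mae(e_true, e_pred))       # (5 + 20 + 15 + 5 + 15) / 5 = 12.0 meV/atom
```

An F1 of 0.916 and an MAE of 20 meV/atom on the real benchmark thus mean the model rarely misclassifies stability and predicts energies within roughly chemical-screening accuracy.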
The introduction of the OMat24 dataset and corresponding models represents a major advance in AI-assisted materials science. The models can predict critical properties, such as formation energies, with a high degree of accuracy, making them very useful in accelerating materials discovery. Importantly, this open-source release allows the research community to build on these advances, further enhancing the role of AI in addressing global challenges through new materials discoveries.
The OMat24 dataset and models, available on Hugging Face along with checkpoints for the pre-trained models, provide an essential resource for AI researchers in materials science. Meta's FAIR Chem team has made these resources available under permissive licenses, allowing for broader adoption and use. Additionally, an update from the OpenCatalyst team on X (x.com/OpenCatalyst/status/1847323490547876324) provides more context on how these models are pushing the limits of material-stability prediction.
Check out the Paper. All credit for this research goes to the researchers of this project.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of artificial intelligence for social good. His most recent endeavor is the launch of an AI media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is technically sound and easily understandable to a wide audience. The platform has more than 2 million monthly visits, illustrating its popularity among readers.