For more than 100 years, scientists have been using x-ray crystallography to determine the structure of crystalline materials such as metals, rocks, and ceramics.
This technique works best when the crystal is intact, but in many cases scientists only have a powdered version of the material, which contains random fragments of the crystal, making it harder to reconstruct the overall structure.
MIT chemists have devised a new generative artificial intelligence model that can greatly facilitate determining the structures of these powdered crystals. The prediction model could help researchers characterize materials for use in batteries, magnets, and many other applications.
“Structure is the first thing you need to know about any material. It’s important for superconductivity, it’s important for magnets, it’s important for knowing what photovoltaic system you’ve created. It’s important for any application you can think of that’s centered on materials,” says Danna Freedman, the Frederick George Keyes Professor of Chemistry at MIT.
Freedman and Jure Leskovec, a professor of computer science at Stanford University, are the senior authors of the new study, which appears today in the Journal of the American Chemical Society. MIT graduate student Eric Riesel and Yale University undergraduate student Tsach Mackey are the paper's lead authors.
Distinctive patterns
Crystalline materials, including metals and most other inorganic solids, are made up of networks consisting of many identical, repeating units. These units can be thought of as “boxes” with a distinctive shape and size, with atoms precisely arranged inside them.
When x-rays are shined onto these lattices, they diffract off the atoms at different angles and intensities, revealing information about the positions of the atoms and the bonds between them. Since the early 20th century, this technique has been used to analyze materials, including biological molecules that have a crystalline structure, such as DNA and some proteins.
For materials that only exist as powdered crystals, solving these structures becomes much more difficult because the fragments do not have the full 3D structure of the original crystal.
“The precise lattice still exists, because what we call a powder is actually a collection of microcrystals. So it has the same lattice as a large crystal, but they are in a completely random orientation,” Freedman says.
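To illustrate why random orientation loses information, here is a minimal sketch that applies Bragg's law to an idealized simple cubic lattice. Because the microcrystals point in every direction, a powder measurement keeps only the diffraction angle of each reflection, not its direction in 3D. The function name `cubic_powder_peaks`, the 4.0 Å lattice parameter, and the choice of the Cu Kα wavelength are assumptions made for illustration; peak intensities and systematic absences are ignored.

```python
import math

# Cu K-alpha wavelength in angstroms, a common laboratory x-ray source
# (an assumption here; the article does not name a wavelength).
CU_K_ALPHA = 1.5406


def cubic_powder_peaks(a, wavelength=CU_K_ALPHA, max_index=3):
    """Return sorted 2-theta peak positions (degrees) for a simple cubic lattice.

    a          -- lattice parameter in angstroms
    wavelength -- x-ray wavelength in angstroms
    max_index  -- largest Miller index to consider
    """
    peaks = set()
    for h in range(max_index + 1):
        for k in range(h + 1):
            for l in range(k + 1):
                n = h * h + k * k + l * l
                if n == 0:
                    continue  # (0, 0, 0) is not a reflection
                d = a / math.sqrt(n)        # plane spacing for a cubic lattice
                s = wavelength / (2.0 * d)  # Bragg's law: sin(theta) = lambda / (2d)
                if s <= 1.0:
                    two_theta = 2.0 * math.degrees(math.asin(s))
                    peaks.add(round(two_theta, 4))
    return sorted(peaks)


if __name__ == "__main__":
    for two_theta in cubic_powder_peaks(4.0):
        print(f"{two_theta:8.4f}")
```

Reflections such as (3, 0, 0) and (2, 2, 1) share the same plane spacing, so they collapse onto a single angle in the list. That overlap is one reason a one-dimensional powder pattern is harder to invert than single-crystal data.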
X-ray diffraction patterns exist for thousands of these materials whose structures remain unsolved. To try to decipher those structures, Freedman and her colleagues trained a machine-learning model on data from a database called the Materials Project, which contains more than 150,000 materials. First, they fed tens of thousands of these materials into an existing model that can simulate what x-ray diffraction patterns would look like. Then, they used those patterns to train their artificial intelligence model, which they call Crystalyze, to predict structures based on the x-ray patterns.
The model divides the structure prediction process into several subtasks. First, it determines the size and shape of the lattice “box” and which atoms will fit into it. Next, it predicts the arrangement of the atoms within the box. For each diffraction pattern, the model generates several possible structures, which can be tested by feeding the structures into a model that determines the diffraction patterns for a given structure.
“Our model is generative AI, meaning it generates something it hasn’t seen before, and that allows us to generate multiple different guesses,” Riesel says. “We can make a hundred guesses, and then we can predict what the powder pattern should look like for our guesses. And then if the input looks exactly like the output, then we know we got it right.”
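The generate-and-check loop described above can be sketched roughly as follows. This is not the researchers' code: the `simulate` callable stands in for whatever model turns a candidate structure into a diffraction pattern, the patterns are plain lists of binned intensities, and cosine similarity is an assumed scoring choice rather than the metric used in the study.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length intensity lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(observed, candidates, simulate):
    """Score each candidate by how well its simulated pattern matches `observed`.

    observed   -- measured pattern as a list of binned intensities
    candidates -- iterable of candidate structures (any objects)
    simulate   -- hypothetical callable: candidate -> simulated pattern
    Returns a list of (score, candidate) pairs, best match first.
    """
    scored = [(cosine_similarity(observed, simulate(c)), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


if __name__ == "__main__":
    # Toy stand-in for a simulator: look up a precomputed pattern by name.
    toy_patterns = {
        "guess_A": [0.0, 1.0, 0.0, 0.5, 0.0],
        "guess_B": [1.0, 0.0, 0.3, 0.0, 0.2],
        "guess_C": [0.2, 0.2, 0.2, 0.2, 0.2],
    }
    observed = [1.0, 0.0, 0.3, 0.0, 0.2]
    for score, name in rank_candidates(observed, toy_patterns, toy_patterns.get):
        print(f"{name}: {score:.3f}")
```

A candidate whose simulated pattern reproduces the measured one scores close to 1, which mirrors the check Riesel describes: if the output matches the input, the guess is accepted.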
Solving unknown structures
The researchers tested the model on several thousand simulated diffraction patterns from the Materials Project. They also tested it on more than 100 experimental diffraction patterns from the RRUFF database, which contains powder x-ray diffraction data for nearly 14,000 naturally occurring crystalline minerals and which they had held out of the training data. On these data, the model was accurate about 67 percent of the time. Next, they began testing the model on diffraction patterns that had not been solved before. These data came from the Powder Diffraction File, which contains diffraction data for more than 400,000 solved and unsolved materials.
Using their model, the researchers devised structures for more than 100 of these previously unsolved patterns. They also used their model to discover structures for three materials that Freedman's lab created by forcing elements that do not react at atmospheric pressure to form compounds under high pressure. This approach can be used to generate new materials that have radically different crystal structures and physical properties, even though their chemical composition is the same.
Graphite and diamond, both made of pure carbon, are examples of such materials. The materials Freedman has developed, which contain bismuth and another element, could be useful in designing new materials for permanent magnets.
“We found many new materials from existing data and, most importantly, we solved three previously unknown structures from our lab that comprise the first new binary phases of these element combinations,” Freedman says.
Being able to determine the structures of powdered crystalline materials could help researchers working in almost any field related to materials, according to the MIT team, which has published a web interface for the model at crystalyze.org.
The research was funded by the U.S. Department of Energy and the National Science Foundation.