Novel machine learning technique to identify structural similarities and trends in materials

 

Low-dimensional uniform manifold approximation and projection (UMAP) showing symmetry-aware image similarity from a database of more than 25,000 piezoresponse force microscopy images. Credit: Joshua Agar/Lehigh University

A novel neural network to understand symmetry, speed materials research.

Using a large, unstructured dataset gleaned from 25,000 images, scientists demonstrate a novel machine learning technique to identify structural similarities and trends in materials for the first time.

Understanding structure-property relations is a key goal of materials research, according to Joshua Agar, a faculty member in Lehigh University’s Department of Materials Science and Engineering. Yet no metric currently exists for understanding the structure of materials, because structure is complex and multidimensional.

Artificial neural networks, a type of machine learning, can be trained to identify similarities, and even to correlate parameters such as structure and properties, but there are two major challenges, says Agar. One is that the majority of the vast amounts of data generated by materials experiments is never analyzed. This is largely because the images produced by scientists in laboratories all over the world are rarely stored in a usable manner and are not usually shared with other research teams. The second challenge is that neural networks are not very effective at learning symmetry and periodicity (how periodic a material’s structure is), two features of utmost importance to materials researchers.

Now, a team led by Lehigh University has developed a novel machine learning approach that creates similarity projections, enabling researchers, for the first time, to search an unstructured image database and identify trends. Agar and his collaborators developed and trained a neural network model to include symmetry-aware features and then applied their method to a set of 25,133 piezoresponse force microscopy images collected on diverse materials systems over five years at the University of California, Berkeley. The results: they were able to group similar classes of material together and observe trends, forming a basis by which to start to understand structure-property relationships.

“One of the novelties of our work is that we built a special neural network to understand symmetry and we use that as a feature extractor to make it much better at understanding images,” says Agar, a lead author of the paper where the work is described: “Symmetry-Aware Recursive Image Similarity Exploration for Materials Microscopy,” published today in npj Computational Materials. In addition to Agar, authors include, from Lehigh University: Tri N. M. Nguyen, Yichen Guo, Shuyu Qin and Kylie S. Frew and, from Stanford University: Ruijuan Xu. Nguyen, a lead author, was an undergraduate at Lehigh University and is now pursuing a Ph.D. at Stanford.

The team was able to arrive at projections by employing Uniform Manifold Approximation and Projection (UMAP), a non-linear dimensionality reduction technique. This approach, says Agar, allows researchers to learn “…in a fuzzy way, the topology and the higher-level structure of the data and compress it down into 2D.”

“If you train a neural network, the result is a vector, or a set of numbers that is a compact descriptor of the features. Those features help classify things so that some similarity is learned,” says Agar. “What’s produced is still rather large in space, though, because you might have 512 or more different features. So, then you want to compress it into a space that a human can comprehend such as 2D, or 3D―or, maybe, 4D.”
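The compression step Agar describes can be illustrated with a toy example. The sketch below is not the team's method: it uses a Gaussian random projection, a simple linear stand-in for UMAP (which is non-linear), to map made-up 512-dimensional feature vectors down to 2D. Every name and number in it is a hypothetical chosen for illustration.

```python
import random


def random_projection_matrix(n_features, n_components=2, seed=0):
    """Build a fixed Gaussian matrix that maps n_features -> n_components."""
    rng = random.Random(seed)
    scale = 1.0 / n_components ** 0.5
    return [
        [rng.gauss(0.0, 1.0) * scale for _ in range(n_components)]
        for _ in range(n_features)
    ]


def project(vectors, matrix):
    """Multiply each feature vector by the projection matrix."""
    n_components = len(matrix[0])
    return [
        [sum(x * row[j] for x, row in zip(vec, matrix)) for j in range(n_components)]
        for vec in vectors
    ]


# Hypothetical data: five images, each described by 512 learned features.
rng = random.Random(42)
features = [[rng.gauss(0.0, 1.0) for _ in range(512)] for _ in range(5)]

matrix = random_projection_matrix(512, n_components=2)
points_2d = project(features, matrix)  # one (x, y) pair per image
```

A linear projection like this only roughly preserves distances. A non-linear method such as UMAP, the technique named in the article, is designed to keep neighboring points together, which is what makes the resulting 2D map useful for browsing.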

By doing this, Agar and his team were able to take the 25,000-plus images and group very similar classes of material together.

“Similar types of structures in material are semantically close together and also certain trends can be observed particularly if you apply some metadata filters,” says Agar. “If you start filtering by who did the deposition, who made the material, what were they trying to do, what is the material system…you can really start to refine and get more and more similarity. That similarity can then be linked to other parameters like properties.”
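The idea of searching by similarity and then narrowing with metadata filters can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the records, feature vectors, metadata keys (`material`, `grower`) and the choice of cosine similarity are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, records, top_k=3, **filters):
    """Rank records by similarity to the query, keeping only those matching the filters."""
    candidates = [
        r for r in records
        if all(r["meta"].get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(query, r["features"]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical image records: a short feature vector plus metadata per image.
records = [
    {"id": "img_001", "features": [1.0, 0.0, 0.0], "meta": {"material": "A", "grower": "alice"}},
    {"id": "img_002", "features": [0.9, 0.1, 0.0], "meta": {"material": "A", "grower": "bob"}},
    {"id": "img_003", "features": [0.0, 1.0, 0.0], "meta": {"material": "B", "grower": "alice"}},
    {"id": "img_004", "features": [0.8, 0.0, 0.6], "meta": {"material": "B", "grower": "bob"}},
]

query = [1.0, 0.0, 0.0]
unfiltered = [r["id"] for r in most_similar(query, records, top_k=2)]
filtered = [r["id"] for r in most_similar(query, records, top_k=2, material="B")]
```

Without a filter, the two closest images both come from material system "A". Restricting the search to material "B" surfaces the nearest matches within that system instead, which mirrors the refinement Agar describes.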

This work demonstrates how improved data storage and management could rapidly accelerate materials discoveries. According to Agar, of particular value are images and data generated by failed experiments.

“No one publishes failed results and that’s a big loss because then a few years later someone repeats the same line of experiments,” says Agar. “So, you waste really good resources on an experiment that likely won’t work.”

Instead of losing all of that information, the data that has already been collected could be used to generate new trends that have not been seen before and speed discovery exponentially, says Agar. 

This study is the first “use case” of an innovative new data-storage enterprise housed at Oak Ridge National Laboratory called DataFed. DataFed, according to its website, is “…a federated, big-data storage, collaboration, and full-life-cycle management system for computational science and/or data analytics within distributed high-performance computing (HPC) and/or cloud-computing environments.”

“My team at Lehigh has been part of the design and development of DataFed in terms of making it relevant for scientific use cases,” says Agar. “Lehigh is the first live implementation of this fully-scalable system. It’s a federated database so anyone can pop up their own server and be tied to the central facility.”

Agar is the machine learning expert on Lehigh University’s Presidential Nano/Human Interface Initiative team. The interdisciplinary initiative, integrating the social sciences and engineering, seeks to transform the ways that humans interact with instruments of scientific discovery to accelerate innovations.

“One of the key goals of Lehigh’s Nano/Human Interface Initiative is to put relevant information at the fingertips of experimentalists to provide actionable information that allows more informed decision-making and accelerates scientific discovery,” says Agar. “Humans have limited capacity for memory and recollection. DataFed is a modern-day Memex; it provides a memory of scientific information that can easily be found and recalled.”

“DataFed provides an especially powerful and invaluable tool for researchers engaged in interdisciplinary team science, allowing researchers who are collaborating on team projects located in different/remote locations to access each other’s raw data. This is one of the key components of our Lehigh Presidential Nano/Human Interface (NHI) Initiative for accelerating scientific discovery,” says Martin P. Harmer, Alcoa Foundation Professor in Lehigh’s Department of Materials Science and Engineering and Director of the Nano/Human Interface Initiative.

Reference: “Symmetry-aware recursive image similarity exploration for materials microscopy” 8 October 2021, npj Computational Materials.

DOI: 10.1038/s41524-021-00637-y


The work described was supported by the Lehigh University Nano/Human Interface Presidential Initiative and a National Science Foundation grant under TRIPODS + X.
