Machine Learning in Materials Science: A Second Computational Revolution?

Let’s get this straight: I’m an old-school computational chemist. I’ve done my time with semi-empirical calculations, Hartree-Fock, MP2, played around with DFT and molecular dynamics, even dabbled in docking – you get the drift. I’m no stranger to sweating over making calculations work, scouring odd forums to troubleshoot, and diving into a sea of articles to pick the right functionals and basis sets. And then, there’s the waiting game – days, sometimes weeks, for DFT or molecular dynamics to churn out results. Headaches? Had my fair share, for sure. But does it have to be this way? Perhaps it’s time for a change. I’m talking about weaving more machine learning into our computational materials science toolkit. Skeptical? I was too. But the more I read, the more I’m convinced. So, here’s a thought-provoking question for you, dear reader: is machine learning in materials science kicking off the second computational revolution? Inspired by Marques’ review article on Nature, today we’re diving into this question. Ready to roll? Let’s go!

The journey of machine learning in materials science is just at the beginning. Yet, its impact is already clear. | From Atoms To Words | Arturo Robertazzi

Machine Learning in Materials Science: Why?

Imagine a world where machines learn like humans, where algorithms are not just tools, but maestros of data, conducting symphonies of numbers and patterns. This is no longer the realm of science fiction; it’s becoming our reality.

These digital virtuosos are seamlessly integrated into our daily lives, enhancing everything from speech recognition to web searches, even refining the mundane task of filtering emails. Machine learning’s influence extends deep into science, mastering complex tasks like protein structure prediction and weather forecasting, even potentially disrupting drug discovery.

So, what about materials science?

As we have seen in previous stories on From Atoms To Words, the search for a new material often resembles a treasure hunt, heavily reliant on trial and error, human intuition, and serendipity. Think of Rosenberg’s cisplatin or Geim’s graphene.

But then came the first computational revolution, marked by powerful computational methods like DFT, Monte Carlo simulations, and molecular dynamics. When allied with experiments, these techniques significantly accelerated materials research.

This was a game-changer, setting the stage for something even bigger: yes, the emergence of machine learning. Smart algorithms are now coming to the forefront, guiding us towards what Marques describes as the second computational revolution.

But why do we need machine learning in materials science?

Because the sheer number of materials we can design and create is astronomically high, potentially exceeding a googol (10¹⁰⁰). No amount of trial and error, or even brute-force simulation, can cover a space that large. This means that the application of machine learning is not just an academic curiosity but essential for overcoming the barriers to the next breakthroughs.

I get it, here we go again: the AI buzz, right? Well, let’s add a splash of reality here. The truth is, machine learning’s journey in materials science is just at the beginning. Most applications we see today are still pretty basic, focusing on small datasets or composition spaces often within the reach of our old-school computational methods.

But here’s the exciting part: machine learning is gradually weaving itself into the very fabric of materials science, lowering those barriers to the next breakthroughs.

This brings us to an intriguing juncture, a moment ripe for pondering: Are we on the brink of a second computational revolution in materials science?

Before we dive deeper into that, let’s first explore what machine learning in materials science can actually achieve.

Machine Learning in Materials Science: What Can It Actually Do?

Marques’ review article is a whopping 36-page journey through the latest in machine learning within materials science. We are talking a blend of theory, historical insights, and practical examples, from predicting crystal structures to accelerating first-principles calculations. I’ve sifted through this treasure trove, cherry-picking the juiciest bits to share with you.

So, let’s get our hands dirty and dive into the incredible feats machine learning is achieving in the world of materials science:

▸ Crystal structure prediction

Rewind to about thirty years ago: John Maddox, the then editor of Nature, slammed the inability to predict a solid’s crystal structure from its chemical composition as a scandal. But why fuss about a material’s crystal structure, anyway?

Because knowing the crystal structure is like having the blueprint for building new materials – it’s a crucial milestone for achieving truly rational material design.

But here’s the kicker: the riddle of deciphering this blueprint from just the chemical composition is still a brain-twister.

The culprit? The overwhelming complexity of chemical space. When trying to predict crystal structures from scratch, you’re exploring a seemingly endless combinatorial space of atomic arrangements in three dimensions, each with its unique and intricate energy landscape.

It’s like trying to find a grain of sand of a very specific blue in the Sahara desert.

The good news? Recent years have brought us serious advances in structure selection and generation algorithms – think random sampling, simulated annealing, metadynamics, minima hopping, and evolutionary algorithms. These tools have expanded the capabilities of traditional crystal structure prediction methods. But (and it’s a big but), they’re power-hungry beasts, craving substantial computational resources.
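
To make one of these search strategies concrete, here is a minimal sketch of simulated annealing on a toy problem. Everything in it is an assumption for illustration: `toy_energy` is a made-up one-dimensional landscape standing in for a real energy surface, and the step size and cooling schedule are arbitrary choices. In real crystal structure prediction, every energy evaluation would be a costly first-principles calculation, which is exactly where the computational bill comes from.

```python
import math
import random


def toy_energy(x):
    """Hypothetical rugged 1D 'energy landscape' with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)


def simulated_annealing(energy, x0, n_steps=50000, t_start=2.0, t_end=1e-3,
                        step_size=0.5, seed=0):
    """Minimise `energy` with Metropolis moves and geometric cooling."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    cooling = (t_end / t_start) ** (1.0 / n_steps)
    temperature = t_start
    for _ in range(n_steps):
        # Propose a random displacement of the current "structure".
        x_new = x + rng.uniform(-step_size, step_size)
        e_new = energy(x_new)
        # Downhill moves are always accepted; uphill moves are accepted
        # with Boltzmann probability exp(-dE / T).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / temperature):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
        temperature *= cooling
    return best_x, best_e


best_x, best_e = simulated_annealing(toy_energy, x0=6.0)
print(f"lowest energy found: {best_e:.3f} at x = {best_x:.3f}")
```

Note the 50,000 calls to `toy_energy`: trivial here, prohibitive when each call is a DFT run on a candidate crystal.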

That’s where machine learning comes in.

Smart models can learn from the outcomes of first-principles methods and then replicate – or possibly extend – these results almost instantaneously. This acceleration is what we need to leapfrog to the next level of material design and discovery we previously could only dream of.

Yet, let’s not sugarcoat it: crystal structure prediction is still like solving a Rubik’s cube blindfolded. But wait, there’s a ray of hope piercing through the clouds.

As I was writing today’s story, Google DeepMind announced the discovery of 2.2 million new crystals, a feat they’re calling nearly 800 years’ worth of knowledge in a single swoop. This was made possible by their latest innovation, the Graph Networks for Materials Exploration (GNoME). It’s a deep learning system that’s redefining efficiency in predicting new materials’ stability. We’re definitely unpacking the nuts and bolts of this in a future story. If you cannot wait, check out this Nature article that discusses GNoME.

▸ Component and Stability Prediction

Rather than venturing into the vast universe of structural possibilities, you could instead start with a prototype structure and then embark on a quest through the composition space to hunt down stable materials. In solid-state physics, this approach goes by the name of component prediction.

The protagonist here is thermodynamic stability – the energetic descriptor that measures how well a material resists decomposition into different phases or compounds over time.
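
A common way to quantify this is the energy above the convex hull: a phase is stable if no combination of competing phases at the same overall composition has a lower energy. The sketch below illustrates the idea for a hypothetical binary A–B system. All phase names and energies are invented for illustration, and it assumes a single phase per composition.

```python
def formation_energy(total_energy_per_atom, x_b, e_a, e_b):
    """Formation energy per atom relative to the elemental references A and B."""
    return total_energy_per_atom - ((1.0 - x_b) * e_a + x_b * e_b)


def lower_hull(points):
    """Lower convex hull of (x_B, E_f) points (Andrew's monotone chain)."""
    hull = []
    for p in sorted(points):
        # Drop the last hull point while it lies on or above the line
        # joining its predecessor to the new point p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def energy_above_hull(x_b, e_f, hull):
    """Distance (eV/atom) of a phase above the hull; ~0 means stable."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x_b <= x2:
            e_hull = y1 + (y2 - y1) * (x_b - x1) / (x2 - x1)
            return e_f - e_hull
    raise ValueError("composition outside the range covered by the hull")


# Hypothetical total energies (eV/atom): name -> (fraction of B, energy).
E_A, E_B = -3.0, -5.0
phases = {"A": (0.0, -3.0), "A3B": (0.25, -3.6), "AB": (0.5, -4.4),
          "AB3": (0.75, -4.8), "B": (1.0, -5.0)}

formation = {name: (x, formation_energy(e, x, E_A, E_B))
             for name, (x, e) in phases.items()}
hull = lower_hull(formation.values())
e_above = {name: energy_above_hull(x, e_f, hull)
           for name, (x, e_f) in formation.items()}

for name, value in e_above.items():
    print(f"{name}: {value:.3f} eV/atom above hull")
```

In this made-up example, AB and AB3 sit on the hull, while A3B lies above it and would be expected to decompose into A and AB.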

Now, let’s not kid ourselves: predicting the stability of a new compound is no small feat, but machine learning might just be the ace up our sleeve.

Case in point: Faber and team harnessed kernel ridge regression to predict formation energies for a whopping two million elpasolite crystals. The precision? Simply astounding. They reported an error of only about 0.1 eV/atom for a training set of 10⁴ compositions, leading them to unveil 90 new stoichiometries.

Likewise, Schmidt et al. took a deep dive into 250,000 cubic perovskites, armed with a dataset of DFT calculations. Their winning formula, a mix of extremely randomized trees and adaptive boosting, yielded very accurate results (with a mean absolute error of 0.12 eV/atom).
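
For the curious, here is what kernel ridge regression boils down to, as a minimal pure-Python sketch. It is not the setup used in the elpasolite study: the one-dimensional "descriptor", the sine-shaped target, the Gaussian kernel, and the hyperparameters are all assumptions chosen to keep the example tiny. The point is only to show the mechanics: fit weights on known data, then predict an unseen point almost for free.

```python
import math


def gaussian_kernel(a, b, length_scale=1.0):
    """Similarity between two descriptor vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def krr_fit(features, targets, ridge=1e-4, length_scale=1.0):
    """Solve (K + ridge * I) w = y for the regression weights w."""
    n = len(features)
    kernel = [[gaussian_kernel(features[i], features[j], length_scale)
               + (ridge if i == j else 0.0) for j in range(n)]
              for i in range(n)]
    return solve_linear_system(kernel, targets)


def krr_predict(x_new, features, weights, length_scale=1.0):
    """Prediction = kernel-weighted sum over the training points."""
    return sum(w * gaussian_kernel(x_new, f, length_scale)
               for w, f in zip(weights, features))


# Toy training set: a hypothetical 1D descriptor and a made-up target property.
train_x = [(0.4 * i,) for i in range(16)]
train_y = [math.sin(x[0]) for x in train_x]

weights = krr_fit(train_x, train_y)
prediction = krr_predict((1.0,), train_x, weights)
print(f"predicted {prediction:.4f}, reference {math.sin(1.0):.4f}")
```

In a real application, the descriptor would encode composition and structure, the targets would come from DFT, and the kernel width and ridge strength would be tuned by cross-validation.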

▸ Material Property Prediction

Now, the potential of machine learning in materials science is not just of academic interest. Its predictions may have real-world applications. Think energy storage, carbon capture, and more. So, what material properties could machine learning predict?

Let’s list some of the key descriptors highlighted by Marques:

Band Gaps. The design of functional materials, such as those used in light-emitting diodes, photovoltaics, and transistors, hinges on the estimation of band gaps. Band gaps determine how these materials interact with light and electricity.

Bulk and Shear Moduli. These parameters are essential in understanding how materials respond under stress and strain, crucial for applications ranging from construction to aerospace engineering.

Superconductivity. Perhaps one of the most tantalizing challenges in condensed matter physics is superconductivity, a puzzle that has remained unsolved for decades. Machine learning steps in with a fresh perspective, predicting critical temperatures for superconducting materials without needing a full-blown theoretical model. Despite the challenge of limited data, this approach underscores machine learning’s potential to tackle areas where traditional theories fall short.
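
To see why band gaps matter so much for optoelectronics, a quick back-of-the-envelope helps: a photon can be absorbed across the gap only if its energy is at least the gap, which sets a threshold wavelength λ = hc/E. The sketch below does this conversion. The function name is mine, and the gap values are approximate room-temperature textbook numbers used purely for illustration.

```python
PLANCK_TIMES_C_EV_NM = 1239.84  # h * c expressed in eV * nm (rounded)


def gap_to_wavelength_nm(band_gap_ev):
    """Longest photon wavelength (nm) that can bridge a given band gap (eV)."""
    if band_gap_ev <= 0:
        raise ValueError("band gap must be positive")
    return PLANCK_TIMES_C_EV_NM / band_gap_ev


# Approximate band gaps in eV (illustrative values).
approximate_gaps = {"Si": 1.12, "GaN": 3.4}
for material, gap in approximate_gaps.items():
    print(f"{material}: {gap} eV -> {gap_to_wavelength_nm(gap):.0f} nm")
```

A gap near 1.1 eV lands in the near infrared, a good fit for harvesting sunlight, while a gap around 3.4 eV sits in the ultraviolet. An error of a few tenths of an eV in a predicted gap is therefore enough to shift a candidate material from one application to another.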

▸ Machine Learning Accelerating Simulations

As we’ve discussed before on From Atoms To Words, good ol’ atomistic simulations are great for accuracy but, man, do they eat up computing power. This has historically confined DFT and molecular dynamics to relatively short time scales and small models. To put it another way: running atomistic simulations with classical force fields for seconds of simulated time, or doing DFT calculations on something as large as a battery cell, well, that’s pretty much out of reach for now.

To tackle these challenges, machine learning is stepping into the ring, meshing with numerical techniques. Let’s dive deeper and see how this plays out:

Machine Learning for Molecular Dynamics. Initial attempts, beginning as early as the 1990s, faced challenges with data and architecture optimization. Since then, the field has seen substantial progress, with numerous machine learning potentials being developed. A standout example is Behler and Parrinello’s work, where they employed a multilayer perceptron (a feed-forward neural network) to represent a system’s total energy as a sum of atomic contributions. This approach has significantly improved the efficiency and accuracy of simulating larger and more complex systems.

Machine Learning for DFT. What’s the goal here? To create more effective exchange-correlation potentials and energy functionals. Pioneering this field, Tozer and colleagues in 1996 employed a one-hidden-layer feed-forward neural network. Trained with two distinct datasets totaling 3768 data points, the neural network achieved an impressive 2–3% accuracy. Continuing on this path, similar approaches have reached chemical accuracy in kinetic energy functionals for a variety of systems, including water, benzene, ethane, and malonaldehyde.
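
The "sum of atomic contributions" idea behind neural network potentials is easy to sketch. The code below is a heavily simplified illustration, not Behler and Parrinello’s actual model: it assumes a single element, uses only radial descriptors, and the network weights are arbitrary untrained numbers that I made up. What it does show is the architecture: each atom’s environment is turned into a descriptor, a small perceptron maps that descriptor to an atomic energy, and the total energy is the sum.

```python
import math


def cutoff(r, r_c):
    """Smooth cosine cutoff that decays to zero at the cutoff radius r_c."""
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0) if r < r_c else 0.0


def radial_descriptor(index, positions, etas=(0.5, 1.0, 2.0), r_c=6.0):
    """Radial symmetry functions describing the surroundings of one atom."""
    values = []
    for eta in etas:
        total = 0.0
        for j, pos in enumerate(positions):
            if j != index:
                r = math.dist(positions[index], pos)
                total += math.exp(-eta * r * r) * cutoff(r, r_c)
        values.append(total)
    return values


def atomic_energy(descriptor, w_hidden, b_hidden, w_out, b_out):
    """Tiny one-hidden-layer perceptron: descriptor -> atomic energy."""
    hidden = [math.tanh(sum(w * g for w, g in zip(row, descriptor)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def total_energy(positions, params):
    """Total energy as a sum of per-atom network outputs."""
    return sum(atomic_energy(radial_descriptor(i, positions), *params)
               for i in range(len(positions)))


# Arbitrary, untrained weights (hypothetical): 3 inputs -> 2 hidden -> 1 output.
params = ([[0.3, -0.2, 0.5], [-0.4, 0.1, 0.2]], [0.1, -0.1], [0.7, -0.5], 0.05)

atoms = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0), (0.0, 1.2, 0.0), (0.9, 0.9, 0.8)]
shifted = [(x + 5.0, y - 3.0, z + 2.0) for x, y, z in atoms]

e_original = total_energy(atoms, params)
e_permuted = total_energy(list(reversed(atoms)), params)
e_shifted = total_energy(shifted, params)
print(e_original, e_permuted, e_shifted)
```

Because the descriptors depend only on interatomic distances and the energy is a plain sum over atoms, the result does not change when atoms are relabelled or the whole system is translated, and the same network can be applied to systems of any size.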

A final personal touch

So after going into the details of it all, a question remains. Is machine learning triggering a second computational revolution? I’m usually not one for throwing around the word revolution, but in this case, yeah, I believe it’s happening.

Sure, computational chemistry, quantum chemistry, and simulations will always be key players – they’re the backbone for understanding and rationalizing, say, the quick responses of an AI. They’re also indispensable for generating the rich datasets we need when experimental data is scarce. But here’s my take: the computational tools I grew to love during my PhD are bound to evolve and merge with the path of machine learning.

Echoing Marques and team, it looks like the future of materials science is forking into two distinct paths. On one, where data is plentiful, we’re witnessing the rise of universal models – the heavy hitters capable of predicting a wide range of properties, possibly even eclipsing DFT calculations in accuracy.

But what happens in the data desert?

That’s where scientific creativity blooms. Techniques like active learning and transfer learning seem poised to revolutionize materials science. And there’s a paradigm shift happening too – from predicting properties to gauging the accuracy of simpler models’ predictions. It’s about starting small and scaling up to tackle more complex, real-world challenges.

Here we stand, at the precipice of what might be a monumental change in material discovery and design. It’s only the beginning, yet the impact is already clear – from the subtle art of predicting material properties to deciphering the complex map of crystal structures.

So, fellow computational enthusiasts, the choice is ours: Do we cling to the tried and true methods, or do we leap into the embrace of this second computational revolution?

I’ve made my choice. What’s yours?

If you enjoyed this dive into machine learning for materials science, I’d love to hear your thoughts. Agree, disagree, or have a totally wild theory of your own? Let’s connect! Subscribe to my LinkedIn newsletter and let’s keep the conversation rolling.