Is Machine Learning Going to Replace Computational Chemists?

Have you heard “Now and Then,” the last Beatles song? Even with just Ringo and Paul still around, AI magic has brought John’s voice back, and suddenly, it’s like we have the Fab Four together again. It’s an emotional mix of nostalgic vibes and the latest tech. It makes you wonder how machine learning and AI are shaking things up in many areas of human activity. As someone who grew up with old-school computational chemistry, I can’t help but mull over what all this rapid machine learning progress means for the field. It’s everywhere – drug discovery, material design, protein structure prediction… Is machine learning poised to replace computational chemistry? Inspired by a scientific essay by Heather Kulik that I recently read, today we’re going to address this jitter-inducing question. Are you ready to roll? Let’s go.

Computational Chemistry: The Good and The Bad

If you’ve been following From Atoms To Words, you’ll know we often talk about quantum chemistry, computational chemistry, and simulations. I can’t hide it – I’m absolutely passionate about it.

I’ve spent years immersing myself in the painful ins and outs of the field, and it’s thrilling to see how it has evolved over the decades from abstract equations to a pivotal tool in chemistry and materials science.

As we have seen in previous stories, computational chemistry allows you to:

This is all thanks to the relentless efforts of early theorists, who, over the years, have catapulted computational chemistry into the limelight.

Remember those times when seeking precision in computations was like a high-stakes wrestling match?

Gone are the days of the old tug-of-war between efficient but approximate DFT functionals and accurate but painfully slow post-Hartree-Fock methods.

In the past, we were often confined to investigating small molecular systems due to the daunting computational demands. Fast forward to today, and the landscape has changed dramatically. Breakthroughs in algorithms and a surge in computing power have expanded our horizons, letting us achieve results with stunning accuracy for larger systems.

Today, computational chemistry is not just more efficient but also more accessible to scientists around the globe. It has become a cornerstone in validating real-world experimental findings.

But (and there’s always a but), let’s not get too carried away. As much as I adore computational chemistry, it’s time to play devil’s advocate for a moment.

The field, like any other, isn’t without its challenges.

In previous stories, we’ve already seen some of these challenges, from handling dynamic and non-equilibrium systems to achieving scalability with large molecular models.

Let’s put it this way. Sometimes, computational chemistry is seen more as a flawed tool than a solid scientific discipline. At times, it might even seem like we’re just fitting data to get the right results but for the wrong reasons.

I’ve seen this happen firsthand in some of my studies, particularly the one in which we discovered that a known DFT functional, my PhD buddy BHandH, was able to reproduce high-level methods in describing π-stacking interactions in DNA nucleobases… But this is a story for another day.

Now, beyond the intrinsic challenges, there are broader issues, in my opinion, that make computational chemistry a bit of a tough sell in the industrial world:

Computational Demand:
Despite progress, computations such as DFT, semi-empirical methods, or even molecular dynamics can still be seen as too demanding. In an industrial context, waiting hours, or perhaps days, for an output, like an OCV curve of an electrode or the viscosity of an electrolyte formulation, isn’t always practical. Think about running a million calculations, each taking ten minutes – well, it adds up!
High-level Expertise:
It’s not just about having unlimited computing power. The computational know-how required for calculations and simulations is a jungle, man. Computational chemistry works if you know how to pick your DFT functional, construct your model, interpret results, gauge confidence, and link it all back to the experiments. You get a new piece of software, state-of-the-art and all, but then what? What do you do next? You still need a PhD to run this show. (Or do you?)
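
To put a number on “it adds up,” here is a back-of-the-envelope sketch in Python. The million calculations and the ten minutes each come straight from the example above; the worker counts and the assumption of perfectly independent, perfectly parallel jobs are mine, purely for illustration.

```python
# Back-of-the-envelope cost of a large screening campaign.
N_CALCULATIONS = 1_000_000     # from the example in the text
MINUTES_PER_CALCULATION = 10   # from the example in the text


def wall_time_days(n_calcs: int, minutes_each: float, n_workers: int = 1) -> float:
    """Wall-clock days, assuming independent jobs and ideal parallel scaling."""
    total_minutes = n_calcs * minutes_each
    return total_minutes / n_workers / (60 * 24)


if __name__ == "__main__":
    # Hypothetical numbers of parallel workers, for illustration only.
    for workers in (1, 100, 1000):
        days = wall_time_days(N_CALCULATIONS, MINUTES_PER_CALCULATION, workers)
        print(f"{workers:>5} workers: {days:8.1f} days ({days / 365:.2f} years)")
```

Run one after another, those calculations take about 6,944 days, or roughly 19 years. Even with 1,000 workers running flat out, you are still waiting about a week.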

These challenges can make the practical application of computational chemistry in industrial R&D less viable.

So, how do we support and evolve computational chemistry to overcome these hurdles?

Is Machine Learning The Solution?

We, computational chemists, excel at one thing: generating data. So, imagine skilled computational chemists, deeply versed in theory and simulation techniques, meticulously crafting data for one specific aim: to train machine learning models.

Now, these machine learning models could straightforwardly be employed in a “black-box” manner by the industry. They would not overshadow the deep knowledge of computational chemists and yet offer a streamlined, efficient route to new insights and solutions—instantaneously.

Picture opening your friend chemGPT to ask:

What’s the range of electrode materials with an average open-circuit voltage of [insert value]?
What’s the optimal electrolyte formulation with a viscosity of [insert value]?
Which metal surface has the highest adsorption energy for [insert molecule]?

You’d receive some answers, hopefully correct—probably a range of systems—and then you could delve deeper into these systems’ properties using computational chemistry before moving into the lab.

This small data-big data approach, one among many you may envision, could be highly beneficial for industry. It would offer a quick-and-dirty guide for experimentalists stuck in the chemical maze. It could provide data-backed answers, leveraging computational chemistry for validation and further exploration.
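
The screen-first, validate-later loop described above can be sketched as a toy in Python. Everything here is hypothetical and chosen only to keep the example self-contained: the single-number “descriptor” x, the made-up `expensive_calculation` standing in for a real simulation, and the straight-line fit standing in for a real machine learning model.

```python
import random


def expensive_calculation(x: float) -> float:
    """Hypothetical stand-in for a slow simulation (e.g. one DFT run)."""
    return 2.0 * x + 1.0 + 0.3 * x * x  # toy property, not real chemistry


def fit_line(xs, ys):
    """Least-squares fit y = a*x + b; plays the role of the cheap surrogate."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def screen(candidates, model, target, k):
    """Return the k candidates whose predicted property is closest to target."""
    a, b = model
    return sorted(candidates, key=lambda x: abs(a * x + b - target))[:k]


if __name__ == "__main__":
    rng = random.Random(0)
    candidates = [rng.uniform(0.0, 2.0) for _ in range(100_000)]

    # Small data: run the expensive method on a handful of systems only.
    training = rng.sample(candidates, 50)
    model = fit_line(training, [expensive_calculation(x) for x in training])

    # Big data: the cheap surrogate scans every candidate.
    shortlist = screen(candidates, model, target=3.0, k=5)

    # Back to the expensive method, but only for the shortlist.
    a, b = model
    for x in shortlist:
        print(f"x={x:.3f}  surrogate={a * x + b:.3f}  "
              f"reference={expensive_calculation(x):.3f}")
```

The surrogate is deliberately too simple for the curved toy property, so the last step can expose a gap between prediction and reference. That gap is the reason the validation pass stays in the loop.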

But that’s just the tip of the iceberg. The potential to develop machine learning tools through systematic computational chemistry is immense.

Speaking of which, I recently came across this article by Heather Kulik. She’s done some awesome work showing how combining computational chemistry with machine learning is making big waves in material discovery.

So, here’s a list—because why not?—inspired by Kulik’s article about the potential of computational-chemistry-driven machine learning models:

1. Machine Learning Models for the design of new materials

Scan millions of candidates and accelerate discovery of new materials
Automate the high-throughput screening of transition metal complexes
Robustly predict and optimize the properties of materials by focusing on aspects such as ground state spin, geometrical features, synthesizability, activity, stability, and cost-efficiency
Extract design principles and discover lead complexes through systematic analysis of model chemical features
Facilitate interactive and iterative testing of expert-driven and machine-learning-driven hypotheses, enhanced by visualization and analysis tools
Challenge limitations of human intuition [Kulik and colleagues found that machine learning models could predict properties of transition metal complexes more accurately than semi-empirical methods or their own intuition.]

2. Machine learning to support DFT calculations

Address structural challenges by using machine learning for structures that are hard to model with DFT
Predict DFT-level properties and sensitivity to DFT functional choice
Significantly reduce extensive manual “checks,” while enhancing outcome reliability
Forecast success of DFT calculations and detect potential failures early on

Impressed? I certainly am.

Kulik’s article is just one example, yet it reveals a vast landscape of current and potential opportunities for computational-chemistry-driven machine learning, hinting at a future where integrating these technologies could significantly enhance the field’s efficiency and precision.

Further reading: What’s Left for a Computational Chemist To Do in the Age of Machine Learning? | Kulik 2022

Computational Chemistry in the Era of Machine Learning: The Mullet Science

So, it’s safe to say that the integration of machine learning with computational chemistry has already made significant strides. And it’s just the beginning. With more potential on the horizon, this combo is poised to break free from historical limitations and offer insights that might just outdo purely analytical theories.

Let me throw in a super cool example published just a few days ago, albeit from a completely different field: Google DeepMind’s GraphCast. Honestly, this thing is both amazing and scary. It’s a machine-learning-based method that outstrips traditional physics-based techniques in global weather forecasting, nailing the prediction of various weather variables with some seriously impressive accuracy.

Could this apply to materials as well? After all, weather is as complex and chaotic as chemistry, isn’t it?

So which is it: fundamental science or useful tool? To me, computational chemistry is a “mullet.”
Heather Kulik, 2022

Frankly, it’s tough to argue with Kulik on this one: she sees the merger between computational chemistry and machine learning as marking a new chapter in scientific research. And yet, the rub in this “awkward marriage,” as Kulik calls it, is that machine learning and traditional computational chemistry don’t exactly see eye to eye.

Machine learning models kind of march to their own drum, straying from the usual trade-offs between cost and accuracy that you get in computational chemistry. They’re shaking up the belief that piling on more physics always makes a model better, which may cause quite a stir in the old guard.

The big question everyone’s chewing on is how our own biases in computational chemistry might mess with machine learning training, and whether we even need machine learning when some of our current methods are doing a decent job of keeping accuracy and cost in check.

Now, what I find particularly intriguing in Kulik’s perspective is how this evolving landscape poses a critical question for future scientific research: Is computational chemistry a fundamental science, or merely a useful tool?

In truth, it’s both.

Kulik has this quirky way of describing computational chemistry – she calls it a “mullet.” Fundamental science at the front, driving the “party in the back,” where it becomes a widely used tool impacting all other branches of chemistry.

So, the big question is: can machine learning tap into computational chemistry and drive this trend further?

Kulik’s convinced: machine learning is the game-changer here.

A Final Personal Touch

Alright, as usual we’ve gone through a whole lot. So, what’s the takeaway for us computational chemists? Should we start sweating about machine learning taking over?

It’s all about how you look at it. No need to freak out, folks. As long as we don’t bury our heads in the sand of old-school thinking and are willing to embrace the new wave, machine learning is the next big adventure.

In any creative field – be it the arts, music, or even science – machine learning, at the end of the day, is just a tool, right? A really powerful one that, if we handle it smartly and thoughtfully, could open up a world of possibilities, computational chemistry included.

So, what can we expect in the coming years?

Well, trying to guess where the marriage of computational chemistry and machine learning will take us is kind of like gazing into a crystal ball. But hey, look at how far we’ve come. Computational chemistry has evolved from a fuzzy theoretical concept into a powerhouse driving real change in various chemistry sectors.

To me, the future of computational-chemistry-driven machine learning looks like a huge, untamed wilderness waiting for the brave to explore.

So, let’s not just sit back. Let’s charge into the future. Let’s be those brave explorers.

But please, wear no mullet.

If you enjoyed this dive into the awkward marriage of machine learning and computational chemistry, I’d love to hear your thoughts. Agree, disagree, or have a totally wild theory of your own? Let’s connect! Subscribe to my LinkedIn newsletter and let’s keep the conversation rolling.