Research Project
Chemistry is entering a new stage, where discovery is driven not only by experiments, but also by algorithms able to extract patterns from spectra, microscopy images, reaction data, and molecular structures. Our AI in Chemistry project explores this transition in a practical way: we develop methods that help chemists discover reactions faster, read complex analytical data more deeply, understand catalysts at unprecedented resolution, and connect molecular structure with function, toxicity, and material behavior. The goal is not to replace the chemist, but to build powerful digital tools that extend chemical intuition and make advanced analysis available to more researchers and students.
For students, this project shows that chemistry is becoming a field where experiments, coding, data science, and mechanism can work together in one research program. For researchers, it demonstrates that AI is most powerful when it is tightly connected to real chemical questions: discovering overlooked reactions, explaining catalyst behavior, decoding spectra, reading complex images, and making chemistry more informative, reproducible, and sustainable.
The broader vision is that digital chemistry is not a side topic. It is becoming a practical research methodology for the next generation of chemical science.
One of the central ideas of the project is simple but powerful: chemistry already contains vast amounts of hidden knowledge in experimental data that were recorded but never fully interpreted. We develop machine-learning tools that help transform this “sleeping data” into new chemical insight.
Our recent study introduced a digital co-expert for reaction discovery. Instead of screening thousands of possibilities manually, the workflow generates candidate reactions, filters them computationally, clusters them with unsupervised machine learning, and then passes a focused set to expert evaluation. In this way, the most time-consuming stage of discovery was reduced by about 180-fold, from more than 1200 days of expert screening to about 7 days, leading to experimentally confirmed new cycloaddition reactions (10.1002/anie.202523905).
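The four stages named above (generate, filter, cluster, hand over to an expert) can be sketched in a few lines. This is only a toy illustration under stated assumptions, not the published workflow: the reactant names, structural tags, and the `REACTIVE_PAIRS` rule set are hypothetical, and a simple leader clustering on Jaccard similarity stands in for whatever unsupervised method the study actually used.

```python
from itertools import combinations

# Hypothetical toy reactants, each described by a set of structural tags.
REACTANTS = {
    "diene_A": {"diene", "alkyl"},
    "diene_B": {"diene", "aryl"},
    "alkene_C": {"alkene", "ester"},
    "alkyne_D": {"alkyne", "ester"},
    "azide_E": {"azide", "benzyl"},
    "alkane_F": {"alkyl"},
}

# Hypothetical rule set: pairs of groups assumed able to undergo cycloaddition.
REACTIVE_PAIRS = {
    frozenset({"diene", "alkene"}),
    frozenset({"diene", "alkyne"}),
    frozenset({"azide", "alkene"}),
    frozenset({"azide", "alkyne"}),
}


def generate_candidates(reactants):
    """Stage 1: enumerate every unordered pair of reactants."""
    return list(combinations(sorted(reactants), 2))


def passes_filter(pair, reactants):
    """Stage 2: keep a pair only if it carries complementary reactive groups."""
    a, b = pair
    return any(
        frozenset({group_a, group_b}) in REACTIVE_PAIRS
        for group_a in reactants[a]
        for group_b in reactants[b]
    )


def jaccard(x, y):
    """Similarity of two tag sets: shared tags divided by all tags."""
    return len(x & y) / len(x | y)


def leader_cluster(items, features, threshold=0.5):
    """Stage 3: leader clustering. An item joins the first cluster whose
    leader (first member) is similar enough, otherwise it starts a new one."""
    clusters = []
    for item in items:
        for cluster in clusters:
            if jaccard(features[item], features[cluster[0]]) >= threshold:
                cluster.append(item)
                break
        else:
            clusters.append([item])
    return clusters


candidates = generate_candidates(REACTANTS)
filtered = [pair for pair in candidates if passes_filter(pair, REACTANTS)]
features = {pair: REACTANTS[pair[0]] | REACTANTS[pair[1]] for pair in filtered}
clusters = leader_cluster(filtered, features)

# Stage 4: one representative per cluster goes to the human expert.
shortlist = [cluster[0] for cluster in clusters]

print(len(candidates), "candidates,", len(filtered), "after filtering,",
      len(shortlist), "for expert review")
```

The point of the design is that the expert sees one representative per group of similar candidates rather than every enumerated pair, which is where the time saving in such a workflow comes from.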
A complementary direction appeared in Nature Communications, where we showed that high-resolution mass spectrometry archives can be turned into a search space for discovery. Instead of running new experiments first, our machine-learning-powered engine searches tera-scale HRMS data to detect unknown reaction products and overlooked pathways. This work introduced the concept of “experimentation in the past”: using already existing experimental data as a discovery platform for new chemistry (10.1038/s41467-025-56905-8).
To quantify how much measured and analyzed data in chemistry goes unused, we carried out a dedicated study, which found that about 90% of experimental chemistry data is lost, in the sense that it is recorded but never published in the peer-reviewed literature (10.3390/chemistry7050160).
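The basic operation behind searching archived spectra for a hypothesized product can be sketched as follows: compute the theoretical m/z of the expected ion and look for peaks within a ppm tolerance. This is a minimal sketch under stated assumptions, not the published search engine: the archive contents, run names, and the 5 ppm tolerance are hypothetical, the example formula C8H10N4O2 is arbitrary, and the atomic masses are rounded monoisotopic values.

```python
# Rounded monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647  # mass of a proton (Da)


def mz_protonated(formula):
    """Theoretical m/z of the singly protonated ion [M+H]+."""
    neutral = sum(MONOISOTOPIC[element] * n for element, n in formula.items())
    return neutral + PROTON


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def search_archive(archive, target_mz, tol_ppm=5.0):
    """Return (run, m/z, intensity) for every archived peak within tolerance."""
    hits = []
    for run, peaks in archive.items():
        for mz, intensity in peaks:
            if abs(ppm_error(mz, target_mz)) <= tol_ppm:
                hits.append((run, mz, intensity))
    return hits


# Hypothetical archive: run name -> list of (m/z, intensity) peaks.
ARCHIVE = {
    "run_001": [(150.0912, 1.0e5), (195.0879, 3.2e4)],
    "run_002": [(195.0950, 8.0e3), (301.1410, 2.0e4)],
    "run_003": [(195.0874, 1.1e4)],
}

target = mz_protonated({"C": 8, "H": 10, "N": 4, "O": 2})
hits = search_archive(ARCHIVE, target)

for run, mz, intensity in hits:
    print(f"{run}: m/z {mz:.4f} ({ppm_error(mz, target):+.2f} ppm)")
```

A real archive search would also have to check isotopic patterns and handle other adducts and charge states; the sketch only shows why previously recorded spectra can be queried for ions nobody was looking for at the time of measurement.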
Catalysis is one of the most challenging areas for data-driven chemistry because real catalytic systems are dynamic, heterogeneous, and difficult to observe in full. A major part of our project is devoted to building AI-assisted approaches that make catalyst behavior more explicit and more measurable.
Our recent project established the concept of Totally Defined Nanocatalysis. By combining nanomanipulation, electron microscopy, and neural-network analysis, we characterized individual Pd/C catalyst particles rather than treating the catalyst as a vague average. This revealed extraordinarily high performance hidden at the single-particle level, with turnover numbers reaching the order of 10⁹ (10.1021/jacs.2c01283).
The next step developed the idea into 4D catalysis: not only locating catalytic centers in space, but tracking how they change over time. This work showed that in Pd/C cross-coupling systems, monoatomic palladium centers, although representing only a small fraction of the total metal, can account for the overwhelming majority of catalytic activity. The study linked catalysis to dynamic transformations between nanoparticles, clusters, and single atoms, supported by AI-based image analysis (10.1021/jacs.3c00645).
These ideas were further systematized in a viewpoint on 4D catalysis, which explains why following the same catalyst region before and after reaction is essential, and how machine learning enables researchers to detect subtle structural changes that manual analysis would miss (10.1021/acscatal.3c03889). Together, these studies move catalysis away from static snapshots and toward dynamic, data-rich mechanistic understanding.
Modern chemistry produces more analytical data than any researcher can interpret manually. We therefore develop tools that make complex spectra computationally readable.
In 2022 we presented MEDUSA, a framework for fully automated unconstrained analysis of high-resolution mass spectrometry data. The method combines gradient-boosted decision trees and neural networks to reduce spectral complexity and infer molecular formulas from fine isotopic structure. The broader message is important: instead of using AI only for classification, it can be used to solve inverse analytical problems that were long considered too difficult for routine practice (10.1021/jacs.2c03631).
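To illustrate why fine isotopic structure carries formula information at all, the sketch below predicts the resolved M+1 isotopologue peaks of a formula and then inverts the relationship to estimate element counts from observed peaks. It does not reproduce MEDUSA or its machine-learning models; it only shows the underlying physical idea. The function names are hypothetical, the isotope mass differences and abundance ratios are approximate, and the "observed" peaks are made-up numbers.

```python
# Approximate data for the M+1 isotopologues of each element:
# (heavy-minus-light mass difference in Da, heavy/light abundance ratio).
ISOTOPES = {
    "C": (1.003355, 0.010816),  # 13C vs 12C
    "N": (0.997035, 0.003653),  # 15N vs 14N
    "H": (1.006277, 0.000115),  # 2H vs 1H
    "O": (1.004217, 0.000381),  # 17O vs 16O
}


def fine_structure(formula, charge=1):
    """Forward problem: predict the M+1 fine-structure peaks of a formula.

    Returns (element, m/z offset from the monoisotopic peak, intensity
    relative to the monoisotopic peak), sorted by offset."""
    peaks = []
    for element, n in formula.items():
        delta, ratio = ISOTOPES[element]
        peaks.append((element, delta / charge, n * ratio))
    return sorted(peaks, key=lambda peak: peak[1])


def infer_counts(observed, charge=1, tol=0.0003):
    """Inverse problem: assign each observed (offset, relative intensity)
    peak to an isotope by its mass difference, then estimate the number
    of atoms of that element from the intensity."""
    counts = {}
    for offset, rel_intensity in observed:
        for element, (delta, ratio) in ISOTOPES.items():
            if abs(offset * charge - delta) <= tol:
                counts[element] = round(rel_intensity / ratio)
                break
    return counts


# Forward prediction for an arbitrary example formula.
formula = {"C": 8, "H": 10, "N": 4, "O": 2}
for element, offset, intensity in fine_structure(formula):
    print(f"{element}: +{offset:.6f} m/z, {100 * intensity:.3f} % of M")

# Hypothetical observed peaks (offset, relative intensity) with small errors.
estimated = infer_counts([(0.99704, 0.0147), (1.00335, 0.0868)])
print(estimated)
```

Resolving these peaks requires very high resolving power, since the 13C and 15N isotopologues differ by only a few millidaltons, and real spectra contain noise and overlapping species. That complexity is the part the source describes as handled by gradient-boosted trees and neural networks.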
Another branch of this effort was developed for NMR spectroscopy, where machine learning was applied to ¹⁹⁵Pt NMR prediction. The workflow links semiempirical modeling with ML to estimate chemical shifts for water-soluble platinum complexes. This is especially important for catalysis and medicinal chemistry, where rapid interpretation of metal-centered spectra can accelerate mechanism studies and compound design (10.1002/cphc.202200940).
A distinctive feature of the project is the use of AI not only on tables and spectra, but also on chemical images.
In 2024, we showed that deep learning can recognize the molecular identity of closely related phosphonium salts from the visual appearance of the material itself. This is a striking step toward connecting molecular structure with micro- and nanomorphology, and it suggests that microscopy can become an information-rich analytical source rather than only a descriptive technique (10.1002/smll.202403423).
Earlier, we developed AI pipelines for real-time electron microscopy video analysis of ionic liquid/water systems. These works showed how neural networks can quantify dynamic microphase behavior in soft matter and analyze video streams that would be impractical to process manually (10.1002/smll.202007726; 10.1016/j.molliq.2023.121407).
Related studies used explainable AI and deep learning to interpret nanoparticle ordering and identify hidden defects in carbon materials, an approach relevant to catalysis, electronics, and materials diagnostics (10.1039/D0SC05696K; 10.1039/d4nr00952e).
The AI in Chemistry project also expands into biological imaging, toxicology, and greener chemical design.
Our 2025 article described deep generative models for creating synthetic annotated biofilm images, making it possible to generate training data for segmentation and detection models even when manually labeled data are scarce. This is especially valuable for automated microscopy and biofilm analysis (10.1038/s41522-025-00647-4).
For a general digital biology framework, we combined automated SEM with deep learning for macroscale biofilm studies, showing how AI can quantify growth, cell coverage, and biocide effects much faster than manual analysis (10.1039/D3DD00048F).
For safer chemistry, we created data resources and online tools such as ILToxDB and Build-a-Bio-Strip, which help organize cytotoxicity data and evaluate the toxicity contribution of reaction components. These efforts connect AI, databases, and practical decision-making in sustainable synthesis (10.1021/acs.estlett.5c00860; 10.1021/acs.jcim.4c01381; 10.1038/s41597-024-04190-3).
Full list of publications: link

Artificial Intelligence in Catalysis: Experimental and Computational Methodologies
Editor(s): Valentine P. Ananikov, Mikhail V. Polynski
Print ISBN: 9783527353859 | Online ISBN: 9783527847068
DOI: 10.1002/9783527847068
2025 WILEY-VCH GmbH, Weinheim, Germany.