Conference

SMASH – Small Molecule NMR Conference

Come by our booth and poster presentations at SMASH.

We also invite you to come together with fellow experts in the NMR community for a day of discovery and lively discussion beforehand at our Porto office. Register for the Digitalization & Databasing for Better Decision Making—Driving Innovation in NMR Workflows Summit now.

Poster Presentation

Leveraging Public Knowledge For Fragment Identification In CASE

Monday, Sept 22nd, 2025

16:00 – 17:30

Poster Number: 56
Room: Ariane, Gemini & Discovery

Dimitris Argyropoulos, NMR Business Manager; ACD/Labs

Dimitris Argyropoulos, Sergey Golotvin, Maxim Kisko, and Mikhail Elyashberg

ACD/Labs, Toronto, ON, Canada

Over the last several decades, computer-assisted structure elucidation (CASE) has become a proven method [1] for the unambiguous determination of new chemical structures from NMR data paired with a molecular formula. CASE offers many benefits over manual elucidation, including efficiency and the elimination of human bias. Despite significant advances in computing power over its lifetime, CASE still needs time to generate and assess all the possible structures. In many cases this takes only a few seconds to a few minutes, but when NMR information is ambiguous or limited, elucidations can take hours or even days to complete.

We previously showed [2] that identifying fragments as small as a single aromatic ring before the CASE structure generation step can dramatically reduce the total elucidation time. To broaden the scope to other fragments, we considered using fragment databases; however, no such libraries also contain the corresponding NMR data. Furthermore, it would be impossible to generate these data either experimentally or computationally without information about the rest of the compound, which can have an important influence on the observed chemical shifts.

With a library of complete structures, on the other hand, we can predict the corresponding spectra and search this database to identify known fragments of unknown structures prior to structure generation in CASE.

We explored this possibility using the database of PubChem compounds and their predicted 13C NMR spectra used for dereplication in ACD/Labs’ Structure Elucidator Suite. Performing a spectral search with the experimental data of the unknown structure against this database using expanded search terms and tolerances allowed us to identify PubChem compounds that have a subset of carbon atoms with similar chemical shifts.
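The idea of a tolerance-based spectral search for partial matches can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the search implemented in Structure Elucidator Suite: the function names `match_shifts` and `search_library`, the greedy one-to-one pairing, the `min_matches` threshold, and the toy shift values are all hypothetical and chosen just to make the example runnable.

```python
from typing import Dict, List, Tuple


def match_shifts(
    experimental: List[float], predicted: List[float], tolerance_ppm: float
) -> List[Tuple[float, float]]:
    """Greedily pair each predicted shift with the closest unused experimental shift.

    A pair is accepted only if the two shifts differ by at most tolerance_ppm.
    (Hypothetical helper; a real search may use a different matching scheme.)
    """
    unused = sorted(experimental)
    pairs = []
    for p in sorted(predicted):
        candidates = [e for e in unused if abs(e - p) <= tolerance_ppm]
        if candidates:
            closest = min(candidates, key=lambda e: abs(e - p))
            unused.remove(closest)  # each experimental peak is used at most once
            pairs.append((p, closest))
    return pairs


def search_library(
    experimental: List[float],
    library: Dict[str, List[float]],
    tolerance_ppm: float = 2.0,
    min_matches: int = 5,
) -> List[Tuple[str, int]]:
    """Return library entries sharing at least min_matches shifts with the unknown.

    Entries are sorted by the number of matched carbons, best first.
    """
    hits = []
    for name, predicted in library.items():
        n_matched = len(match_shifts(experimental, predicted, tolerance_ppm))
        if n_matched >= min_matches:
            hits.append((name, n_matched))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy 13C shifts in ppm (invented for illustration, not real compound data).
unknown = [21.0, 52.3, 128.5, 128.9, 129.3, 133.0, 137.2, 170.1]
toy_library = {
    "entry_A": [128.4, 128.8, 129.5, 132.7, 136.9, 197.0],
    "entry_B": [14.1, 22.7, 31.9],
}
hits = search_library(unknown, toy_library, tolerance_ppm=1.0, min_matches=5)
```

In this toy case only `entry_A` is returned, because five of its six carbons fall within the tolerance of peaks in the unknown. The matched subset is what would then be treated as a candidate fragment. Widening `tolerance_ppm` or lowering `min_matches` trades selectivity for recall, which is the kind of adjustment "expanded search terms and tolerances" suggests.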

Incorporating the information from the identified fragments and the corresponding chemical shifts in the subsequent creation of the Molecular Connectivity Diagram provides more constraints for the structure generation task. This results in significant acceleration of the overall elucidation, as will be illustrated with several examples.

  1. M. E. Elyashberg, A. J. Williams, “Computer-based Structure Elucidation from Spectral Data. The Art of Solving Problems”, Springer, Heidelberg, 2015, 454 p.
  2. Bourque, S. Golotvin, M. Kisko, R. Pol and D. Argyropoulos, “Reducing the Computational Burden of Structure Generation in Computer Assisted Structure Elucidation (CASE)”, poster presented at SMASH 2024, Burlington, VT, USA.

Poster Collaboration with Stanford University

Accurate And Efficient Structure Elucidation From NMR Using Artificial Intelligence

Monday, Sept 22nd, 2025

16:00 – 17:30

Poster Number: 72
Room: Ariane, Gemini & Discovery

Frank Hu¹, Dimitris Argyropoulos², Mikhail Elyashberg², Sergey Golotvin², Matthew W. Kanan¹, Grant M. Rotskoff¹, Thomas E. Markland¹

  1. Department of Chemistry, Stanford University, Palo Alto, CA, USA
  2. ACD/Labs, Toronto, ON, Canada

Determination of molecular structures is a key bottleneck in chemical research. Acceleration of this task can greatly impact the efficiency of workflows across many chemical disciplines. Furthermore, it can be paired with automated synthetic workflows to create closed-loop discovery platforms. However, elucidating a chemical structure using only one-dimensional (1D) NMR spectra, the most readily accessible data, remains an extremely challenging problem.

This is largely due to the combinatorial explosion in the number of possible molecules as the number of constituent atoms increases: for molecules with only 21 heavy atoms, there are over 20 trillion possibilities consistent with the bonding rules of chemistry. So far, this has been addressed quite successfully using 2D correlation NMR spectra in techniques like computer-assisted structure elucidation (CASE) [1]. However, this technique relies on the availability of sufficient 2D NMR data and suffers from long elucidation times when there are ambiguities in the observed spectra.

Here we address this challenge by introducing a multitask machine learning framework capable of performing end-to-end structure elucidation. This system predicts the molecular structure of an unknown compound from its 1D 1H and/or 13C NMR spectra without relying on any prior chemical knowledge, not even the molecular formula, although supplying the formula would significantly strengthen the suggested approach [2].

Leveraging developments from the field of natural language processing and a powerful, chemically inspired pre-training protocol, we break the combinatorial scaling, unlocking accurate and efficient structure elucidation across a range of molecular sizes. Furthermore, additional information, such as molecular fragmentation, can be simultaneously recovered with high accuracy, providing additional avenues for guiding structure elucidation.

Our platform opens new possibilities in the field of ML-driven structure elucidation by introducing a fast and efficient structure elucidation framework that can operate in an unsupervised manner without relying on additional knowledge. We will detail the training of the framework as well as examples of its use in elucidating chemical structures of varying complexity.

  1. M. E. Elyashberg, A. J. Williams, “Computer-based Structure Elucidation from Spectral Data. The Art of Solving Problems”, Springer, Heidelberg, 2015, 454 p.
  2. F. Hu, M. S. Chen, G. M. Rotskoff, M. W. Kanan, and T. E. Markland, ACS Cent. Sci., 10, 11, 2162–2170 (2024).