Application Note

Building a Foundation for Autonomous Structure Verification with ASV

Automated Structure Verification at AstraZeneca

Across AstraZeneca (AZ), the journey towards automating their NMR analysis workflow began two decades ago, when they started digitalizing their NMR data. Since then, they have continued to pave the way towards their ultimate automation goals, such as a fully autonomous structure verification system, by improving and expanding upon these digitalization efforts.

2004
Digitized NMR Spectra

NMR spectra exist as scanned images or PDF files of data processed and analyzed manually.

Hard to find desired spectra
Limited audit trail/file info/metadata
No chemical shifts or J-couplings
Can only do visual comparison (no changes or expansion, poor quality)

2014
Digitalized Live Analytical Data
(for some)

Spectrus Platform applications rolled out across global pharmaceutical development.

Live data stored in a searchable database
Full parameter details, with metadata (chemical shifts, J-couplings, integrals, etc.)
Interactivity

2024
Global Analytical Database
(GAD)

Oncology R&D and Biopharma R&D discovery teams share a searchable database of live analytical data.

Automatically captures all raw NMR and LC/MS data
Chemistry across the company now has a FAIR small molecule analytical database

Amber Balazs
Director of US Analytical, Structural, and Chromatography Team & NMR Specialist
AstraZeneca Oncology R&D
Boston, USA

Richard Lewis
Principal Scientist
AstraZeneca Biopharma R&D
Gothenburg, Sweden

Produced by ACD/Labs based on talks given by Amber Balazs (April 7th, 2024 at the ACD/Labs Software Symposium at the 65th Annual Experimental Nuclear Magnetic Resonance Conference) and Richard Lewis (October 24th, 2024 at ACD/Labs’ Structure Elucidation and Verification Virtual Symposium).

Accelerating Structure Verification without Compromising Accuracy

In AstraZeneca’s Oncology and Biopharma R&D organizations, there is an effort to accelerate the make phase of the design-make-test-analyze (DMTA) cycle. To help achieve this, they were particularly interested in automating their post-purification workflow.

In this workflow, once analytical data is acquired, chemists must verify the proposed structure before they can register the compound. At the Gothenburg site alone, chemists are verifying the structures of hundreds of compounds per week. Additionally, as more of their chemistry becomes automated, the amount of data chemists must deal with continues to increase.

While there is a substantial need to accelerate structure verification, this cannot come at the expense of accuracy: an incorrect structure misleads design teams, which harms both the organization and patients.

Keeping Fully Autonomous Structure Verification in Sight

The short-term goal in implementing automated structure verification (ASV) with NMR Workbook Suite™ was to accelerate their structure verification workflow and reduce the burden of structure verification on chemists. In the long term, AZ’s goal was to eliminate the need for a human to spend time verifying known structures altogether. This means they are working to build a system that they can rely on to make decisions from analytical data. To be relied on in this way, the system must either produce accurate results 100% of the time or, if it is occasionally wrong, be able to identify those cases and prompt for more data and/or human interpretation.
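A system that may occasionally be wrong but flags its own uncertain cases can be pictured as a three-way triage. The Python sketch below is illustrative only: the function name, the 0 to 1 score scale, and both threshold values are assumptions made for this example, not details of AstraZeneca's system or of NMR Workbook Suite.

```python
def triage(match_score: float, accept_at: float = 0.9, reject_at: float = 0.5) -> str:
    """Classify one proposed structure from a single match score.

    All names and thresholds here are hypothetical. Scores at or above
    `accept_at` are treated as verified, scores below `reject_at` as
    rejected, and everything in between is deferred to a chemist.
    """
    if not 0.0 <= match_score <= 1.0:
        raise ValueError("match_score must lie between 0 and 1")
    if match_score >= accept_at:
        return "verified"
    if match_score < reject_at:
        return "rejected"
    # The middle band is where the system asks for more data or human review.
    return "needs_review"
```

Widening the band between the two thresholds trades throughput for safety: more cases go to a chemist, and fewer automatic decisions are at risk of being wrong.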

Optimizing ASV Performance

With both shorter- and longer-term goals in mind, AZ are investigating ways to further optimize the accuracy of their ASV system (i.e., minimize the number of false results).

One way to increase accuracy is to add more analytical data. However, the system needs to work in high-throughput settings, where routinely acquiring additional data is not realistic. While they use ASV in conjunction with MS data to confirm molecular formula, this doesn’t help ASV with the difficult task of distinguishing between structural isomers, which is important when analyzing reaction products. So, AZ recently undertook in-house performance testing to investigate other changes to the system or input data that could help them get closer to their goal (Table 1).

Table 1. Summarized results of in-house ASV optimization experiments.

ASV Factor	Results and Conclusions
Input Data	In isolation, ¹³C data provided the best accuracy compared to ¹H or HSQC data, possibly because of better resolution, more robust predictions due to the broader shift range of ¹³C vs. ¹H spectra, and because ¹³C shifts are less influenced by 3D conformational effects than ¹H shifts. While none of their tests in this area produced sufficient accuracy for an autonomous system, the results did lead them to conclude that they should always include HSQC data and add weighting biases to ¹³C shift assignments.
Single Structure Verification (SSV) vs. Combined and Concurrent Structure Verification (CCV)	Because they are focused on distinguishing isobaric structural isomers resulting from synthetic reactions, they tried to approach things more like a human would. Instead of looking at a single structure and trying to evaluate how well it corresponds to the data with nothing else to compare it to, they looked at how accurate the ASV system was at distinguishing pairs of compounds using the differences in the match factor (MF) score from ASV. They found that when using HSQC or ¹³C data in this way, the system approached the level of accuracy required for an autonomous system.
Peak Picking Mode	MFs improved for automatic peak picking as more data was included, but manual peak picking outperformed automatic peak picking, regardless of what data was included in the dataset. However, as it is not practical to manually pick peaks in a high-throughput environment, this underscored the importance of optimizing the advanced parameter settings to allow the system to better accommodate a wider variety of projects/circumstances without significantly increasing risk.
Prediction Algorithm	Using the neural network or Hierarchically Ordered Spherical Environment (HOSE) code algorithm provided similar performance.
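The pairwise comparison described in Table 1 can be illustrated with a small evaluation routine. This is a minimal sketch under stated assumptions, not AZ's test procedure: the function name, the `min_delta` margin, and the example MF values are all hypothetical, and MF scores are assumed to lie on a 0 to 1 scale.

```python
from typing import Dict, Iterable, Tuple


def pairwise_outcomes(
    pairs: Iterable[Tuple[float, float]], min_delta: float = 0.1
) -> Dict[str, int]:
    """Count how often MF differences separate a correct structure from an isomer.

    Each pair is (mf_correct, mf_isomer): the match factor of the known-correct
    structure and of a competing isomer, both scored against the same spectra.
    A pair is 'ambiguous' when the scores differ by less than `min_delta`.
    """
    counts = {"correct": 0, "wrong": 0, "ambiguous": 0}
    for mf_correct, mf_isomer in pairs:
        delta = mf_correct - mf_isomer
        if abs(delta) < min_delta:
            counts["ambiguous"] += 1  # too close to call automatically
        elif delta > 0:
            counts["correct"] += 1    # correct structure scored clearly higher
        else:
            counts["wrong"] += 1      # isomer scored clearly higher
    return counts


# Hypothetical MF values for three structure/isomer pairs.
example = [(0.92, 0.61), (0.70, 0.68), (0.55, 0.80)]
print(pairwise_outcomes(example))
```

Tracking ambiguous pairs separately from wrong ones matters for an autonomous system: an ambiguous pair can be routed to a chemist, whereas a confidently wrong call cannot be caught by the score alone.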

An ASV System for Current and Future Goals

Equipped with these insights, AZ began an ASV system pilot across their global discovery organization in December 2023 (Figure 1). This system automatically:

Pulls in a 1D ¹H spectrum and a carbon-edited HSQC from the GAD
Processes and analyzes the spectral data
Creates a customized review-ready report that ranks a Chemformer-generated set of chemically probable predicted reaction products
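The ranking in the last step can be sketched as a sort of candidate structures by score. The example below is a hedged illustration only: the `Candidate` class, the candidate names, the scores, and the report layout are hypothetical, and this note does not describe how Chemformer proposes candidates or how the actual report is built.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """One proposed reaction product and its (hypothetical) match factor."""
    name: str
    match_factor: float


def rank_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Return candidates ordered from highest to lowest match factor."""
    return sorted(candidates, key=lambda c: c.match_factor, reverse=True)


def format_report(candidates: List[Candidate]) -> str:
    """Render a plain-text ranking, one candidate per line."""
    ranked = rank_candidates(candidates)
    lines = [
        f"{position}. {c.name}  MF={c.match_factor:.2f}"
        for position, c in enumerate(ranked, start=1)
    ]
    return "\n".join(lines)


# Hypothetical candidate set for a single purified sample.
print(format_report([Candidate("isomer A", 0.61), Candidate("isomer B", 0.92)]))
```

Because the input is a plain list, a reviewer adding or removing a structure from the verification set amounts to editing the list and re-running the ranking.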

Figure 1. Post-purification structure verification workflow at AstraZeneca.

Chemists review the report and can interactively adjust anything they want in the live data. They can even add or remove structures from the verification set. This allows them to start analysis with an “edit & review” mindset instead of starting from the beginning with raw data, which accelerates this step of the workflow without increasing the risk of misleading downstream work with incorrect structures.

“Working together, we can implement some automation now while continuing to improve in the future.” – Amber Balazs

The Future of ASV is Bright

AstraZeneca is excited about the future of ASV. They are focused on the next steps towards an autonomous system for structure verification, such as enabling their ASV system to make “smart” suggestions to, and decisions for, the chemist. They believe the key to advancing towards their long-term goals is using all the information available (e.g., synthetic and analytical data), and that the future state of ASV will likely mix approaches. In the meantime, enabled by their previous digitalization work, they continue to expand their in-house testing and optimization with this in mind, and are seeing promising results from investigations into incorporating other kinds of analytical data, such as IR.

Other Resources

Case Study

Automated Verification of Small Molecule Structures at Sanofi

Sanofi implemented automated structure verification (ASV) to streamline ¹H NMR-based molecule validation, reducing repetitive tasks and improving efficiency. Learn more about their testing, which revealed how LIMS integration, prediction database updates, and LC/MS integration ensure reliable verification and enhance chemists' workflow.

Case Study

Automated Structure Verification for High-Throughput Quality Control in R&D

Scientists at Syngenta and Amgen rely on Automated Structure Verification (ASV) to enable high-throughput NMR quality control workflows. Explore their process from optimizing dataset selection to reducing analysis time, improving data confidence, and accelerating decision-making with ASV.

Case Study

Accelerating Structure Verification Across Novartis

Novartis implemented automated structure verification (ASV) across open access NMR instruments, reducing data analysis time for chemists by up to 90%. Learn how they are continuing to improve accuracy and how the system is helping one pharmaceutical development team focus on complex elucidations that require expert insight.
