Computer-Assisted Structure Elucidation in Routine Analysis

The elucidation of unknown structures, especially those with novel moieties found in natural products, often results in initially incorrect published structures,1 which then require either exhaustive spectroscopic analysis, full chemical synthesis or both to prove the correct structure. In many cases, the initial incorrect structure and subsequent analytical work could be avoided using computer-assisted structure elucidation (CASE).

Novel chemical structures benefit from completely unbiased evaluation of the experimental data and fit of the proposed structure. An unbiased evaluation is one in which all possible chemical structures are evaluated against the experimental data and are then ranked.

Sinensilactam A (Figure 1) is a meroterpenoid hybrid metabolite isolated by Luo et al.2 from Ganoderma sinensis, a lingzhi mushroom species found in the eastern and southern regions of China. Earlier studies of G. sinensis revealed that it contains polysaccharides, triterpenoids, alkaloids, fatty acids, nucleotides, proteins, peptides, trace elements, sterols and ganosinensins A–C, the last of which are hybrids of a triterpenoid and a prenylated phenol. Despite these observations, the presence of meroterpenoids in this species remained largely unknown. Through a detailed spectroscopic analysis, the authors deduced that the structure contained a rare 2H-pyrrolo[2,1-b][1,3]-oxazin-6(7H)-one ring system, which was confirmed through X-ray crystallographic analysis.

Figure 1 – Structure of Sinensilactam A.

CASE was proposed as a way to quantify how well the novel structure of Sinensilactam A fits with the experimental data. For structure determination, the molecular formula and reported NMR data (1H, 13C, correlation spectroscopy [COSY] and heteronuclear multiple-bond correlation spectroscopy [HMBC] NMR) were input into the ACD/Structure Elucidator Suite (ACD/Labs, Toronto, Ont., Canada). A molecular connectivity diagram (MCD) was then created based on this data, a slightly edited version of which is presented in Figure 2 (adjusted in accordance with characteristic chemical shifts).

Figure 2 – Slightly edited molecular connectivity diagram for Sinensilactam A.

Carbons 69.20, 75.90, 77.10, 84.50 and 201.60 were automatically supplied with the label “ob” (obligatory), which means that a heteroatom should exist in the first sphere of a carbon atom environment. Taking into account the number of heteroatoms (9), the same label was also manually assigned to atoms C 172.60 and C 181.40. The four sp2-hybridized carbons with 13C chemical shifts in the interval between 114.60 and 124.3 ppm were labeled “fb” (neighbor heteroatom is forbidden). The number of hydrogens attached to the carbon atoms located in the first sphere of environment of definite carbons was set in accordance with 1H signal multiplicities from the original work.2
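The labeling described above can be pictured as a small table of per-carbon constraints. The following Python sketch is purely illustrative and is not how ACD/Structure Elucidator stores its data; the class `CarbonConstraint`, the function `label_for` and the 0.005-ppm matching tolerance are hypothetical names and values chosen here. Only the chemical shifts and the "ob"/"fb" labels come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CarbonConstraint:
    """Hypothetical record of one labeled carbon (not the software's format)."""
    shift_ppm: float  # 13C chemical shift quoted in the text
    label: str        # "ob" = heteroatom neighbor obligatory
    source: str       # "auto" (set by the software) or "manual" (set by the analyst)


# Shifts taken from the text above.
AUTO_OB_SHIFTS = [69.20, 75.90, 77.10, 84.50, 201.60]
MANUAL_OB_SHIFTS = [172.60, 181.40]
# Interval containing the four sp2 carbons labeled "fb" (heteroatom neighbor forbidden).
FB_INTERVAL_PPM = (114.60, 124.3)


def build_ob_constraints() -> List[CarbonConstraint]:
    """Collect the automatically and manually assigned "ob" labels."""
    auto = [CarbonConstraint(s, "ob", "auto") for s in AUTO_OB_SHIFTS]
    manual = [CarbonConstraint(s, "ob", "manual") for s in MANUAL_OB_SHIFTS]
    return auto + manual


def label_for(shift_ppm: float, is_sp2: bool) -> Optional[str]:
    """Return "ob", "fb" or None for a carbon, under this sketch's assumptions."""
    for constraint in build_ob_constraints():
        # Assumed tolerance for matching a quoted shift value.
        if abs(constraint.shift_ppm - shift_ppm) < 0.005:
            return constraint.label
    low, high = FB_INTERVAL_PPM
    if is_sp2 and low <= shift_ppm <= high:
        return "fb"
    return None


print(label_for(201.60, is_sp2=True))
print(label_for(114.60, is_sp2=True))
print(label_for(30.00, is_sp2=False))
```

Keeping the automatic and manual labels distinguishable, as the `source` field does here, makes it easy to revisit analyst-imposed constraints if structure generation later fails.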

The MCD was checked for the presence of contradictions and the software produced the following message: “Current Molecular Connectivity Diagram (MCD) passed all tests. No updates performed.”

As determined by the software, there were no probable nonstandard connectivities (NSCs; correlations that span more bonds than the usual two or three) in the 2-D NMR data, so strict structure generation3,4 accompanied by 13C spectrum prediction was performed. In 0.7 seconds, 44 structures were generated, but all were rejected because their average 13C chemical shift deviations exceeded the 5-ppm limit. This suggested that the data contained latent NSCs that the program had not detected.
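The rejection step can be illustrated with a short Python sketch. It assumes that each candidate's predicted 13C shifts are already matched, atom by atom, to the experimental shifts, and it uses the mean absolute difference as the "average deviation"; the software's actual prediction and assignment procedures are not described here. The function names and the toy numbers are hypothetical; only the 5-ppm limit comes from the text.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple

MAX_AVG_DEVIATION_PPM = 5.0  # rejection limit quoted in the text


def average_deviation(experimental: Sequence[float],
                      predicted: Sequence[float]) -> float:
    """Mean absolute difference (ppm) between matched 13C shifts."""
    if len(experimental) != len(predicted):
        raise ValueError("shift lists must have the same length")
    return mean(abs(e - p) for e, p in zip(experimental, predicted))


def filter_and_rank(experimental: Sequence[float],
                    candidates: Dict[str, Sequence[float]]
                    ) -> List[Tuple[str, float]]:
    """Drop candidates above the limit, then sort the rest, best fit first."""
    scored = [(name, average_deviation(experimental, predicted))
              for name, predicted in candidates.items()]
    kept = [(name, dev) for name, dev in scored
            if dev <= MAX_AVG_DEVIATION_PPM]
    return sorted(kept, key=lambda item: item[1])


# Toy data: three experimental shifts and three made-up candidate predictions.
experimental_shifts = [69.2, 75.9, 201.6]
toy_candidates = {
    "candidate_A": [70.2, 74.9, 203.6],
    "candidate_B": [80.2, 60.9, 190.6],  # deviates strongly, should be rejected
    "candidate_C": [72.2, 78.9, 198.6],
}

for name, deviation in filter_and_rank(experimental_shifts, toy_candidates):
    print(f"{name}: {deviation:.2f} ppm")
```

If every candidate exceeds the limit, as happened with the 44 structures from strict generation, the returned list is empty, which is the signal that the input constraints themselves need to be relaxed.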

Fuzzy structure generation3,4 was therefore run with the supposed number of NSCs set to one and the possible lengthening of each connectivity set to one bond. This allows the software to automatically attempt to lengthen the connectivities, generating a wider range of structural possibilities. In just under two minutes, 144 structures were generated, but only two were found to be suitable candidates; these are presented in ranked order in Figure 3.
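The difference between strict and fuzzy acceptance can be sketched in Python as a check on a candidate's bond graph. This is a simplified illustration, not the generation algorithm itself: it treats each 2-D NMR correlation as a pair of heavy atoms, and the standard limit of two bonds between them is an assumed, illustrative value. The function names, parameters and the four-carbon toy chain are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Sequence, Tuple


def bond_distance(adjacency: Dict[str, List[str]],
                  start: str, goal: str) -> Optional[int]:
    """Shortest path length in bonds between two atoms (breadth-first search)."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        atom, dist = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # atoms are not connected


def satisfies_correlations(adjacency: Dict[str, List[str]],
                           correlations: Sequence[Tuple[str, str]],
                           standard_max_bonds: int = 2,
                           allowed_nscs: int = 0,
                           extension_bonds: int = 1) -> bool:
    """True if at most `allowed_nscs` correlations exceed the standard length,
    and each of those exceeds it by no more than `extension_bonds`."""
    nsc_count = 0
    for atom_a, atom_b in correlations:
        dist = bond_distance(adjacency, atom_a, atom_b)
        if dist is None or dist > standard_max_bonds + extension_bonds:
            return False
        if dist > standard_max_bonds:
            nsc_count += 1
    return nsc_count <= allowed_nscs


# Toy candidate: a four-carbon chain C1-C2-C3-C4.
chain = {"C1": ["C2"], "C2": ["C1", "C3"], "C3": ["C2", "C4"], "C4": ["C3"]}
observed = [("C1", "C3"), ("C1", "C4")]  # the second pair is one bond too long

print(satisfies_correlations(chain, observed))                  # strict mode
print(satisfies_correlations(chain, observed, allowed_nscs=1))  # fuzzy mode
```

In strict mode (`allowed_nscs=0`) the toy chain is rejected because one correlation is a bond too long; allowing a single nonstandard connectivity lengthened by one bond admits it, mirroring the settings described above.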

Figure 3 – Ranked output file for Sinensilactam A.

The first structure in Figure 3 coincides with structure 1 of Sinensilactam A as determined by the authors. The second-ranked structure differs only in its aromatic ring substitution but shows higher chemical shift deviations, making the first-ranked structure much more likely. Fuzzy structure generation thus allowed the software to circumvent a latent contradiction in the 2-D NMR data. The structure, with the 13C chemical shift assignments produced automatically by the program, is depicted in Figure 4 (the red arrow shows the nonstandard connectivity detected in the HMBC data).

Figure 4 – Sinensilactam A, with 13C chemical shifts and the nonstandard connectivity shown with a red arrow.

To spectroscopists, this is clearly a strength of a CASE system. Dr. Eugene Mazzola of the FDA Joint Institute for Food Safety and Applied Nutrition at the University of Maryland said, “Having an unbiased check of the data performed by the software can be extremely useful to ensure that all possibilities are explored and considered…. You can never prove you are right, but you can often prove that you are wrong—and it’s better to catch an error than to publish it.”

Figure 5 – Structure of guyanin.

Mazzola has published works with further examples.5 Guyanin, one of the first structures determined solely by NMR techniques, was elucidated in 1986. It was so unusual at the time that X-ray data was also used for complete confirmation. Mazzola and William Reynolds of the University of Toronto reconstructed the original data from their article (Ref. 5) to see if the ACD/Structure Elucidator Suite could determine the structure. Utilizing only HETCOR (direct H-C chemical-shift correlations) and XCORFE (longer-range H-C correlations, which preceded the HMBC experiment), the software generated a single structure, the correct one, in one second. The structure is shown in Figure 5, with both 13C and 1H chemical shifts labeled.

Dr. Mazzola explained, “A CASE program is extremely useful, since it can pick out a structure you wouldn’t have considered at all—which is important even for experienced NMR spectroscopists.”

References

  1. Nicolaou, K.C.; Snyder, S.A. Chasing molecules that were never there: misassigned natural products and the role of chemical synthesis in modern structure elucidation. Angewandte Chemie 2005, 44(7), 1012–44.
  2. Luo, Q.; Tian, L. et al. (±)-Sinensilactam A, a pair of rare hybrid metabolites with Smad3 phosphorylation inhibition from Ganoderma sinensis. Org. Lett. 2015, 17(6), 1565–8.
  3. Elyashberg, M.E.; Williams, A.J. et al. Contemporary Computer-Assisted Approaches to Molecular Structure Elucidation; RSC Publishing: Cambridge, 2012, 482p.
  4. Elyashberg, M.E.; Williams, A.J. Computer-Based Structure Elucidation from Spectral Data. The Art of Solving Problems; Springer: Heidelberg, 2015, 446p.
  5. Reynolds, W.F.; Mazzola, E.P. Nuclear Magnetic Resonance in the Structural Elucidation of Natural Products. In Progress in the Chemistry of Organic Natural Products; 2014, Vol. 100, 223–309.

Patrick Wheeler is NMR product manager, Steve Hayward is technical marketing specialist and Mikhail Elyashberg is lead software development specialist, Advanced Chemistry Development Inc. (ACD/Labs), 8 King St. E., Ste. 107, Toronto, Ont., M5C 1B5, Canada; tel.: 416-368-3435; e-mail: patrick.wheeler@acdlabs.com; www.acdlabs.com
