I've spent a lot of time over the past couple of years visiting both scientists and directors in Pharma to discuss automated structure verification: whether enough characterization is being done on screens, libraries, purchased compounds, etc., whether it is possible to do more, and whether companies have the personnel to do it.
Derek's post, entitled "The Infinitely Active Impurity", doesn't necessarily capture all the feedback I have gotten, but it certainly does bring up some pretty solid arguments, and some of the comments are very interesting. Here are some excerpts from Derek's post:
Yesterday's post touched on something that all experienced drug discovery people have been through: the compound that works – until a new batch is made. Then it doesn't work so well. What to do? You have a fork in the road here: one route is labeled "Blame the Assay" and the other one is "Blame the Compound". Neither can be ruled out at first, but the second alternative is easier to check out, thanks to modern analytical chemistry. A clean (or at least identical) LC/MS, a good NMR, even (gasp!) elemental analysis – all these can reassure you that the compound itself hasn't changed.
But sometimes it has. In my experience, the biggest mistake is to not fully characterize the original batch, particularly if it's a purchased compound, or if it comes from the dusty recesses of the archive. You really, really want to do an analytical check on these things. Labels can be mistaken, purity can be overestimated, compounds can decompose.
And I've seen plenty of things that have fallen apart on storage, and several commercial compounds that were clean as could be, but whose identity had no relation to what was on their labels (or their invoices for payment, dang it all). Always check, and always do that first.
So why aren't these original batches fully characterized? Cost? Acquisition times? A human required to prep the sample? To interpret the results?
And here are some comments from Derek's loyal readership, who share interesting stories, experiences, and practices:
In yet another program, we were doing an SAR study, and one particular compound didn't really fit the SAR the way we had predicted. Initially we took it in stride, and had all sorts of hand-waving arguments of why that compound might not fit the SAR. With time it became more obvious that something was rotten in Denmark – and a close look at the NMR spectrum revealed that in fact it wasn't the compound depicted, but a regioisomer.
Its probably less likely to occur these days I think due to modern tools and better upfront characterizations as suggested but I have heard of similar stories throughout my career and was involved in a several programs where the activity was from either from a minor by-product or a misidentified structure. The by-product was more puzzling at first but did lead to interesting series of analogs (before failing tox screen unfortunately). The misidentified compounds came from a wrong (mislabeled) regioisomer by supplier (so heed Derek's advice to pre-check purchased material as in this case was only obvious after a 2nd bottle gave a slightly different spectra) and another was an unanticipated chemical rearrangement during the synthesis (again only obvious after 2nd round gave less rearrangement). Although frustrating at the time these events were some of the most interesting times in the lab.
The other example came from a compound being contaminated in the Genevac. It's always a problem with shared equipment that many people don't think about. My own rule now on this is to always have LCMS and NMR (so many vendors only do LCMS) and if you get a good hit make a batch 2 on a bigger scale so you can really look at what's in there.
Anyhow, I've seen all of the above mentioned scenarios played out at some time or another. My sense is that the cost of a compound (i.e. how pure?, how sure?) should parallel the cost of the biology being done on it. You don't need EA, NMR, LC/MS, IR on each of 250,000 compounds going into a high throughput screen. And you better be mighty sure what you're putting into a dog tox study or into the clinic. If you get a high throughput hit, you need to confirm (more $) before you move forward very far (more $). There's nothing wrong with occasionally screening garbage if you don't act stupid after it hits.
My experience is that, at most companies, discovery research needs to spruce up its sloppy attitude to characterisation. (I dread to think what happens in academia, where we used to wait weeks for MS because the people were too busy doing *research* to be *bothered* with mundane structure confirmations). The approach that works in most cases has already been hinted at; synthesise batch 2, for preference by a different chemist, on a gram scale, and fully analyse (tlc and hplc) and characterise – Yes! Use C-nmr to check that all the carbons are there (!), and that their 1-bond and 3-bond couplings are logical. This is also an excellent nmr learning exercise for new BSc chemists.
The overall theme of this post, as well as of the comments, is that more diligent characterization is needed, even at high volumes.
This outlines one of the issues in this industry that we are trying to address through automated structure verification. The argument for such a system is, of course, that organizations do not have the personnel to diligently check the identity of every structure against analytical data from all the different sources. Automated QC of these results can help address a good chunk of these concerns.
One of the major concerns that comes up is whether or not we can completely eliminate false positives. The answer right now is no, and it will likely always be no. Some problems are just really challenging.
Of course, nothing beats diligent manual analysis by an analytical expert, but we all know that the time/money dilemma is inevitable, and it's the reason why many organizations are unable to practice the due diligence that Derek and some of his readers are advocating.