How Accurate Should NMR Predictions Be?

May 28, 2007

by Ryan Sasaki, NMR Product Manager, ACD/Labs

Perfect, of course.

But I think we can all recognize that it’s not realistic for NMR predictions to be perfect. Chemistry and nature simply have too much structural diversity to throw at us. Furthermore, even when the exact structure is in the prediction database, the result may not be perfect. Solvent, temperature, and other experimental conditions will likely lead to different results.

Wolfgang Robien pointed out several errors in the NMRShiftDB (an open source collection of chemical structures and their associated NMR shift assignments).

Our most recent findings show a significant increase in accuracy when using ACD/CNMR Predictor over ~~NMRPredict provided by Modgraph Consultants, Ltd~~ Robien’s CSEARCH algorithm.

During the compilation of NMR data to add to our prediction database, our database team finds an error rate of about 8% in the peer-reviewed literature they comb, in the form of mis-assignments, transcription errors, and incorrect structures. Unfortunately, no method is foolproof.

I posed this question specifically to a group of some of our most sophisticated users at our European User’s Meeting in Obernai, France last year. I was somewhat surprised by the overwhelming response that we don’t need to work on our ¹³C NMR predictions anymore. The consensus was that we had achieved an accuracy for ¹³C chemical shift prediction that was good enough. At the time we were reporting a prediction accuracy of 2.26 ppm for Version 10, and we have since achieved a standard error of 1.84 ppm via our internal leave-one-out analysis of our entire CNMR database (2 million chemical shifts).

Now, as much as we appreciate their advice, we will continue trying to improve the algorithms and database contents for the CNMR predictor.

However, perhaps we should spend more of our resources on ¹H NMR prediction. Again, how accurate does this need to be? Our customers at the Obernai meeting all agreed that we need to improve in this regard. It is much tougher to predict ¹H NMR chemical shifts and there is certainly room for improvement. Our latest internal leave-one-out evaluation of our entire HNMR database (over 1.5 million ¹H chemical shifts) revealed a standard error of 0.22 ppm.
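For readers unfamiliar with the term, here is a minimal sketch of what a leave-one-out standard error calculation can look like. It is not our actual prediction algorithm: the toy predictor (averaging the remaining shifts that share an environment label), the function name, the labels, and the shift values are all invented for illustration, and "standard error" is assumed here to mean the root-mean-square deviation between predicted and experimental shifts.

```python
import math
from collections import defaultdict


def leave_one_out_standard_error(entries):
    """Return the RMS prediction error (ppm) from a leave-one-out pass.

    entries: list of (environment_label, experimental_shift_ppm) tuples.
    Both the labels and the averaging predictor are hypothetical stand-ins
    for a real prediction engine.
    """
    shifts_by_env = defaultdict(list)
    for env, shift in entries:
        shifts_by_env[env].append(shift)

    squared_errors = []
    for env, shift in entries:
        same_env = shifts_by_env[env]
        if len(same_env) < 2:
            # Nothing is left to predict from once this entry is held out.
            continue
        # Hold this entry out and predict it from the remaining ones.
        predicted = (sum(same_env) - shift) / (len(same_env) - 1)
        squared_errors.append((predicted - shift) ** 2)

    if not squared_errors:
        raise ValueError("no entry could be predicted from the others")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Invented toy data, not taken from any real database.
toy_entries = [
    ("aromatic CH", 128.0),
    ("aromatic CH", 130.0),
    ("aromatic CH", 126.0),
    ("methyl", 20.0),
    ("methyl", 22.0),
]
print(f"{leave_one_out_standard_error(toy_entries):.2f} ppm")
```

The point of holding each entry out is that a shift is never predicted from its own database record, so the resulting number says something about structures the predictor has not already seen.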

So again, how accurate do these predictions need to be? I’d agree that they need to be better than this standard error. We’ve made incredible strides over the last several years, but it also needs to be understood that improvements are going to get tougher and tougher. Decreasing the standard error by each additional 0.01 ppm becomes exponentially more difficult.

So, in response to my own original question, I think that CNMR predictions need to exhibit a standard error of better than 2.0 ppm over a large, chemically diverse dataset (without significant database overlap). We believe we have met this benchmark, but will continue to improve. For HNMR predictions, I would think a standard error of 0.2 ppm would be acceptable.

Are these numbers perfect? Of course not. But they are closer than you think.

I believe that other commercial and research NMR prediction packages should set a goal to achieve these benchmarks, and we should all, of course, strive for better.

Do you agree?

Feel free to add your own expectations in the comments section.

EDIT: This conversation has continued in the following entries (in order):

http://acdlabs.typepad.com/my_weblog/2007/05/update_robien_o.html

http://acdlabs.typepad.com/my_weblog/2007/05/more_dialogue_o.html

http://acdlabs.typepad.com/my_weblog/2007/06/robiens_and_mod.html

http://acdlabs.typepad.com/my_weblog/2007/06/note-from-an-nm.html

http://acdlabs.typepad.com/my_weblog/2007/06/the_purgatory_d.html

http://acdlabs.typepad.com/my_weblog/2007/07/final-note-on-t.html