The Balancing Act of False Positives vs. False Negatives

A few months back, I referenced Derek Lowe’s excellent blog, "In the Pipeline".

Some of the most interesting entries for me are:

Backtracking, Necessary, and Unnecessary

I Can Has Ugly Molecules?

Oops.

One of the take-home messages for me is Derek's line: "False Negatives and False Positives are waiting in your dataset, depend on it."

I think this point is largely acknowledged in the industry. For example, I think it is pretty obvious to most organizations that there is a significant number of incorrect structures in their registration database. Some will have a pretty good idea of how many, and others will have no idea.

I think the key point, however, is to acknowledge it.

This brings me to the balancing act between false positives and false negatives. Let me go back to the application of validating compound registrations with automated NMR verification, which I have blogged about quite a bit over the last year and a half.

The key point to emphasize here is that this application is NOT replacing the chemist’s analysis and QC.

Chemists are still looking at their data and registering their compounds the normal way. But is an automated validation step necessary after the fact to ensure the accuracy of the registration database?

In an earlier post I asked the question, "what is the acceptable limit of false positives and false negatives for automated verification by software for the evaluation of registered compounds in a library?"

I didn’t get any comments on the blog, but I got a few email responses and I have discussed this with some industry people over the last few months. Not surprisingly, the results vary significantly.

Of course everyone would prefer a 0% false positive rate and a 0% false negative rate. Yes, in a perfect world that would be great. But it’s not going to happen. Not ever. So decisions need to be made about how important this exercise is, and furthermore, what is deemed acceptable.

If we acknowledge that there are incorrect structures in the registration database, then the question is how many, and furthermore, how many can be removed with the use of some automation.

I think the challenge that most organizations face today when evaluating such a system is trying to balance two sides: false positives vs. false negatives. Why?

My understanding is that if you have a system that is false positive tolerant, you are sacrificing the overall quality of your organization’s registration database. On the other side of the coin, if you have a system that is false negative tolerant, someone needs to look at the flagged data manually to see whether each structure is indeed right or wrong. So there are two forces pulling here. One is knowing that, despite your efforts, you are still letting incorrect structures into your database. The other is knowing that you need a real person’s time to manually evaluate a bunch of spectra.
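
To make the two forces concrete, here is a minimal Python sketch. It assumes each compound gets a verification score from 0 to 100 and that you have a reference set where you already know which structures are right. The scores, the `count_errors` helper, and the thresholds are all made up for illustration; none of this comes from a real verification package.

```python
from typing import Sequence, Tuple


def count_errors(
    scores: Sequence[float],
    is_correct: Sequence[bool],
    pass_threshold: float,
) -> Tuple[int, int]:
    """Return (false_positives, false_negatives) for one pass threshold.

    False positive: the software passes a structure that is actually wrong.
    False negative: the software fails a structure that is actually right.
    """
    false_pos = 0
    false_neg = 0
    for score, correct in zip(scores, is_correct):
        passed = score >= pass_threshold
        if passed and not correct:
            false_pos += 1
        elif not passed and correct:
            false_neg += 1
    return false_pos, false_neg


# Hypothetical reference set: verification score and known truth per compound.
scores = [95, 88, 72, 65, 58, 40, 35, 20]
is_correct = [True, True, True, False, True, False, True, False]

for threshold in (30, 60, 90):
    fp, fn = count_errors(scores, is_correct, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this toy data, a low threshold lets wrong structures through and a high one sends correct structures to manual review, which is exactly the tug-of-war described above.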

Most spectroscopists I know do not have a lot of time for this. So of course it would be preferable to have a system that is potentially more false positive tolerant. However, the whole point of implementing the system in the first place was to identify the false positives.

Tough decision…

But at the end of the day, I believe there are systems and applications out there that can help drastically improve the quality of your registration databases.

However, the balancing act between false positives and false negatives is an issue that you are most certainly going to have to juggle.

3 Replies to “The Balancing Act of False Positives vs. False Negatives”

  1. I agree that this is a balancing act, and it is up to the organization to decide what workload it can absorb in exchange for improving the quality of the database. We have recently tried a mathematical approach to setting the score thresholds that determine what passes and what fails. The method involves setting up a two-dimensional matrix in which you analyze how many false positives and false negatives you generate as you adjust the two thresholds of a two-point, three-level (red, yellow, green) scheme. The x and y axes are the two thresholds, each ranging over scores from 0 to 100, and the surface plotted over them is the number of wrong results (i.e. false positives plus false negatives in a fixed set of compounds at a given pair of positive/negative thresholds). Because auto verification is neither perfect nor linear, this surface has maxima and minima that you can locate. So now, instead of deciding arbitrarily whether we are false negative or false positive tolerant, we set the thresholds where the overall outcome is the least incorrect. This is more acceptable when a system performs well. Setting a threshold for “ambiguous” (yellow-light) results is a little trickier, and we are still working on it. At the moment we use a surface on which we see an inflection, that is, a change in the rate at which compounds are assigned as ambiguous, to decide where this threshold is.

  2. Hi Ryan and Phil. I also wanted to bring something to your readers’ attention that both of you kind of glossed over: the balance of false negatives and false positives can also be weighed against the time and money you have to throw at the problem. If you are REALLY time crunched in your lab today, you may be willing to put up with more of these false results in order to push more compounds through an automated system.
    I know of some situations where users don’t have time to check the accuracy of a structure against the analytical data. In this case, almost any number of false positives or false negatives would be acceptable, as long as the data were at least getting evaluated.
    Maybe this was obvious, but I thought it was worth mentioning.
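
The threshold-surface idea from the first reply can be sketched roughly as follows. This is only one reading of that description, not the commenter's actual method. It assumes a reference set with known right/wrong structures and scores from 0 to 100; the names `red_max`, `green_min`, `error_surface` and `best_thresholds` are hypothetical. Because widening the yellow band always reduces the count of wrong results, the sketch also assumes a cap on how many ambiguous results the lab can review by hand.

```python
from typing import List, Sequence, Tuple

# One grid point: (red_max, green_min, wrong_results, ambiguous_results)
Row = Tuple[int, int, int, int]


def classify(score: float, red_max: int, green_min: int) -> str:
    """Three-level call: green = pass, red = fail, yellow = ambiguous."""
    if score >= green_min:
        return "green"
    if score < red_max:
        return "red"
    return "yellow"


def error_surface(
    scores: Sequence[float], is_correct: Sequence[bool], step: int = 10
) -> List[Row]:
    """Count wrong and ambiguous results for every pair red_max <= green_min."""
    rows: List[Row] = []
    for red_max in range(0, 101, step):
        for green_min in range(red_max, 101, step):
            wrong = 0
            ambiguous = 0
            for score, correct in zip(scores, is_correct):
                label = classify(score, red_max, green_min)
                if label == "yellow":
                    ambiguous += 1
                elif (label == "green") != correct:
                    # green on a wrong structure (false positive) or
                    # red on a right structure (false negative)
                    wrong += 1
            rows.append((red_max, green_min, wrong, ambiguous))
    return rows


def best_thresholds(rows: Sequence[Row], max_ambiguous: int) -> Row:
    """Fewest wrong results among grid points within the review budget."""
    candidates = [row for row in rows if row[3] <= max_ambiguous]
    return min(candidates, key=lambda row: (row[2], row[3]))


# Hypothetical reference set, for illustration only.
scores = [95, 88, 72, 65, 58, 40, 35, 20]
is_correct = [True, True, True, False, True, False, True, False]

rows = error_surface(scores, is_correct)
print(best_thresholds(rows, max_ambiguous=2))
```

The `rows` list is the surface itself, so it can be plotted or inspected directly. Raising `max_ambiguous` trades reviewer time for fewer wrong calls, which is the time/money point made in the second reply.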
