ACD/Solubility DB
Comparison of ACD/Solubility DB with Experimental Data
Continuous improvement of the performance and accuracy of our predictors requires constant validation, and at ACD/Labs we understand the importance of good validations. Experimental data provided in a recent publication [1] gave us the opportunity to evaluate our solubility algorithm (version 11) against a dataset of 1125* diverse small compounds. A statistical analysis of this data was carried out, and the results are reported below.
LogS (v11) vs. Experimental Measurement
Scatter plots showing results of ACD/Solubility DB prediction accuracy vs. experimental values.
N = 1125; R² = 0.89; Av. Error = 0.47; Std. Error = 0.67
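For readers who want to run the same kind of check on their own compounds, the summary statistics above can be sketched as follows. This is a minimal illustration, not ACD/Labs' procedure: the page does not define "Av. Error" or "Std. Error", so the sketch assumes they mean the mean absolute error and the root-mean-square error in log units, and takes R² as the squared Pearson correlation. The function name `regression_stats` and the sample values are hypothetical.

```python
import math


def regression_stats(experimental, predicted):
    """Summarize agreement between experimental and predicted LogS values.

    Assumptions (not confirmed by the source): "average error" is the mean
    absolute error and "standard error" is the root-mean-square error.
    """
    n = len(experimental)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_x = sum(experimental) / n
    mean_y = sum(predicted) / n

    # Sums of squares and cross-products for the Pearson correlation.
    sxx = sum((x - mean_x) ** 2 for x in experimental)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(experimental, predicted))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    r_squared = sxy ** 2 / (sxx * syy)

    # Residuals: predicted minus experimental.
    residuals = [y - x for x, y in zip(experimental, predicted)]
    avg_error = sum(abs(d) for d in residuals) / n
    std_error = math.sqrt(sum(d * d for d in residuals) / n)

    return {"n": n, "r2": r_squared, "avg_error": avg_error, "std_error": std_error}


# Toy values for illustration only; these are not taken from the dataset.
toy_experimental = [-1.0, -2.0, -3.0, -4.0]
toy_predicted = [-1.5, -2.0, -3.5, -4.0]
toy_stats = regression_stats(toy_experimental, toy_predicted)
```

Other definitions, such as a sample standard deviation of the residuals, would give slightly different numbers, so the definition used should be stated alongside any reported figures.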
The performance of ACD/Solubility DB, v11, when this solubility data is binned (into soluble, partially soluble, and insoluble compounds) is reviewed separately.
*Note: This data had previously been reported for v10 on a dataset of 1144 compounds provided as an SDfile by the authors of the publication. Close examination of the SDfile, however, revealed duplicate compounds (an average was used when experimental values differed for the same structure) and entries with ambiguous names and identical SMILES strings. Cis/trans isomers were also removed to avoid biasing the results, since the software treats structures as 2D. In total, 19 compounds were removed from the dataset, leaving 1125.
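The duplicate handling described in the note (averaging experimental values reported for the same structure) can be sketched as below. This is a simplified illustration under stated assumptions, not the curation procedure actually used. It treats two records as the same structure only when their SMILES strings match exactly, it does not canonicalize SMILES, and it does not cover the removal of ambiguous entries or cis/trans isomers. The function name `merge_duplicates` and the sample records are hypothetical.

```python
from collections import defaultdict


def merge_duplicates(records):
    """Collapse duplicate structures, averaging their experimental LogS.

    `records` is an iterable of (smiles, log_s) pairs. Structures are matched
    by exact SMILES string, which is an assumption made for this sketch.
    Returns a dict mapping each distinct SMILES to its mean LogS.
    """
    grouped = defaultdict(list)
    for smiles, log_s in records:
        grouped[smiles].append(log_s)
    return {smiles: sum(values) / len(values) for smiles, values in grouped.items()}


# Toy records for illustration only; the LogS values are made up.
toy_records = [
    ("CCO", 0.5),
    ("CCO", 1.0),  # same structure, different reported value
    ("c1ccccc1", -1.5),
]
toy_merged = merge_duplicates(toy_records)
```

In practice the SMILES would first be canonicalized with a cheminformatics toolkit, so that different spellings of the same structure are grouped together.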
References
1. J. S. Delaney, J. Chem. Inf. Comput. Sci., 2004, 44, 1000-1005.
This comparison is by no means a comprehensive study of solubility or of prediction accuracy; rather, it is an internal benchmark that ACD/Labs uses to track the evolution of the product and the improvement of the algorithm. Our clients' accuracy needs vary with their applications, and they are always encouraged to test the software against their own data when contemplating deployment options.