May 7, 2018
Sherry Myles, Marketing Systems Coordinator, ACD/Labs
“If you build it, they will come.” While this line, famously adapted from the Kevin Costner film ‘Field of Dreams’, may seem like a perfect analogy for business, few can afford to build a product, rest on their laurels, and expect to keep reaping the rewards. Science, in particular, demands persistent dedication. As a company that supports the endeavors of those at the forefront of R&D, we must also keep pushing the boundaries.
When ACD/Name, our legacy nomenclature software package available since 1996, first arrived on the chemistry software market, it offered naming in English and was able to generate the correct name for most basic organic structures. Over the 22 years since its initial release, the program has seen almost annual updates and improvements. These have taken it from a basic program that could name rather simple compounds to a robust one that generates names for organic, biochemical, and some inorganic structures in 16 languages, in a fraction of a second per structure.
The very idea of a chemical nomenclature software program is the algorithmic generation of a name based on a predefined set of rules. Names for structures are not retrieved from a database, but rather generated by applying algorithms based on nomenclature rules to the structure. Our well-developed set of procedures, based on years of testing and revisions, is able to produce the correct name for a structure that has never before been synthesized. As scientists push the boundaries of the known world, we, too, must keep adapting and growing to serve their needs.
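To make the distinction between generating a name and looking one up concrete, here is a deliberately tiny sketch in Python. It is not ACD/Name's algorithm. It assumes a toy rule set that covers only unbranched chains of 1 to 10 carbon atoms, optionally ending in a hydroxyl group, written as simple SMILES strings such as CCCO. The function name generate_name and the STEMS table are hypothetical, introduced only for illustration.

```python
import re

# Toy illustration only, not ACD/Name's algorithm.
# Standard stems for unbranched chains of 1 to 10 carbon atoms.
STEMS = {1: "meth", 2: "eth", 3: "prop", 4: "but", 5: "pent",
         6: "hex", 7: "hept", 8: "oct", 9: "non", 10: "dec"}


def generate_name(smiles: str) -> str:
    """Build a name from rules for an unbranched chain such as 'CCC' or 'CCCO'."""
    # Assumption: the input is a run of 'C' atoms with an optional terminal 'O'.
    match = re.fullmatch(r"(C+)(O?)", smiles)
    if match is None:
        raise ValueError("structure is outside the toy rule set")
    chain, hydroxyl = match.groups()
    stem = STEMS.get(len(chain))
    if stem is None:
        raise ValueError("chain length is outside the toy rule set")
    if not hydroxyl:
        return stem + "ane"       # saturated hydrocarbon
    if len(chain) <= 2:
        return stem + "anol"      # locant is unnecessary for 1 or 2 carbons
    return stem + "an-1-ol"       # terminal hydroxyl takes locant 1


print(generate_name("CCCCCC"))
print(generate_name("CCCO"))
```

The names are assembled from a stem, a suffix, and a locant rule rather than fetched whole, which is why the same small set of rules covers every chain length in its range. A production system has to handle branching, rings, stereochemistry, and seniority rules, none of which are attempted here.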
Updates are essential to keep software relevant, and this is truer than ever for today’s world of chemical R&D. We previously wrote about the ‘most correct’ name for a compound, and how developments in the field of chemical compounds have necessitated an update in the very rules designed to classify them. ACD/Name provides chemical naming based on IUPAC nomenclature rules, and supports Index name generation in accordance with CAS nomenclature. What’s the difference? Read on below.
Two nomenclatures: CAS and IUPAC—Proprietary vs. Public
While having a common origin, IUPAC and CAS nomenclatures today have many differences that are noticeable even in simple names. CAS, a division of the American Chemical Society, maintains the “CAS Registry” database containing records for millions of substances, all identified with a CAS Registry Number, CAS Index name, and synonyms.1 While the principles of CAS nomenclature are published and can be implemented in other programs, the details remain proprietary and a CAS Index name is a name generated by CAS itself.
IUPAC, the International Union of Pure and Applied Chemistry, is the “universally-recognized authority on chemical nomenclature and terminology”.2 The IUPAC Recommendations on the Nomenclature of Organic Chemistry (IUPAC Blue Book) provide naming principles for all classes of organic compounds, and continue to evolve with the growing complexity of chemical objects and the need for unique and unambiguous names. ACD/Labs produced an electronic version of both the 1979 and 1993 recommendations, incorporating them into the software interface of ACD/Name and hosting them on our web site. While changes in nomenclature over the years have rendered some of the older principles obsolete, the popularity of this resource, and its usefulness to the chemical community, cannot be denied. The basic principles of IUPAC nomenclature remain unchanged, and access to these rules allows scientists to interpret older names, which are abundant in both printed and electronic sources. In the past 6 months alone these pages have been accessed by over 28,000 unique visitors, with an average of 175 people visiting the site per day to research chemical nomenclature rules.
R-188.8.131.52 Hemiacetals. Compounds with the general structure [structure image not reproduced] are termed generically “hemiacetals”. Hemiacetals are named substitutively as alkoxy-, aryloxy-, etc., derivatives of an appropriate hydroxy parent compound, such as an alcohol (see R-184.108.40.206), and by functional class nomenclature in the same way as acetals (see R-220.127.116.11) using the class name “hemiacetal”.
Figure 1: Quoted from the IUPAC Nomenclature of Organic Chemistry 1993 Recommendations as reproduced on the ACD/Labs web site.
The most recent edition of the IUPAC Blue Book was published in 2014, providing updated guidelines for naming chemical compounds and introducing the concept of Preferred IUPAC Names (PINs). ACD/Name has implemented most of the new principles in the latest version of our software, and work is ongoing in other areas. To date, these recommendations cannot be freely distributed in electronic form, but we hope that in future versions of ACD/Name, as well as on our website, we can share the latest principles of IUPAC organic nomenclature with software users and website visitors.
Nomenclature in implementation and development
2019 marks the 25th anniversary of both ACD/Labs as a company and the development of ACD/Name. Chemical nomenclature is a complex scientific area that is challenging for humans, not least because it is constantly evolving. Programming software for this task is no easier: the sheer number of nomenclature rules that must be reduced to one set of strict computer algorithms requires as much knowledge of chemical nomenclature as of computer programming. We are proud of our growing team of experienced chemists and programmers, whose efforts have produced naming software that surpasses most human experts in scope, accuracy, and productivity.
In our work with chemical nomenclature we not only implement the published rules, we also detect areas of nomenclature that are not fully defined or lack the criteria needed to produce a single preferred name. We are happy to have a long history of close communication with IUPAC nomenclature experts via Dr. Andrey Yerin, who has led the ACD/Name project since its inception and has been involved with numerous chemical nomenclature projects during his two-decade-long career as a member of several IUPAC nomenclature bodies. Dr. Yerin currently serves as a National Representative in the IUPAC Chemical Nomenclature and Structure Representation Division, and has held several other positions in the past. In this way ACD/Labs not only continues to implement chemical nomenclature but also participates in its development by reporting our findings and making proposals for improvements.
Nomenclature beyond just names
Another nomenclature task that scientists face is generating a structure from a given name. While this is generally considered to be an easier process, it is at least as tedious and prone to errors as chemical naming itself. This is why ACD/Name includes its companion Name to Structure tool. Each program has its own set of algorithms, so not every generated name can be converted back into a structure, and not every structure produced from a name can be named in turn. Name to Structure also benefits from ACD/Dictionary, a database of over 170,000 chemical names and registry numbers for thousands of drugs, pesticides, and other registered substances.
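The combination of rule-based parsing and a dictionary of known names can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Name to Structure or ACD/Dictionary work internally. The function name_to_smiles, the TRIVIAL_NAMES table and its two entries, and the restriction to unbranched chains of 1 to 10 carbons are all hypothetical choices made for the example.

```python
import re

# Toy illustration only, not the Name to Structure algorithm.
STEM_LENGTHS = {"meth": 1, "eth": 2, "prop": 3, "but": 4, "pent": 5,
                "hex": 6, "hept": 7, "oct": 8, "non": 9, "dec": 10}

# Hypothetical stand-in for a curated dictionary of non-systematic names.
TRIVIAL_NAMES = {"wood alcohol": "CO", "grain alcohol": "CCO"}


def name_to_smiles(name: str) -> str:
    """Convert a name to a SMILES string by dictionary lookup, then by rules."""
    key = name.strip().lower()
    # Step 1: names that follow no systematic rule are looked up directly.
    if key in TRIVIAL_NAMES:
        return TRIVIAL_NAMES[key]
    # Step 2: systematic names are split into a stem and a suffix.
    match = re.fullmatch(r"([a-z]+?)(ane|an-1-ol|anol)", key)
    if match is None or match.group(1) not in STEM_LENGTHS:
        raise ValueError("name is outside the toy rule set")
    stem, suffix = match.groups()
    carbons = STEM_LENGTHS[stem]
    # A hydroxyl without a locant is ambiguous for chains longer than 2 carbons.
    if suffix == "anol" and carbons > 2:
        raise ValueError("ambiguous name: hydroxyl position is not given")
    return "C" * carbons + ("" if suffix == "ane" else "O")


print(name_to_smiles("propan-1-ol"))
print(name_to_smiles("wood alcohol"))
```

The sketch rejects a name such as "propanol" because the toy rules cannot tell which carbon bears the hydroxyl group. That is one small example of why a name that looks reasonable to a reader does not always convert to a single structure.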
Chemical names, while still having a very important role in compound identification and registration, have gradually become less convenient to use due to difficulties in name generation and deciphering. One groundbreaking project undertaken by IUPAC was the introduction of the IUPAC International Chemical Identifier (InChI).3 InChI is a non-proprietary identifier for chemical substances that can be used in printed and electronic documents, which enables easier linking of diverse data compilations. When the InChI identifier was introduced, ACD/Labs was among the first to provide an integration between a drawing package and the InChI software components. We also added the ‘reverse’ InChI-to-Structure conversion, followed by InChIKey support, fully integrating InChI with our nomenclature toolset and structure drawing packages. ACD/Labs is a member of the InChI Trust, and our team not only tests new versions but also contributes to the development of InChI tools by participating in several related IUPAC projects and initiatives.4
Continued developments in nomenclature, the most recent example being the introduction of PINs, are changing many chemical names as we knew them. They are proof that the field of chemical nomenclature is ever-changing, and that software should be too. We’re delighted that researchers, IP lawyers, teachers, and students all over the world use commercial and freeware ACD/Labs nomenclature tools every day. Referring again to the Costner quote: we built it and they came. However, without the continued drive to incorporate the developments in the rapidly changing world of chemistry over the past two decades, we would have a software package better left on a dusty shelf. Instead, Name continues to prove what Gerhard Eller stated of it in a 2006 Molecules article.
“When compared with other naming software…the quality of names generated from ACD/Name is second-to-none.”
G. Eller, Molecules, 11:915–928, 2006
We continue to add new features to Name every year and work closely with the experts at IUPAC to remain abreast of new developments. As the 100th anniversary of IUPAC approaches, we are excited to share new developments with our customers and colleagues, and we look forward to what the next century will bring.
1 Chemical Substances – CAS Registry. (2018, April 30). Retrieved from http://support.cas.org/content/chemical-substances
2 IUPAC Nomenclature. (2018, April 30). Retrieved from https://iupac.org/what-we-do/nomenclature/
3 InChI Trust – developing the InChI chemical structure standard. (2018, May 04). Retrieved from https://www.inchi-trust.org/
4 Igor Pletnev, Andrey Erin, Alan McNaught, Kirill Blinov, Dmitrii Tchekhovskoi and Steve Heller. (2012). InChIKey collision resistance: an experimental testing. Journal of Cheminformatics, 4:39. https://doi.org/10.1186/1758-2946-4-39