Discovering the Fourth Paradigm in Data Standardization

Practical Method for Managing Modern Scientific Information

Authors: Graham A. McGibbon and Andrew Anderson, ACD/Labs

Published online at GEN, April 2018.

Graham McGibbon and Andrew Anderson of ACD/Labs write about how the externalization of R&D activities and the deluge of instrumental analytical data generated daily have increased interest in analytical data standardization.


In scientific investigation, observational and instrument-derived data form the lineage of information from which knowledge is derived, and that knowledge enables managers to make strategic and tactical data-based decisions that maximize benefits and limit risks. Data exchange between organizations and data sharing within them are both necessary to communicate this “define-to-decide” lineage effectively. Digital data standards are intended to help manage the data deluge together with valuable master data, such as reference spectra. The deluge often results from the increasing volume, velocity, variety, and variability of newly acquired analytical data, which is necessary for confident, comprehensive material characterization, as described below. Such characterization in turn requires digital representation of analytical data, chemical processes, and molecular structures.

In data workflows (not only Big Data, but also analyses that generate a variety of not-quite-so-big data), two factors contribute greatly to the deluge. The first is the automation and/or parallelization of specific high-throughput analyses on particular instruments. The second is the challenging implementation of the so-called “Internet of Things” (IoT), owing to the tremendous assortment of computer-based data sources and the diversity of their parts, performance attributes, and analytical data output formats.

Ongoing heterogeneity of analytical-data formats is thus a natural hallmark of technology advancement. A recent white paper summarizes some analytical-data-standardization efforts that go beyond human-parsable, purely open generic formats such as ASCII text or XML. Standardization of ontologies is one of the interesting, yet challenging, endeavors of modern scientific information management.

Access the complete article online here.