Linking to Meaningful Data in an ELN World

April 2, 2008

by Ryan Sasaki, NMR Product Manager, ACD/Labs

In a previous post, I asked the question: why do paper spectra continue to persist in chemistry?

Of course there is the next challenge, as Rich Apodaca points out on his Depth-First Blog in an earlier post:

The previous article in this series, suggested that the same dynamic applied to the compilation, management, and sharing of spectral data by chemists. More to the point:

… cheminformatics has failed to deliver an inexpensive, robust, and truly usable solution to the problem of compiling, managing, and sharing spectral data for scientists of average computer skills. …

To be sure, there are tools that address parts of the problem. But no solution addresses them all and that’s why scientists and publishers resort to using obviously inferior solutions like PDFs.

Whether or not organizations and groups are resorting to inferior solutions is up for debate because it of course depends on the expectations of the end user. But his comments definitely struck a chord with me.

So the next question is:

“What is the best way to connect my analytical data to my ELN records TODAY?”

By far, the most common way that I have seen organizations connect the analytical data from our software to ELNs is via PDF.

But as Rich mentions in yet another post, for people who are looking to build on experiments or model or compile the results, static PDF images are practically useless.

I couldn’t agree more.

So why do organizations choose this route?

The three biggest reasons I have heard are:

File size limitations in the ELN
The lack of a standard and supported analytical data format that is generic, open, lockable, and widely supported for years to come.
Currently, PDF is more controlled for legacy support than analytical data.

As a result, PDF is the only reasonable approach for many, and it is certainly better than not connecting to a record of the data at all.

I think the key is for vendors to work horizontally and combine their strengths to deliver, as Rich suggests:

an inexpensive, robust, and truly usable solution to the problem of compiling, managing, and sharing spectral data for scientists of average computer skills.

But the file format remains an issue.

The ASTM E13.15 Committee has been working for the past 5-6 years towards a universal analytical data file format. This format is called AnIML (Analytical Information Markup Language), a developing XML standard for analytical chemistry data. Most vendors support the general direction of ASTM E13.15 towards a universal format for analytical data.

A final note on the role of MEANINGFUL data in an electronic world. When I refer to meaningful data, I am referring to knowledge gained and stored in an actual data file as opposed to a static PDF. One of the unique features that ACD/Labs has maintained over the years is the ability to electronically assign NMR data to chemical structures to truly capture not only the data but the knowledge gained from the experiment. I think not leveraging this knowledge is an awful shame, especially in an electronic world, but I think it will come.

As of right now, while it is common for NMR spectroscopists to assign their data electronically, it is very rare to find a group of chemists in the pharmaceutical industry, for example, who routinely use their processing tools to assign their data. Why?

They might not have the right software tools
It is not required. In fact, in some cases I have learned that it is forbidden. Why spend the time it takes to assign the data if it is not required or permitted?

A static PDF is indeed proof that an experiment was run, but does it contain information that supports a proof of the proposed structure? Where is the knowledge that was gained from this exercise?
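
To make the contrast with a static PDF concrete, here is a minimal Python sketch of what an "assigned" record might carry. It is only an illustration under stated assumptions: the class names, field names, atom labels, and file path are all hypothetical, the shift values are rough illustrative numbers, and none of this represents the ACD/Labs file format or AnIML.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Assignment:
    """One peak tied to the atom(s) believed to produce it (hypothetical layout)."""
    shift_ppm: float      # chemical shift of the peak
    multiplicity: str     # e.g. "s", "t", "q"
    atom_labels: list     # made-up labels for atoms in the drawn structure


@dataclass
class AssignedSpectrum:
    """A spectrum record that keeps the interpretation next to the data."""
    experiment: str            # e.g. "1H NMR"
    structure_smiles: str      # proposed structure
    raw_data_ref: str          # pointer to the instrument file (hypothetical path)
    assignments: list = field(default_factory=list)

    def unassigned(self, observed_shifts, tol=0.02):
        """Return observed shifts with no assignment within tol ppm."""
        assigned = [a.shift_ppm for a in self.assignments]
        return [s for s in observed_shifts
                if not any(abs(s - p) <= tol for p in assigned)]

    def to_json(self):
        """Serialize the record, including assignments, to JSON text."""
        return json.dumps(asdict(self), indent=2)


# Ethanol as a toy example; the shift values are illustrative only.
record = AssignedSpectrum(
    experiment="1H NMR",
    structure_smiles="CCO",
    raw_data_ref="nmr/2008/ethanol_001/fid",
    assignments=[
        Assignment(1.2, "t", ["C1-H3"]),
        Assignment(3.7, "q", ["C2-H2"]),
    ],
)

# A record like this can answer a question a PDF image cannot:
# which observed peaks has nobody explained yet?
print(record.unassigned([1.2, 2.6, 3.7]))
```

The point of the sketch is that the peak-to-atom link stays queryable: software can check which peaks support the proposed structure, which a picture of the spectrum cannot offer.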

I think 1D NMR Assistant significantly reduces the time it takes to electronically assign a spectrum, so now it is just a matter of finding an easy way to tie this assigned analytical data to the ELN.

I think there is a real opportunity here.

What are your thoughts?

Would you prefer electronic data over PDFs?

Is simply raw or processed data enough?

How important is maintaining the knowledge gained from the experiment (i.e. assignments)?

Thanks to Rich for the multiple inspirations for this and previous posts.


2 Replies to “Linking to Meaningful Data in an ELN World”

In our particular case at Lexicon Pharmaceuticals, we actually take a dual approach to the electronic data availability issue. For high-level overview purposes, we capture the PDF. The PDF, however, is accessory data that only became available after we went to all the trouble of electronically referencing our NMR data in a database for all of our instruments. The type of data you access depends on whether you are logged into our Analytical LIMS or our compound registration database, which in turn depends on what you are doing with the data. A PDF of a spectrum generally opens much faster than the full electronic spectrum that is reprocessable. If you just want to look at the data for a few seconds, why bother to wait the 10 to 30 seconds (or more if it's 2D) to open it when all you want to do is glance it over for historical reference or other purposes? On the other hand, if you need to work up the data for a publication or patent filing, or for general examination/elucidation, then you need the full power of a reprocessing tool. Neither a PDF nor a processor solution alone will meet both needs, and of course PDF viewers are free and processor licenses are … not.
Of course, we are still providing paper output for the “2 second” glance. No need to log in and search on criteria to get your spectrum. This is the activation energy barrier that needs to be overcome. A great (but of course expensive) way to do this would be to provide each chemist with a tablet or notebook/laptop PC at their hood. I think this is also essential to make ELNs really work, so if you already have this in place, then the ability, as an organization, to eliminate the paper printout becomes stronger, since you remove the intervening step of walking from your fume hood/work bench back to an analytical lab to pick up your LCMS or NMR data.
As far as electronic assignment goes, this is being done more and more now. It is essential to improving our built-in automated NMR verification through prediction DB training, so it is being reinforced by rewarding us with improved results while simultaneously allowing us to build DBs of assignments. As we continue beyond 600 automated verifications for our registration compounds through open-access HSQC NMR data, this continues to influence our direction. Tight integration of all these components and workflows begins to bring us closer to the realization of “less paper” … but not paperless.
Phil

For those afraid of computers, or so attached to pen and paper that they can’t change to an all-electronic lab notebook, there is a review of a magic pen that gives you the OneNote benefits (organization, searchability,…) while still letting you doodle away with paper and pen.
http://e-lab-book.com/?p=372