We all know that data is powerful, but are you maximizing your data?

Our very own Jesse Harris recently attended ACS Fall 2023, where the theme this year was “Harnessing the Power of Data”. Not only did he present, but he also conducted a survey to gather intel about the use of chemical software. Speaking with people from a variety of backgrounds, Jesse delved into the details of how analytical data processing software is used and the challenges that come with it.

This conversation with Jesse lets us dive a little deeper into the responses to his survey and discuss some data handling highlights from ACS.


Jesse Harris  00:00

That means that they have this sort of patchwork of a bunch of different tools that they’re switching back and forth between in order to get their work done. And while that can be a solution to saving a bit of money, it leads to some very serious problems around, you know, file compatibility, file storage issues, there’s some inconsistency in how your data has been processed, maybe. So I mean, obviously, the specifics are going to matter on the particular application that you’re talking about, but it does lead to a lot of challenges.

Baljit Bains  00:46

I’m sure all our listeners know that data is powerful, but are we maximizing the power of our data?

Sarah Srokosz  00:53

As scientists, we know that data is gathered pretty much all day every day. However, that data needs to be easily accessible and shareable to be of use. The software tools and technical training around these can help bridge gaps in analytical data management.

Baljit Bains  01:08

Hello, everyone, I’m Bally.

Sarah Srokosz  01:10

And I’m Sarah. We’re hosts of the Analytical Wavelength, a podcast about chemistry and analytical data brought to you by ACD/Labs.

Baljit Bains  01:19

You may have noticed that one of our usual hosts, Jesse, is not here. We’re mixing things up with Jesse joining us as a guest on today’s show.

Sarah Srokosz  01:27

He recently attended ACS Fall 2023 in San Francisco, where he presented a talk and a poster and spoke to attendees at the ACD/Labs booth about how they harness the power of their data.

Baljit Bains  01:39

He also conducted a survey about how scientists are using chemical software and obtained some interesting results regarding what students and others in academia were doing with that chemistry software and the various limitations they faced. He joins us today to share his insights. Hi, Jesse, welcome to the podcast today as a guest, how are you doing?

Jesse Harris  01:59

Doing great. Yes, a longtime listener, first time caller, I guess, you know, something along those lines, but yeah no, very, very happy to be here on the other side of the microphone.

Baljit Bains  02:09

Good. We usually ask our guests an icebreaker question. And you’re probably familiar with this because you usually ask the question, but what is your favorite chemical?

Jesse Harris  02:17

My favorite chemical today, I think I’m gonna go with caffeine. I know that this is an answer that others have put out there as well. But you know, it does some very good work, I will say. So big fan of caffeine, I’m enjoying some coffee here this morning. And it’s going to be doing a lot of work for me this week. So yes. Big fan of caffeine.

Baljit Bains  02:37

That’s fantastic. To get right into the questions. What are some of your highlights from ACS?

Jesse Harris  02:43

Yeah, so the American Chemical Society Conference happened in San Francisco back in August. It’s a massive conference, for those who haven’t been. This is actually my first time going to a conference for ACD/Labs. And I think that it was something like 10,000 to 15,000 people who attended, which is just wild to me, having so many people kind of in the same place. But yeah, a lot going on there. I got to see the exhibition hall a bunch, but didn’t get to see as many of the talks as I would have liked. But I did get a chance to sit in on a couple of the sessions that were really interesting, talking about what companies are doing with their data and things of that nature. So it was a lot of fun, and the food was fabulous there as well. So a very enjoyable trip overall, and a really fascinating conference.

Sarah Srokosz  03:35

That’s great. So you mentioned a few people spoke about how they’re using their data. And the theme was harnessing the power of data. So how did they incorporate this overall into other aspects of the event?

Jesse Harris  03:46

So definitely for some of the sessions that I was in, there were only a small handful that I was able to attend, unfortunately. But there was a whole session that was talking about FAIR data usage. So that’s Findable, Accessible, Interoperable and Reusable. There’s a really great paper on that subject, that will be linked in the show notes later on, for people who aren’t familiar with the concept. But the idea there is about basically being able to access data, analytical data, so that it is useful for folks later down the line. And there was a session about that, that was mainly focused on what that means in academia, and having data be accessible there.

There was a session that I presented in, actually, that was very interesting, very informative. I was speaking about Excel and the downsides of it, the limitations of it in pharmaceutical development. And there were a couple other folks who were presenting in that same session, and IBM was there talking about some of the stuff that they’ve been doing. So it’s really clear that there was a lot of enthusiasm and interest. Roche, I think, was presenting there as well about some of their work with AI and using predictions as opposed to experimentation. So it’s really clear that the industry side especially is doing a lot in terms of data. But then I also participated in a poster session where I got a chance to talk with some librarians, actually, at some universities, who are often kind of responsible for managing some of the software and teaching graduate students how to use the software that they use in the course of their research. And it was very clear from talking to them that the lessons have not always translated into academia, which is part of the reason why I’m so happy that we got a chance to do this survey, getting a sense of what students and academics have been doing with their chemistry software, and maybe some of the limitations of what’s going on there.

Baljit Bains  05:43

So before we get into the results of the survey on chemical software, can you give us a glimpse into some of the conversations that you had? What field or industry were people coming from? What were the main areas of either excitement or concern?

Jesse Harris  05:57

Yeah, so that would be good. There was the one that I highlighted there, with some librarians. One of the things that kind of came up there a lot was that they felt that, a lot of the time, students don’t really have a lot of places to learn about the software options that they have available, or that they are often kind of picking it up from just other graduate students within their lab, or using whatever the university is willing to pay for, which is often the cheapest alternative that’s possibly available. And when you are kind of making decisions that way, in terms of the software that you end up using, you’re either not getting sufficient training, or you’re not using the tools that are the most up to date. And so when I talked to them a lot about that, they were all saying that this is something that students are asking for and are interested in.

Also, recently, there’s a blog that came out on our website that was based on a conversation with a Professor from the University of Toronto, talking about some of the uses of ChemSketch there in academia. And some of the things that came from that conversation were also around how students these days oftentimes don’t even have very much technical background. I think there was a stereotype for a long time that young kids would always be more technologically savvy than the people who came before them. And that, you know, we would have a bunch of little tech geniuses running around. But that’s not kind of how technology has transmitted over time. And there’s a lot of students these days who don’t have as much training and as much background in these technical programs. I mean, there’s some that do, there’s some that are fabulously, you know, ahead and doing really well with that, but that’s definitely not uniform. So I think there are some big questions about what is being done to teach students the technical skills, the software and digital skills in the chemistry area, so they’re gonna be successful in their career, no matter what that ends up being, either in academia or in industry.

Sarah Srokosz  07:52

Yeah, I think that is really well represented. In one of the questions in the survey, 94% of respondents, both academic and non-academic, agreed or strongly agreed that students should receive more training and experience using chemical software. And in particular, those in academia overwhelmingly chose strongly agree. So what do you make of these results, based on your conversations? Is it just that, you know, younger generations are less comfortable with tech than we, you know, kind of anticipated that they would be? Or is there something else that is missing there?

Jesse Harris  08:29

Yeah, this was one that I was surprised by. Even based on my conversations, it was really overwhelming. So let me just quickly mention, though, about the survey, it was kind of a small survey, so I don’t want to overreach into any of the particulars of the numbers. I think that sometimes when people get surveys like this, they can see, you know, oh yeah, exactly 95% of people are saying something like this. But I think what we can say from this, specifically, is that it’s overwhelmingly in this direction, both for the people in academia and outside academia. And also, within the people who were answering, there was a mix of people who were professors, graduate students, and undergraduate students. So it was a nice cross section there.

But I think that basically, there’s universal agreement that there needs to be more training on this, and that ends up being something that I was kind of just surprised by. Like, why is there such a gap here? And why is nothing able to fill it, especially if the students want this kind of education? Because they were very overwhelmingly, once again, saying that they needed more of this training. Why aren’t universities filling that need, just in a basic, you know, consumer way of thinking? So it’s really fascinating to me as to why. I would love to have more chances to talk to all of them about why they are thinking this is the case, but it’s pretty clear to me that there’s a big desire from across the board of wanting more digital skills from our chemistry students.

Baljit Bains  10:00

From some of the other questions in the survey, you found that the large majority of respondents are using more than one piece of software in their work. So did you get kind of an idea of what kind of software they’re using? How does this vary based on the field or the industry they’re in? What kinds of challenges can this potentially pose to them?

Jesse Harris  10:18

Yeah, this is something that we’ve seen a lot in the other survey work that we’ve done too, that most chemists are using a bunch of different tools. And one of the reasons that I really wanted to ask this question was actually based once again, on that conversation that I had with the professor from the University of Toronto, because one of the things that he sort of mentioned there was that he’s switching between a lot of different programs to do the work that he’s doing, and that a lot of academics, especially because they’re trying to use, you know, free software, whenever they can, it means that they have this sort of patchwork of a bunch of different tools that they’re switching back and forth between in order to get their work done. And while that can be the solution to saving a bit of money, it leads to some very serious problems around file compatibility, file storage issues, there’s some inconsistency in how your data has been processed, maybe. So I mean, obviously the specifics are gonna matter on the particular application that you’re talking about. But it does lead to a lot of challenges that are downstream of that. And that maybe choosing a software package that is able to do more would be a good way to improve the efficiency and the quality of the work that you’re doing there.

But in terms of how things varied between industry and academia, I think that we saw that the number of different tools that they were using for software was kind of similar, but in terms of which software they were using, it seemed to be quite different.

For the academics, I think that some of the most popular ones, like chemical drawing, were popular for both of them, and chemical property prediction was similarly popular. But for academics, molecular modeling and computational chemistry was much more popular compared to the non-academics that we talked to.

But then the non-academics were saying that analytical data processing was one of their major uses, which was a little bit surprising to me, not because they were using it but because the academics weren’t. And what it made me think is that maybe people who are using these various analytical methods are using instrumentation, or they’re using software that might be attached to the instrument or on the computer that’s attached to the instrument. And they maybe do not think of it as being software exactly. They just kind of see it as being part of the same tool, maybe. So it was a little bit surprising to me that that was relatively low, because it was about 40% of the academics who were saying that they were using that, while 80% of the non-academics were using it. That’s a pretty big gap there. But yeah, and then another one, too, is that the academics are starting to use electronic lab notebooks. From what I heard, about 20% of them said that they were using electronic lab notebooks, and by the time that I was in university, I didn’t know anybody who was using an ELN at that point. So that might also be kind of a response from some of these graduate students who are trying to get ready for the workforce, getting used to using these technologies.

Sarah Srokosz  13:16

Wow, yeah, all that is really interesting. If we hone in a little bit on the analytical data processing software that you mentioned, because obviously, that’s our bread and butter. What do people’s workflows look like? So like, where are they using the software? You mentioned that they might be using the software attached to the instrument? Is there anywhere else that they do it? And then where are they storing the processed data and their insights from their analysis?

Jesse Harris  13:42

Yeah, this is one where there was a pretty big difference in what we saw between the academics and the non-academics. For the non-academics, we saw that there was a pretty even mix between storing their data in centralized places, or using cloud based solutions, and on dedicated lab computers, as well as on computers that are personal. By that, I mean one that’s associated with you, and maybe not your home computer, but a dedicated work computer, maybe. Compared to in academia, where the majority of folks seem to be using just their personal computer or maybe one other place. That has some advantages and disadvantages. It means that your data is easier to find, because it’s often going to be in fewer locations, though there is some overlap, once again, like we didn’t see them storing across multiple locations still. But you also have the downside then of, you know, how will people access your data once you’re gone? Because that’s one of the things about academia, is that you have these graduate students who are showing up and they do their work for a few years, but then they leave, you know, they’re not gonna be staying in the lab for 10, 20, 30 years for the most part. I mean, obviously, if you’re a professor that’s a little bit different, but the graduate students are storing on their laptops and such as well, is the impression that I’m getting here. So that’s going to be a concern that’s more of the downstream bit.

In terms of where they’re processing their data, it seems to be kind of a mix between using software that’s accessible on their dedicated computers, software that’s accessible on, you know, a computer that’s associated with a laboratory instrument, or using browser based applications. So they’re doing it across all of those. Now, it’s hard to tell exactly if they understood that we’re talking about analytical data here, specifically, because it’s possible that they might have just thought, okay, I’m using browser based applications for doing other chemistry research related things. So I was pretty surprised by seeing 30% in both categories, academics and non-academics, saying that they were using browser based applications for processing their data, because I don’t think they’re really quite that widespread yet. But maybe I’m mistaken there. But definitely there are people using a lot of different places. We see that same type of pattern that we’ve seen before, with a lot of different tools, data being stored in a lot of different places, and the challenges in interoperability and management that come from using a lot of different tools.

Baljit Bains  16:19

That’s great. It sounds like you got a lot of good insights from these conversations that you had. Is there anything else that stood out to you from the results of the survey?

Jesse Harris  16:26

Yeah, so there was, I think, one more insight that I kind of noticed, and it was the number of analytical methods being used and the type of analytical data that folks are using too. It was something that we saw in academia, that they were on average using 1.8 methods, while in non-academia they’re using 2.6. So about a difference of 1: in academia, most are using 2 and some using 1, and then in non-academia most are using 3 and some using 2. And that bump up in complexity, once again, is I think one of those places where maybe there’s a gap in the preparation for the academics, of getting used to what will be necessary for succeeding in industry.

And then even in the types of analytical data that they tend to be using, I was kind of surprised by chromatography being a big difference there. The non-academics seem to be using a lot more chromatography than the academics with roughly 40% saying that they were using it in academia and almost 80% saying in non-academia, so not quite double, but pretty close. And so that’s just one of those, once again, these types of data that I think that the students should be prepared to use when they’re getting into industry.

And just across the board, it felt as if there were a lot of opportunities here for students to get their hands on more data and more software, and getting better quality stuff. That makes a big difference in them being up to the standards of what is expected in industry. So as of right now, I would say that there’s a lot of students who are not getting the preparation that they kind of need and deserve in order to get in there. And I would think that there’s opportunities for improvement.

Sarah Srokosz  18:10

Absolutely. Well, thank you so much, Jesse, for joining us as a guest today on the podcast and sharing a little bit about this fall’s ACS.

Jesse Harris  18:19

Yeah, my pleasure, lovely to chat with you and to share what I’ve seen. And I would, of course, love to hear back from anybody on LinkedIn or anywhere else, if you have thoughts on how people are using data, especially from people who are in academia, about what you see the trends being there, because you know, all of that will inform things for us, getting a better sense as to what the needs are there. So I would love to continue this conversation elsewhere.

Sarah Srokosz  18:44

Thank you, Jesse, for your insights into the responses to your survey. It was really interesting to hear from someone on the ground at ACS and about where the gaps in learning and teaching are. This certainly provides some food for thought for educators and insights as to how we can better support early career scientists.

Baljit Bains  19:01

That’s all for today. Thanks, as always for spending time with us. And don’t forget to subscribe through your favorite podcast app. If you have been enjoying the show, we would really appreciate it if you would recommend it to a colleague or share it on social media.

Sarah Srokosz  19:13

See you next time.

The Analytical Wavelength is brought to you by ACD/Labs. We create software to help scientists make the most of their analytical data by predicting molecular properties, and by organizing and analyzing their experimental results. To learn more, please visit us at www.acdlabs.com
