August 31, 2021
By Richard Lee, Director, Core Technology and Capabilities at ACD/Labs
Data is king. Every company wants to leverage legacy and real-time data to make decisions. COVID-19 expedited digital transformation efforts, and companies realized that staying ahead of disruptions means generating insights and value from accessible data. It’s therefore critical that every organization, especially those in science, employ workers with data science skills.
Over time, “data scientist” will no longer be a single position within an organization. Rather, all scientists will need to incorporate into their everyday work the skills that currently fall under the umbrella of data science. Analytical chemists, for example, already use data every day: results from experiments inform decisions about compound purity and identity, and about whether to proceed with the next experiment.
A data scientist combines computer science, statistics, and mathematics to analyze, process, and model hundreds, thousands, or even millions of data points. The result is a model that supports faster decision making and planning, and helps scientists predict outcomes for more focused experiments.
Making Data Scientists the Norm
It’s clear that a skills gap exists in the industry today, and solving it starts with equipping the future generation of scientists with an understanding of how data and advanced technology support each other. For example, machine learning (ML) in the laboratory can use data to help predict potential outcomes, such as the likely success of a reaction or a suggested chromatographic method. But to be effective, it requires well-curated data, which is the domain of an experienced scientist. This capability is not widespread in the life sciences just yet. To get there, scientists need to get comfortable working with data in a machine learning mindset.
Currently, many graduates are experiencing a disconnect between what they learn in the classroom and what they do in the professional laboratory. On a recent episode of The Analytical Wavelength podcast, Michelle Hill, Program Manager at Career Ahead, spoke about readying science students for a career in the field. Michelle notes that students have difficulty realizing and articulating transferable skills such as communication, problem-solving, analytical thinking, innovation, teamwork, and collaboration.
Insufficient focus on explicitly developing these skills prevents students from seeing how their experiences transfer to future job environments. Graduates who do have a combination of technical and transferable skills are well equipped to connect dots across different contexts, which sets them up as versatile job candidates.
Introducing Coding in the Lab
With transferable skills like innovative and analytical thinking, scientists bring a peripheral vision to their work, leading to a greater willingness to try new technologies and techniques, like coding. Though computer programming isn’t currently a required course for most physical science students, it’s a valuable addition to undergraduate and graduate skillsets. And as coding becomes a more accessible skill outside of traditional software engineering degrees, a close integration between science and programming becomes possible.
Coding can take scientists from participants to owners of their data. Often, scientists conduct an experiment, evaluate its success, calculate the yield, and then archive the data in a repository. This process of one experiment leading to a single decision, while necessary, limits the application of that data within the organization.
Coding enables scientists to use data science, and eventually machine learning, to expand their use of data for trending, to gain insights, and ultimately for outcome prediction. For example, reaction optimization or catalyst screening in high throughput chemistry could make greater use of data science models. By modelling the yields of both unsuccessful and successful reactions as a function of experimental parameters, a scientist can identify the most promising catalysts and reaction conditions. When data from hundreds or thousands of experiments are combined, the statistics may reveal unexpected insights.
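To make the idea of modelling yield as a function of experimental parameters more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the catalyst names (Pd-A, Pd-B, Ni-C), the temperatures, and the yield values are invented, and the model (an ordinary least-squares fit with one term per catalyst plus a shared quadratic temperature effect) is just one simple choice among many, not a recommended method.

```python
import numpy as np

# Hypothetical screening results: (catalyst, temperature in deg C, yield in %).
# All values are invented for illustration. Low-yield ("failed") runs are kept,
# because they carry information about which conditions to avoid.
runs = [
    ("Pd-A", 60, 12.0), ("Pd-A", 80, 35.0), ("Pd-A", 100, 41.0),
    ("Pd-B", 60, 48.0), ("Pd-B", 80, 72.0), ("Pd-B", 100, 79.0),
    ("Ni-C", 60, 3.0), ("Ni-C", 80, 9.0), ("Ni-C", 100, 6.0),
]

catalysts = sorted({catalyst for catalyst, _, _ in runs})


def features(catalyst, temp_c):
    """One-hot catalyst indicator plus linear and quadratic temperature terms."""
    one_hot = [1.0 if catalyst == name else 0.0 for name in catalysts]
    t = (temp_c - 80.0) / 20.0  # centre and scale the temperature
    return one_hot + [t, t * t]


X = np.array([features(catalyst, temp) for catalyst, temp, _ in runs])
y = np.array([measured for _, _, measured in runs])

# Ordinary least squares: find the coefficients that minimise ||X @ coef - y||.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)


def predict_yield(catalyst, temp_c):
    """Predicted yield (%) for a catalyst and temperature under the fitted model."""
    return float(np.array(features(catalyst, temp_c)) @ coef)


# Rank candidate conditions inside the temperature range that was measured.
candidates = [(name, temp) for name in catalysts for temp in range(60, 101, 10)]
best = max(candidates, key=lambda condition: predict_yield(*condition))
print("Suggested next experiment:", best)
```

Because this toy model is additive, it assumes temperature affects every catalyst in the same way. With real screening data, interaction terms or a more flexible model would probably be needed, and any suggested condition should be confirmed at the bench.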
Students can enhance their future performance in the workforce by gaining more computer and technical skills, and they can also set themselves apart during recruiting. Such skills are in demand within the industry, which has recognized the existing skills gap. Adding them to your resume can only make it more attractive to future employers. If you’re in science and know how to code, you can write your own ticket for your career.
As more scientists also become data scientists, the move towards digitalized data will accelerate. Industry’s data digitalization will improve, more information will be available to make accurate decisions, and existing data will be used more broadly to find new insights.