In this episode, ACD/Labs Senior Director of Technical and Scientific Services, Karim Kassam, joins hosts Jesse and Bally to dive deeper into the importance of ionization. He gives us some insight into the tools available to help accurately predict pKa, along with some hints and tips to help improve accuracy in predictions.

Read more about the importance of ionization in pharmaceutical R&D in this blog post: The Importance of Ionization in Pharmaceutical R&D

Find out how a deeper understanding of ionization can help improve chromatographic separations in this blog post: Improve Chromatographic Separations: Consider Mobile Phase pH & Analyte pKa

Read the full transcript

Baljit Bains  00:15

Hey, Jesse, do you know what the proton said to the neutron?

Jesse Harris  00:19

I don’t know? What did they say?

Baljit Bains  00:22

I’ve got my eye on you.

Jesse Harris  00:25


Baljit Bains  00:27

Anyway, cheesy jokes aside, we are here today to talk about all things ionization. In the simplest terms, ionization is the process of an atom or molecule losing or gaining electrons to become either positively or negatively charged.

Jesse Harris  00:41

Most small molecule drugs contain ionizable groups, so it is essential to understand the importance of ionization, especially in pharmaceutical research and development, where changes in ionization can have a huge impact on a drug’s pharmacokinetic properties.

Baljit Bains  00:58

Hi, I’m Bally.

Jesse Harris  01:00

And I’m Jesse. And we’re your hosts for today’s episode of The Analytical Wavelength, a podcast brought to you by ACD/Labs.

Baljit Bains  01:09

Well, this episode, we’re joined by our senior director of technical and scientific services, Karim Kassam.

Jesse Harris  01:15

Karim is sharing insights into the importance of ionization and the tools available to help accurately predict pKa.

Baljit Bains  01:23

Let’s jump right in, shall we? Hi, Karim, thanks for joining us today. How are you doing?

Karim Kassam  01:30

Good, good Bally. It’s a pleasure to be here.

Baljit Bains  01:32

Great, and since this is not your first time on our podcast, you may be familiar with the very first question we’re going to ask you, which is, what is your favorite chemical? Last time you gave us an answer, which was serotonin. Just wondering if you stand by that? And if your answer has changed?

Karim Kassam  01:49

Well, as I mentioned previously, I find neurotransmitters to be incredibly interesting. It’s fascinating how our body can synthesize these chemicals in response to internal or external stimuli, and how these chemicals can have such an impact on many factors of our lives, such as our moods, desires, our motivation, and overall well-being. Out of the class of neurotransmitters, I’d say serotonin is still my favorite. It’s traditionally known as the feel-good neurotransmitter, and who doesn’t like to feel good? And serotonin is also an appropriate choice for today’s podcast, because it contains two ionizable centers.

Jesse Harris  02:32

Yes, it does. And we’re having a whole conversation today talking about ionization, which is, of course, a theme that I’m sure almost every chemist is going to be familiar with in some degree. But we want to kind of get a little bit deeper into that topic here, and why it’s important and why you should be using software to help predict your pKas. So with that context then, how about you give us a quick summary of what pKa and ionization are?

Karim Kassam  03:01

Sure, Jesse. So pKa is the negative logarithm of the acid dissociation constant of an ionizable site within a substance. This property is influenced by the local structural and electronic environment of the molecule, and it measures the strength of an acid in solution. So stronger acids have lower pKa values, while weaker acids have higher pKa values. The reason why this is important is that the pKa of an ionic site and the pH of the solution determine the relative ratio of the ionized and non-ionized forms of the molecule that are present. Another very important molecular property that is related to ionization and pKa is logD, the distribution coefficient. This property represents the lipophilicity of a molecule with respect to the solution’s pH.
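
As an illustration of the pKa and pH relationship described here (not part of the recording), the Henderson-Hasselbalch equation gives the ratio of ionized to non-ionized forms for a single ionizable site. The sketch below is a minimal example under that single-site assumption; the function name `fraction_ionized` and the acetic acid pKa of 4.76 are chosen for illustration only.

```python
def fraction_ionized(pka: float, ph: float, acid: bool = True) -> float:
    """Fraction of a single ionizable site that is charged at a given pH."""
    # Henderson-Hasselbalch: for an acid, [A-]/[HA] = 10**(pH - pKa);
    # for a base, [BH+]/[B] = 10**(pKa - pH).
    exponent = (ph - pka) if acid else (pka - ph)
    ratio = 10.0 ** exponent
    return ratio / (1.0 + ratio)


# Illustrative values: acetic acid (pKa about 4.76) at three pH values.
for ph in (2.0, 4.76, 7.4):
    print(f"pH {ph}: {fraction_ionized(4.76, ph):.3f} ionized")
```

When the pH equals the pKa, the two forms are present in equal amounts, so the function returns 0.5. Molecules with several ionizable sites need a fuller treatment than this single-site sketch.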

Baljit Bains  04:01

That’s great. That’s a good summary to start off with, Karim. And why is it so important that we understand ionization?

Karim Kassam  04:12

Well, ionization provides insights into the behavior of molecules in solution, and is commonly utilized in various fields such as food chemistry, environmental chemistry, consumer product design, and most importantly, in chemical and pharmaceutical research and development.

Jesse Harris  04:32

So then, what are the applications of ionization to these fields, in R&D, synthetic chemistry, chromatography, etc.? Can you explain a little bit about that?

Karim Kassam  04:41

Yeah, so Jesse, while there are many, I’ll highlight three applications of ionization in R&D. The first will be synthetic chemistry. The second is pharmaceutical discovery and development. And the third, I’ll discuss the impact of ionization on chromatography. So for synthetic chemistry, understanding ionization is essential for designing and optimizing synthetic routes, predicting reaction outcomes, and controlling the relative… the reactivity of molecules. It provides chemists with valuable insights into the behavior of molecules during chemical transformations, ultimately facilitating the development of new compounds and materials with desired properties. For example, the carboxylic acid group in a molecule is more susceptible to nucleophilic reactions in its non-ionized form. So chemists can exploit this information to impact the selectivity of a chemical reaction to produce the desired transformation.

In drug discovery and development, pKa and logD, which you heard about before, are essential factors that need to be considered during drug candidate selection. For example, when selecting an orally administered drug, it’s desirable for a molecule to be ionized in the acidic environment of the stomach. This prevents excessive absorption in the stomach, which can cause side effects and reduced bioavailability. However, once the drug reaches the small intestine, which is at a higher pH, it’s preferable for the drug to be predominantly non-ionized. Non-ionized molecules can more readily permeate the lipid membrane of the intestine cells, facilitating absorption into the bloodstream. Now, once the drug gets to the intended site of action, pKa can also play an important role in the binding of that molecule to the target protein. So, ionization affects various aspects of drug development, including absorption, distribution, metabolism, and even excretion. Therefore, focusing on ionization and its influence on the properties of a molecule is vital for drug discovery and development.

And the final thing that I wanted to discuss is chromatographic separations. So in chromatographic separations such as HPLC, having more than one dominant ionic form of a molecule present during the separation leads to peak broadening, and sometimes even peak splitting. This leads to issues of reliability and reproducibility of chromatographic separations. So it’s very important to select a pH range that is not close to the pKa of any one of the components within a separation. Now for this, ACD/Labs has a tool within its method development software, which predicts and displays the overlaid logD curves for the components within the separation, and therefore guides the user to select a pH range where the separation will be reliable and reproducible.
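
To make the idea of choosing a pH "not close to" any pKa concrete (an illustration, not part of the recording and not the ACD/Labs tool), the sketch below lists the pH windows that stay a chosen distance away from every component pKa. The 2-unit margin, the pH 2 to 8 working range, and the function name `safe_ph_windows` are all assumptions made for this example; the speaker does not give specific numbers.

```python
def safe_ph_windows(pkas, margin=2.0, ph_min=2.0, ph_max=8.0):
    """Return (low, high) pH intervals at least `margin` units from every pKa."""
    # Each pKa rules out the interval (pKa - margin, pKa + margin).
    excluded = sorted((p - margin, p + margin) for p in pkas)
    windows = []
    cursor = ph_min  # lowest pH not yet classified
    for low, high in excluded:
        if low > cursor:
            windows.append((cursor, min(low, ph_max)))
        cursor = max(cursor, high)
        if cursor >= ph_max:
            break
    if cursor < ph_max:
        windows.append((cursor, ph_max))
    return windows


# Hypothetical mixture with component pKa values of 4.5 and 9.5.
print(safe_ph_windows([4.5, 9.5]))
```

A narrow or empty result suggests the components cannot all be kept in a single dominant ionic form within the working range, which is where overlaid logD curves give a more detailed picture.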

Baljit Bains  08:00

Thanks Karim, you’ve kind of touched on the next question that I’ve got. You’ve highlighted the importance of ionization very nicely for us. And you’ve mentioned that there’s a tool within the software that can be used to help with logD, but are there any tools for helping with pKa prediction or calculations?

Karim Kassam  08:21

Yes, the experimental determination of pKa is something that’s time consuming and costly. So a number of computational approaches have emerged over the past few decades. These include quantum mechanical calculations, techniques such as density functional theory and ab initio calculations. Now, these can be very accurate, but also take significant amounts of time and computational resources and are typically slow, especially for larger molecules. There’s empirical methods; these are methods such as group contribution or linear free energy relationships, such as Hammett equations. And these techniques use experimental data on similar molecules to predict pKa based on structural features and substituent effects. And finally, there’s machine learning, which can be trained on large data sets of experimental pKa values and molecular descriptors. So ACD/Labs pKa predictions, which are part of the Percepta platform, use a combination of the latter two. The Percepta platform which we offer has physicochemical property predictions, ADME predictions, and toxicological predictions all underneath one platform. There’s a number of ways that these predictions can be deployed. We have the standard desktop installations. We also have batch predictions for calculations of very large numbers of molecules. We have Percepta Portal, which offers the same calculations via a web browser. And finally, there’s the Percepta kernel, which can be deployed in a cloud environment or on on-prem servers, has load balancing, and can be integrated with a web interface, with in-house informatics systems, or with third-party software.
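
For readers unfamiliar with the linear free energy relationships mentioned here, the sketch below (an illustration, not part of the recording) applies the textbook Hammett equation to para-substituted benzoic acids. It is not the Percepta algorithm. The substituent constants and the parent pKa of 4.20 are approximate textbook values, and the name `hammett_pka` is made up for this example.

```python
# Approximate textbook Hammett sigma constants for para substituents.
SIGMA_PARA = {"H": 0.00, "NO2": 0.78, "Cl": 0.23, "CH3": -0.17, "OCH3": -0.27}

# Benzoic acid in water at 25 C is the reference series, so rho is 1.0 by definition.
BENZOIC_ACID_PKA = 4.20
RHO_BENZOIC = 1.0


def hammett_pka(pka_parent: float, rho: float, sigma: float) -> float:
    """Hammett estimate: pKa = pKa(parent) - rho * sigma."""
    return pka_parent - rho * sigma


for name, sigma in SIGMA_PARA.items():
    estimate = hammett_pka(BENZOIC_ACID_PKA, RHO_BENZOIC, sigma)
    print(f"para-{name}: estimated pKa {estimate:.2f}")
```

Electron-withdrawing groups (positive sigma) lower the estimated pKa, and electron-donating groups raise it. This is the kind of substituent effect an empirical method encodes from experimental data.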

Jesse Harris  10:25

There’s definitely a lot of options that can meet a wide variety of needs there. So that’s great. But I did want to kind of drill into this question of like the importance of the accuracy of pKa predictions, because I mean, obviously, like, we have a variety of tools, there’s a lot of other tools out there, as well, of course, for it, but like, why do you think that it’s important to have very good pKa predictions? Like why is it not sufficient to have just something that’s like kind of just in the ballpark, but you like why do you want higher accuracy, especially in these higher end environments that we’re talking about?

Karim Kassam  10:59

It’s a great question, Jesse. So as we discussed earlier, pKa values are very important in a number of different aspects. But not only that, pKa is used in assessment of various other endpoints. So just in our Percepta platform, for example, we have physicochemical property predictions, such as logD. And now logD is actually a combination of predicted pKa as well as predicted logP. So any inaccuracies in pKa predictions are going to influence the logD predictions. For our ADME predictions, we have something called blood brain barrier penetration, which tells scientists the rate and extent of penetration of a substance across the blood brain barrier. Now, this model also relies on ionization and pKa predictions. So any inaccuracy here will affect our ADME predictions. And then we also have toxicological predictions, for example, hERG inhibition, and hERG is a measure of cardiotoxicity. This is also dramatically influenced by ionization and the pKa predictions. So any inaccuracy there will result in inaccurate toxicological predictions. Moreover, many organizations have created their own in-house models or algorithms for different endpoints that they want to predict that are vital to their business. And a lot of times these endpoint predictions take into account pKa as one of the inputs; whether it’s an experimental or predicted pKa, that can dramatically impact the output of these in-house models. Therefore, precise pKa predictions are vital for scientists, helping them make informed decisions and understand their findings. Now, I’d say inaccurate predictions are like using inaccurate maps when exploring a new city. This can lead to dramatic confusion and a lot of wasted time.
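
The way a pKa error carries into logD can be seen with the standard textbook relation for a monoprotic acid, which assumes only the neutral form partitions into the organic phase. The sketch below is an illustration under that assumption (not part of the recording, and not the Percepta logD model); the logP of 3.0 and the two pKa values are hypothetical.

```python
import math


def log_d_acid(log_p: float, pka: float, ph: float) -> float:
    """logD of a monoprotic acid, assuming only the neutral form partitions."""
    return log_p - math.log10(1.0 + 10.0 ** (ph - pka))


# Hypothetical compound: logP 3.0, with two candidate pKa values one unit apart.
log_p, ph = 3.0, 7.4
for pka in (4.0, 5.0):
    print(f"pKa {pka}: logD at pH {ph} = {log_d_acid(log_p, pka, ph):.2f}")
```

At a pH well above the pKa, where the acid is mostly ionized, a one-unit error in pKa shifts the estimated logD by close to one unit. Any downstream model that takes logD as an input inherits that shift.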

Baljit Bains  13:11

Yes, and nobody wants to do that. Have you got any hints and tips on how to improve the accuracy of these predictions?

Karim Kassam  13:20

I got a few points. So pKa predictions have a model applicability domain. That is, the accuracy of the predictions depends on the similarity of ionization centers to those found in the training set, and the associated set of Hammett equations that describe the ionization. Therefore, it’s very important to have a large and diverse set for training to get accurate pKa predictions. Another way that you can improve pKa predictions is using something called system training. So using our software out of the box, a user might notice that the pKa values are either predicted too high or too low for a series of compounds. Our software allows system training, in which you can put in the experimental value for one or two of the compounds within this series, and the system will automatically adjust to give you better predictions throughout the series. Now a more impactful way to get improved accuracy for pKa predictions is through a partnership with ACD/Labs. We’ve had many of these partnerships over the past few decades. Specifically, we’ve had these partnerships with environmental organizations, chemical companies, and pharmaceutical companies. Most of these organizations have a treasure trove of experimentally measured pKa values. And when partnering with these companies, we’re able to develop an improved algorithm using their compounds. And this results in a win-win situation. This allows ACD/Labs to offer pKa predictions within the software with a broader applicability domain. And the partnering organization gets significantly better predictions for their compounds of interest.
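
The general idea behind correcting a systematic bias with a few measurements can be sketched very simply (an illustration, not part of the recording). The example below is a deliberately naive stand-in, not how ACD/Labs system training works internally, which the speaker does not describe. It shifts every prediction in a series by the mean error seen on the compounds that have experimental values. The function name and all compound data are hypothetical.

```python
from statistics import mean


def offset_corrected(predicted: dict, measured: dict) -> dict:
    """Shift every predicted pKa by the mean error seen on measured compounds."""
    # Mean signed error (measured - predicted) over compounds with data.
    # Raises statistics.StatisticsError if `measured` is empty.
    shift = mean(measured[name] - predicted[name] for name in measured)
    return {name: value + shift for name, value in predicted.items()}


# Hypothetical series where predictions run about 0.5 units too high.
predicted = {"cmpd_1": 8.9, "cmpd_2": 9.3, "cmpd_3": 9.1, "cmpd_4": 8.7}
measured = {"cmpd_1": 8.4, "cmpd_2": 8.8}
print(offset_corrected(predicted, measured))
```

A constant offset only helps when the whole series is biased in the same direction, which is the situation described for a series predicted consistently too high or too low.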

Jesse Harris  15:23

Great, well, that’s been a lot of great information. But maybe a good place to end things off on is to talk about any improvements to note in predictive tools, in recent times. Have there been any changes to Percepta that our listeners should be aware of?

Karim Kassam  15:36

Thank you Jesse, that was a very well-timed question. Actually, ACD pKa was introduced way back in 1997. So that’s about 27 years ago, right near the 30 years that ACD/Labs has been in business. Since that time, we’ve made progressive improvements to our algorithm by adding more and more experimental pKa data, starting off mostly with experimental data that’s found in the literature. But then, we also started working with these partnerships. So we have access to pKa data that is not available in the public domain. This allowed us to grow our database, and associated Hammett-type equations, which represent the ionization of new ionic centers. So for the latest version of our software, which is version 2023, there have been some very dramatic improvements. So to highlight, we now have over 40,000 experimental pKa values. This is the largest available compilation of curated experimental pKa values. To give you an example of how good our pKa predictions are now, for a test set of 379 pharmaceutically relevant compounds, the pKa for 99% of these compounds was predicted within one log unit. Compare that to our algorithm a year ago, where it was 54%. So that’s quite a dramatic improvement in predictions within one log unit. We also have an R-squared value for our new algorithm of 0.98. Compare that to last year’s algorithm, which was 0.52. So as you can see, there’s been a dramatic improvement in accuracy of prediction within just the last year alone. Finally, we’ve also made changes to the algorithm, which have resulted in a five- to ten-fold increase in the speed of calculations for pKa. This is really important because more and more organizations are starting to churn through predictions with larger and larger datasets to feed into other algorithms, such as AI and ML, to make better and faster decisions.
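
The two accuracy figures quoted here, the share of predictions within one log unit and R-squared, can be computed from paired measured and predicted values. The sketch below is an illustration (not part of the recording). It assumes R-squared means the coefficient of determination, which the speaker does not specify, and it uses made-up numbers, not the 379-compound test set.

```python
def fraction_within(measured, predicted, tolerance=1.0):
    """Share of predictions whose absolute error is at most `tolerance`."""
    hits = sum(abs(m - p) <= tolerance for m, p in zip(measured, predicted))
    return hits / len(measured)


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Made-up pKa values for illustration only.
measured = [3.1, 4.8, 7.2, 9.4, 10.1]
predicted = [3.4, 4.5, 8.6, 9.1, 10.0]
print(f"within 1 log unit: {fraction_within(measured, predicted):.0%}")
print(f"R-squared: {r_squared(measured, predicted):.3f}")
```

The two metrics answer different questions. The first counts how many compounds land close enough to be useful, while the second is sensitive to a few large misses.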

Jesse Harris  18:00

Those are some substantial improvements. That’s really exciting to hear all that. So, yeah, hopefully people can dig into that and learn more about it, and you can visit our website, of course, to learn more there. But thank you so much for your time today, Karim, and for telling us about all things pKa and ionization and such. It’s been really fascinating.

Baljit Bains  18:21

Thank you.

Karim Kassam  18:23

Thank you Bally. Thank you, Jesse.

Jesse Harris  18:27

And that wraps up today’s episode. Thank you again, Karim, for sharing your insights into the importance of ionization and pKa, and for those helpful tips to improve accuracy in our predictions.

Baljit Bains  18:38

If you enjoyed this episode, we’d love for you to recommend the show to a colleague or share it on social media. And if you want to learn more about this subject, we have some articles on our website. We’ve included the links in the show notes.

Jesse Harris  18:50

And make sure you don’t forget to subscribe to The Analytical Wavelength on your favorite podcast platform.

Baljit Bains  18:56

See you next time.


The Analytical Wavelength is brought to you by ACD/Labs. We create software to help scientists make the most of their analytical data by predicting molecular properties and by organizing and analyzing their experimental results. To learn more, please visit us at

Enjoying the show?

Subscribe to the podcast using your favourite service.