Why AI and Drug Discovery are no match made in heaven

Artificial Intelligence (AI) has been described as the ‘fourth industrial revolution’ that will bring us self-driving cars, computers that understand human language, and automated drug discovery.

I believe the part about self-driving cars: legal and societal barriers will likely matter more than scientific and technical ones in due course. And while the quality of machine translation is still debated, Google Translate is already good enough to be useful for me, so I’d entirely buy into that as well.

The problem is: drug discovery is a very different beast.

So what’s the problem with drug discovery, then? (Note that the following is an extension of the article and subsequent discussion by Al Dossetter on LinkedIn recently.)

Let’s briefly outline the ‘drug discovery process’ first (which is only a crude generalization anyway, but it may be useful here as an overview):

Phase | Desired outcome
Hit Discovery | Compound active on target/in cellular assay
Lead Optimization | Suitable in vitro properties (selectivity, solubility, …)
Animal Studies | Efficacy in animal model, tolerable toxicity
Clinical Studies | Efficacy in man, tolerable toxicity, better than standard of care
Market | Commercially viable (market size, market need, pricing)

So what we really care about when doing drug discovery are the in vivo results: we don’t want to treat a protein with a drug, or a cell line, or a rat; we want to achieve efficacy, with tolerable toxicity, in humans.

So how can AI support the different phases of drug discovery?

Conceptually, it can be useful in any of the above steps, improving our ability to discover hits, optimize in vitro properties, and so on, and thereby increasing the likelihood of in vivo efficacy at tolerable toxicity.

BUT – AI needs data, and this is the weak point when trying to apply ‘AI’ to the drug discovery field. This is not object or speech recognition, where we have a huge amount of both labelled and unlabelled data.

Let’s therefore examine three criteria for data in the drug discovery process, which we discussed in a previous post:

  • Amount of data available,
  • Reliable labelling of data, and
  • Problem relevance (here for the in vivo situation).

Let’s see which data we have available related to the different phases above – leaving out market considerations here, and sticking to the scientific goal of finding a bioactive entity that is able to cure disease:

Phase | Key data available | Amount of data, consistent data labelling, problem (in vivo) relevance
Hit discovery | Bioactivity, solubility, … | +++o
Lead Optimization | Solubility, permeability, off-target activities, simple DMPK | ++o
Animal Studies | Efficacy and toxicity data in animals, animal PK | o+
Clinical Studies | Human endpoint efficacy data, human safety, human PK | ++

So what we see is: In early phases we have more data, which is more clearly labelled – but it is less relevant to in vivo outcomes, such as efficacy. In late phases we have data that is more relevant to in vivo outcomes, but we have very little data available in general.

To support the above statements with some facts – in databases such as ChEMBL, ExCAPE or PubChem we have millions of bioactivity datapoints, linking compound structures to protein targets. But activity against a target does not make a drug, far from it. So we have lots of data that is insufficient to understand and anticipate the in vivo situation.

On the other hand, in databases such as ToxRefDB, DrugMatrix or Open TG-GATEs we have animal toxicity data for on the order of a few thousand compounds, in a chemical space that comprises 10^33 (or so) compounds in total. So we likely have more relevant data at hand, but for very few compounds, since generating such data is costly (e.g., generating the DrugMatrix data cost on the order of $100m).
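To get a feeling for how sparse this coverage is, here is a back-of-the-envelope sketch in Python. The chemical space size is the rough estimate quoted above; the figure of 5,000 compounds is only my assumed stand-in for ‘a low number of thousands’, not a count taken from any of the databases, and the function names are made up for illustration.

```python
import math

# Rough size of chemical space, as quoted in the text ("or so").
CHEMICAL_SPACE_SIZE = 10**33

# ASSUMPTION: illustrative stand-in for "a low number of thousands" of
# compounds with animal toxicity data; not an actual database count.
ASSUMED_COMPOUNDS_WITH_TOX_DATA = 5_000


def coverage_fraction(n_compounds: int, space_size: int = CHEMICAL_SPACE_SIZE) -> float:
    """Fraction of the chemical space covered by n_compounds."""
    return n_compounds / space_size


def orders_of_magnitude_gap(n_compounds: int, space_size: int = CHEMICAL_SPACE_SIZE) -> float:
    """How many powers of ten separate the covered compounds from the whole space."""
    return math.log10(space_size) - math.log10(n_compounds)


if __name__ == "__main__":
    fraction = coverage_fraction(ASSUMED_COMPOUNDS_WITH_TOX_DATA)
    gap = orders_of_magnitude_gap(ASSUMED_COMPOUNDS_WITH_TOX_DATA)
    print(f"Covered fraction of chemical space: {fraction:.1e}")
    print(f"Gap in orders of magnitude: {gap:.1f}")
```

Whatever the exact compound count, the gap stays at roughly thirty orders of magnitude, so no realistic amount of additional animal testing changes the picture.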

What is meant by ‘consistent data labelling’?

Imagine a consumer clicks on an Internet link and buys a product – here you have clear data points, unambiguously connecting the dots between clicking on a link and buying a product. However, whether a drug shows efficacy in a disease (or toxic side effects) depends, at the very least, on dose, route of delivery, and the individual genetic makeup of the organism and the disease (i.e., the endotype), among many other variables. So there is no clear label one can assign, such as ‘drug X treats disease Y’ – yes, sometimes, but sometimes not, depending on how, and in which context, the drug is applied to a particular organism. Hence labels in the biological domain are generally much more ambiguous and context-dependent than in other domains.
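A small Python sketch may make the labelling problem more tangible. All records below are invented purely for illustration (the drug, disease, doses, routes and endotypes are hypothetical), and the field names are my own choice rather than any database schema. The point is only that collapsing context-dependent observations into a single ‘drug X treats disease Y’ label yields contradictions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One outcome of applying a drug in a specific context (hypothetical fields)."""
    drug: str
    disease: str
    dose_mg_per_kg: float
    route: str
    endotype: str
    efficacious: bool


def outcomes_per_pair(observations):
    """Collect the distinct outcomes seen for each (drug, disease) pair."""
    outcomes = defaultdict(set)
    for obs in observations:
        outcomes[(obs.drug, obs.disease)].add(obs.efficacious)
    return dict(outcomes)


def ambiguous_pairs(observations):
    """Pairs for which both 'efficacious' and 'not efficacious' were observed."""
    return [pair for pair, seen in outcomes_per_pair(observations).items() if len(seen) > 1]


# INVENTED example data, for illustration only.
observations = [
    Observation("drug X", "disease Y", 10.0, "oral", "endotype A", True),
    Observation("drug X", "disease Y", 1.0, "oral", "endotype A", False),
    Observation("drug X", "disease Y", 10.0, "oral", "endotype B", False),
    Observation("drug Z", "disease Y", 5.0, "intravenous", "endotype A", True),
]

if __name__ == "__main__":
    for drug, disease in ambiguous_pairs(observations):
        print(f"No single label for '{drug} treats {disease}': outcome depends on context")
```

A click-and-buy dataset would essentially never end up in `ambiguous_pairs`; efficacy and toxicity data end up there routinely, unless dose, route and endotype are kept as part of the label.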

(In many cases we also simply don’t know which early-stage data are predictive of in vivo effects – an article published just last month concluded, for example, that “Chemical in vitro bioactivity profiles are not informative about the long-term in vivo endocrine mediated toxicity”.)

Of course there have been notable successes around ‘AI in drug discovery’, for example in the areas of synthesis prediction, automated chemistry, bioactivity modelling, or using image recognition to analyze phenotypic screening data. These are all important areas to work on – however, they are also a good number of steps away from the more difficult biological and in vivo stages, where efficacy and toxicity in living organisms decide the fate of drugs waiting to be discovered. Hence, there is still a gap to be bridged, precisely in the area that needs progress most: in vivo efficacy and toxicity.

So AI and ‘drug discovery’ may not be a match made in heaven – but that’s not necessarily a problem, since we live on earth anyway. There is obviously ample data around in the drug discovery process, the amount available will increase, and we need to analyze it; that much is clear. Quite possibly, from what I can see, AI will be used more for deselection (rather than positive selection), to increase the odds of success. But we certainly need to learn which models matter for the in vivo situation, instead of just ‘plugging data into the machine’, no matter its relevance for the human setting, and hoping to get the right answer out.

The question is hence where we have data that is, at the same time, sufficient in amount and sufficiently relevant to predict those properties of potential therapies that matter in vivo, namely efficacy- and toxicity-related endpoints. We will explore concrete examples in future posts.

/Andreas