A practical guide to implementing AI
In this ebook, you’ll discover the key considerations every leader needs to weigh in order to implement AI successfully in their drug discovery pipelines.
A key difficulty in finding new drugs is the sheer size of chemical space. The number of potential drug-like molecules is incomprehensibly vast: an often-quoted figure is that there are ~10⁶⁰ molecules obeying Lipinski’s rule of five (Reymond, 2015). This figure is notable for exceeding the total number of atoms in the entire solar system by a factor of several thousand. More conservative estimates are available. However, even at the bottom end, these are still far beyond scales that are explorable, or even conceivable, for human beings.
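To put that number in perspective, here is a rough back-of-envelope calculation. The rate of one million molecules per second is purely an assumption for illustration (it is not a figure from any real screening platform), and the age of the universe is rounded to about 13.8 billion years.

```python
# Back-of-envelope: how long would it take to enumerate ~10^60 molecules?
CHEMICAL_SPACE_SIZE = 10**60          # often-quoted estimate cited in the text
MOLECULES_PER_SECOND = 10**6          # ASSUMPTION: hypothetical evaluation rate
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10       # approximate, rounded


def years_to_enumerate(n_molecules: int, rate_per_second: float) -> float:
    """Years needed to visit every molecule once at a constant rate."""
    return n_molecules / rate_per_second / SECONDS_PER_YEAR


years = years_to_enumerate(CHEMICAL_SPACE_SIZE, MOLECULES_PER_SECOND)
universe_ages = years / AGE_OF_UNIVERSE_YEARS
print(f"{years:.1e} years, or about {universe_ages:.1e} times the age of the universe")
```

Even at this generous assumed rate, the answer is on the order of 10⁴⁶ years, which is why exhaustive enumeration is not an option and some form of guided search is needed.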
The key selling point of generative chemistry is that it offers a way to automate exploration of chemical space. It can mine the best compounds for your specific problem and present you with an array of optimised solutions that you never would have considered otherwise. In practice, of course, things are rarely so simple. But generative chemistry still offers a valuable tool in the chemist’s arsenal.
While generative AI methods steal all the headlines at the moment, it is worth remembering that generative chemistry predates the current AI boom (a good indicator that the field has value beyond the hype). Traditional generative methods typically rely on combining simple building blocks from libraries of known scaffolds, fragments, or reagents (Sadybekov, 2022). These methods offer fast, powerful, and simple ways to explore relevant chemical space. They also avoid some of the pitfalls that can occur with AI techniques. However, these methods tend to be less flexible than deep-learning-based methods, because they are limited to combining a fixed set of building blocks. In contrast, an AI model can learn patterns from a much larger region of chemical space and propose structures that no fixed library contains.
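As a minimal sketch of the building-block idea, the snippet below decorates a couple of scaffolds with every ordered pair of fragments. Everything here is a toy assumption: the scaffold and fragment strings are small SMILES-like examples chosen for illustration, the `{r1}`/`{r2}` placeholders are a hypothetical way of marking attachment points, and no valence, duplicate, or synthesizability checks are performed. Real tools work on proper molecular representations and reaction rules.

```python
from itertools import product

# Toy building blocks as SMILES-like strings (illustrative assumptions only).
# "{r1}" and "{r2}" are hypothetical attachment-point placeholders.
SCAFFOLDS = [
    "c1cc({r1})ccc1{r2}",   # para-disubstituted benzene
    "C1CN({r1})CCN1{r2}",   # N,N'-disubstituted piperazine
]
FRAGMENTS = ["C", "CC", "C(C)C", "C(=O)C"]


def enumerate_library(scaffolds, fragments):
    """Yield each scaffold decorated with every ordered pair of fragments."""
    for scaffold, r1, r2 in product(scaffolds, fragments, fragments):
        yield scaffold.format(r1=r1, r2=r2)


library = list(enumerate_library(SCAFFOLDS, FRAGMENTS))

# The library size is fixed by the inputs: scaffolds x fragments x fragments.
print(len(library), "strings from", len(SCAFFOLDS), "scaffolds and",
      len(FRAGMENTS), "fragments")
for smiles in library[:3]:
    print(smiles)
```

The sketch shows both the appeal and the limit of the approach: the output grows multiplicatively with the number of blocks, but nothing outside the supplied scaffolds and fragments can ever appear. Because both toy scaffolds are symmetric, swapping the two fragments gives a different string for the same molecule, so a real pipeline would also need to canonicalise and deduplicate.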
AI approaches rely on neural networks to generate new compounds meeting some set of criteria. These networks fall broadly into two camps: diffusion models and sequence models.
Exactly which type of model architecture is going to give better performance will depend on the specific problem at hand, and is arguably less important than how the model is deployed. Data scientists naturally focus on metrics. However, chemists mostly don’t care if a model is 96% accurate or 97% accurate. They care about seeing interesting, relevant compounds in a way that integrates with their existing workflows cleanly.
A key difficulty with this is determining what constitutes an interesting, relevant compound in the first place. Models can be trained to optimise for a given set of parameters or constraints. The output compounds can then be filtered for further constraints. However, ultimately, it is impossible to account for all possible variables. This necessitates some level of expert human oversight, either to filter down the generated compounds after generation, or to work alongside the AI guiding the process.
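The post-generation filtering step can be sketched as follows, using Lipinski's rule of five as the example constraint set (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, and at most 10 hydrogen-bond acceptors). The `Candidate` class, the `split_for_review` helper, the one-violation tolerance as a default, and all descriptor values are assumptions made up for this illustration; in practice the descriptors would come from a cheminformatics toolkit, and the constraints would be project-specific.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A generated compound with precomputed descriptors (hypothetical type)."""
    name: str
    mol_weight: float        # Da
    logp: float
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(c: Candidate) -> int:
    """Count how many of Lipinski's four criteria the candidate breaks."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_bond_donors > 5,
        c.h_bond_acceptors > 10,
    ])


def split_for_review(candidates, max_violations=1):
    """Separate candidates into a shortlist and a rejected list."""
    shortlist, rejected = [], []
    for c in candidates:
        if rule_of_five_violations(c) <= max_violations:
            shortlist.append(c)
        else:
            rejected.append(c)
    return shortlist, rejected


# Made-up descriptor values, purely for illustration.
generated = [
    Candidate("cand-A", 342.4, 2.1, 2, 5),
    Candidate("cand-B", 612.7, 6.3, 3, 9),
    Candidate("cand-C", 488.5, 5.4, 1, 7),
]

shortlist, rejected = split_for_review(generated)
print("For expert review:", [c.name for c in shortlist])
print("Filtered out:", [c.name for c in rejected])
```

A filter like this only encodes the constraints someone thought to write down. The shortlist it produces is a starting point for a chemist's judgement, not a replacement for it.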
An analogy I like here is chess. For a long time, the combination of an AI and an expert human player would outperform either individually. This is because the human could provide the large-scale strategy and big-picture thinking that the machine lacked, and the AI could provide the brute-force move evaluation that is impossible for human brains. This is no longer the case for chess, as computers have got ever more powerful. Still, drug discovery is vastly more complex and ever-shifting. Therefore, it seems likely to me that human oversight in some form will remain crucial to the process indefinitely.
AI models are not magic, and while they can be extremely powerful, they are also fallible, often in ways that a human would find laughably foolish (see Chevrolet’s AI assistant agreeing to sell a car for a dollar, or Google Gemini recommending the addition of glue to pizza toppings). We should treat generative chemistry AIs as another tool to empower expert chemists, not as a substitute for them, and build our software around that goal.
Can generative chemistry AI instantly solve all your problems and replace half your staff? Despite the breathless claims of LinkedIn influencers, no, it can’t. What it can do is help chemists explore the vastness of chemical space, accelerating the optimisation and development of lead compounds and uncovering exciting new drugs that they might otherwise never have seen.
If you want to see what my colleagues and I have been up to, and understand how Optibrium’s generative chemistry methods within Nova and Inspyra work, you can watch our webinar on-demand: ‘An augmented approach to generative chemistry’.
Michael is a Principal AI Scientist at Optibrium, applying advanced AI techniques to accelerate drug discovery and improve decision-making. With a Ph.D. in Astronomy and Astrophysics from the University of Cambridge, he brings a data-driven approach to solving complex scientific challenges. Michael is also a thought leader, contributing to discussions on the impact of AI in pharmaceutical research.