Abstract
We introduce the QuanSA method for inducing physically meaningful field-based models of ligand binding pockets based on structure-activity data alone. The method is closely related to the QMOD approach, substituting a learned scoring field for a pocket constructed of molecular fragments.
The problem of mutual ligand alignment is addressed in a general way, and optimal model parameters and ligand poses are identified through multiple-instance machine learning. We provide algorithmic details along with performance results on sixteen structure-activity data sets covering many pharmaceutically relevant targets. In particular, we show that models initially induced from small data sets can extrapolate to identify potent new ligands with novel underlying scaffolds, and do so with very high specificity.
Moreover, we demonstrate that combining predictions from QuanSA models with those from physics-based and molecular simulation approaches is synergistic. QuanSA predictions yield binding affinities, explicit estimates of ligand strain, associated ligand pose families, and estimates of structural novelty and confidence. The method is applicable for fine-grained lead optimization as well as potent new lead identification.
Scaffold replacement, as part of an optimisation process that requires maintenance of potency, desirable biodistribution, metabolic stability, and considerations of synthesis at very large scale, is a complex challenge.
We present results on the extent to which physics-based simulation (exemplified by FEP+) and focused machine learning (exemplified by QuanSA) are complementary for ligand affinity prediction.
This article discusses logical fallacies in the context of off-target predictive modelling.