Physical model induction with QuanSA™: Affinity prediction that is synergistic with simulation-based methods

Authors

Ajay N. Jain
Ann E. Cleves

Introduction to QuanSA: Quantitative Surface-field Analysis

Affinity prediction challenges:

The molecules we want to predict lie in the future. They do not come from the same statistical population as the molecules and activity data from which models are induced. This violates the central assumption of machine learning: that predictions are made on examples drawn from the same population as the training data.

QuanSA uses a surface representation:

To address these challenges, it is necessary to use physics-driven domain knowledge in the model induction process. Actual molecular surfaces and their properties are not well represented by the atom/bond depictions used to symbolise molecules. Surfaces can be congruent even when the atom/bond depictions suggest otherwise.

The QuanSA method

To define a ‘pocket field’, an initial alignment of all training molecules is constructed and function parameters at the observer points are learned based on activity data¹.

The QuanSA pocket field is iteratively refined using multiple instance machine learning; considering multiple poses for each compound means that no assumptions are made about the ‘right’ pose.
Model building is tractable: building or refining a model takes just hours.
QuanSA models require no known target; models can be informed by protein structure or applied on purely phenotypic data.
A new molecule can typically be run in seconds; thus, very large-scale applications are possible.
Predictions are supported with a score, a pose and quality metrics.
Structurally novel molecules are often well within the domain of applicability, accurately supporting scaffold-hopping.
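
The multiple-pose idea in the list above can be illustrated with a toy sketch. It assumes that a compound's prediction is the best score over its candidate poses and that the reported pose is the one achieving that score. The names `predict_compound` and `toy_score`, the feature vectors and the weights are all hypothetical stand-ins, not the QuanSA pocket field or its API.

```python
from typing import Callable, Sequence, Tuple

Pose = Sequence[float]  # hypothetical: a pose summarised as a small feature vector


def predict_compound(poses: Sequence[Pose],
                     score_pose: Callable[[Pose], float]) -> Tuple[float, int]:
    """Score every candidate pose and return the best score with its pose index."""
    scores = [score_pose(p) for p in poses]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return scores[best_index], best_index


def toy_score(pose: Pose) -> float:
    # Stand-in for a learned model: a fixed weighted sum of the pose features.
    weights = (0.5, 1.0, -0.25)
    return sum(w * x for w, x in zip(weights, pose))


# Three made-up candidate poses for one compound.
poses = [(1.0, 2.0, 0.0), (2.0, 3.0, 4.0), (0.0, 1.0, 8.0)]
score, index = predict_compound(poses, toy_score)
print(f"best pose index: {index}, score: {score:.2f}")
```

Because no pose is fixed in advance, the choice of pose is made by the model itself, which is the sense in which no assumption about the 'right' pose is needed.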

QuanSA benchmarking vs FEP+

Schindler 2020 and Abel 2015 FEP+ comparison

A critical application is to accurately predict affinities for future molecules. QuanSA and FEP+ models were built and evaluated² for sixteen targets from two published datasets using temporal segregation. Training set compounds were selected based upon similarity to the FEP+ reference ligand, forcing the QuanSA models to extrapolate. The study compared the accuracy across the targets, as summarised in the plots below.
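
As a rough illustration of temporal segregation, the sketch below splits compounds by date so that every training compound predates every test compound. The record fields (`id`, `registered`, `activity`), the cutoff date and all values are hypothetical and are not taken from the published datasets.

```python
from datetime import date


def temporal_split(records, cutoff):
    """Train on compounds dated before the cutoff; test on those dated on or after it."""
    train = [r for r in records if r["registered"] < cutoff]
    test = [r for r in records if r["registered"] >= cutoff]
    return train, test


# Hypothetical records: identifiers, dates and activities are made up.
records = [
    {"id": "cpd-001", "registered": date(2019, 1, 15), "activity": 6.2},
    {"id": "cpd-002", "registered": date(2019, 6, 3), "activity": 7.1},
    {"id": "cpd-003", "registered": date(2020, 2, 20), "activity": 7.8},
    {"id": "cpd-004", "registered": date(2020, 9, 9), "activity": 8.4},
]

train, test = temporal_split(records, cutoff=date(2020, 1, 1))
print([r["id"] for r in train], [r["id"] for r in test])
```

Unlike a random split, this arrangement mimics prospective use: the model never sees a compound made after the ones it is asked to predict.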

QuanSA and FEP+ have similar accuracy.
Both methods are highly synergistic; a hybrid (mean) score increases accuracy compared to either method.
QuanSA is ~1000x faster than FEP+, alleviating screening bottlenecks.
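
The hybrid (mean) score mentioned above can be sketched as a simple average of the two predictions for each compound, assuming both are expressed on the same scale. All numbers below are invented for illustration and are chosen so that the two methods err in opposite directions; averaging helps only to the extent that the errors are not strongly correlated.

```python
def hybrid_score(quansa_pred, fep_pred):
    """Average two prediction lists compound by compound."""
    return [(q + f) / 2.0 for q, f in zip(quansa_pred, fep_pred)]


def mean_absolute_error(pred, actual):
    """Mean of the absolute differences between predicted and actual values."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)


# Made-up predicted and measured activities for three compounds.
quansa = [7.0, 6.0, 8.4]
fep = [6.2, 7.1, 7.8]
actual = [6.5, 6.5, 8.0]

hybrid = hybrid_score(quansa, fep)
mae_quansa = mean_absolute_error(quansa, actual)
mae_fep = mean_absolute_error(fep, actual)
mae_hybrid = mean_absolute_error(hybrid, actual)
print(f"QuanSA {mae_quansa:.2f}, FEP+ {mae_fep:.2f}, hybrid {mae_hybrid:.2f}")
```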

QuanSA project application

Active learning to identify a mimic of a macrocyclic natural product

Scaffold replacement as part of an optimisation process is a complex challenge. Using a data set of ~1,100 time-stamped compounds, we applied an iterative procedure to refine a QuanSA model, starting from a macrocyclic natural product lead (UK-2A), and rapidly identified a non-macrocyclic, fully synthetic, broad-spectrum crop antifungal (FPX)³.

Iterative model refinement efficiently guided candidate selection to the desired product.
FPX was identified in round 5 as one of the most active predicted molecules.
The model effectively learned the non-macrocyclic scaffold.
Only 100 molecules were selected vs over 1,000 in the project, representing a 10x improvement in efficiency.
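
The general shape of such an iterative procedure can be sketched as a greedy active-learning loop: rank the untested compounds with the current model, measure the top few, and add the results to the training data before the next round. Everything here is a toy assumption: compounds are reduced to a single number, `measure` is an invented activity function, and a 1-nearest-neighbour lookup stands in for a refined model. It is not the selection procedure used in the study.

```python
def measure(x):
    # Hypothetical assay: activity peaks at x = 70 in this toy landscape.
    return -abs(x - 70)


def predict(x, tested):
    """Toy model: copy the measured activity of the closest tested compound."""
    nearest = min(tested, key=lambda known: abs(known - x))
    return tested[nearest]


def active_learning(pool, seed, batch_size, n_rounds):
    """Each round, measure the untested compounds with the highest predictions."""
    tested = {x: measure(x) for x in seed}
    for _ in range(n_rounds):
        untested = [x for x in pool if x not in tested]
        if not untested:
            break
        ranked = sorted(untested, key=lambda x: predict(x, tested), reverse=True)
        for x in ranked[:batch_size]:
            tested[x] = measure(x)  # new data implicitly refines the toy model
    return tested


pool = list(range(100))
tested = active_learning(pool, seed=[0, 50, 99], batch_size=5, n_rounds=5)
print(f"measured {len(tested)} of {len(pool)} compounds")
```

A purely greedy rule like this one can explore poorly; practical schemes often also weigh novelty or prediction confidence when choosing each batch.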

Conclusions

QuanSA builds physically realistic causal models based on ligand structures alone.
QuanSA and FEP+ are similar in accuracy and synergistic, but QuanSA is ~1000x faster and has a broader domain of applicability.
Active learning with QuanSA enables more efficient lead-to-candidate design – 10x in this case study.

References

1. Cleves, A.E. and Jain, A.N. (2018). JCAMD, 32, 731-757. doi.org/10.1007/s10822-018-0126-x
2. Cleves, A.E., Johnson, S.R., and Jain, A.N. (2021). JCIM, 61, 5948-5966. doi.org/10.1021/acs.jcim.1c01382
3. Cleves, A.E., Jain, A.N., Demeter, D.A., et al. (2024). JCAMD, 38, 19. doi.org/10.1007/s10822-024-00555-3