This poster was presented by Nick Foster at the 11th International ISSX Meeting, Busan, Korea in June 2016
In this section we post selections of work that the Optibrium team and others have presented or published.
We don't have an automatic way for you to upload your own articles to this section but if you have any publications or presentations you think might be of interest to other users (it doesn't have to be about Optibrium's products) then please get in touch and we'll help get it posted here for you.
Peter Hunt gave this presentation at the ACS Fall 2016 National Meeting & Exposition held in Philadelphia, USA.
Abstract
We describe the development of quantitative structure-activity relationship (QSAR) models based on activity data from the ChEMBL database, to predict the interaction of compounds with protein targets associated with adverse outcome pathways and toxicities. However, systemic exposure to a compound will also result in the formation of metabolites, which themselves may be the cause of a toxic response. Therefore, we have developed an integrated system linking models that predict the enzymes responsible for metabolism of a parent compound and the resulting metabolites with QSAR models of target interactions. The combination of these models can predict potential toxicities resulting directly or indirectly from exposure to the parent compound. The initial implementation is focused on metabolism by Cytochrome P450 enzymes, but forms a framework that may be extended to other metabolic pathways and additional QSAR models of toxicity.
This recently submitted preprint describes the underlying methods, validation and example applications of the most recent models of Cytochrome P450 metabolism in StarDrop's P450 module.
Ed Champness gave this presentation at the ACS Spring National Meeting & Exposition held in San Diego, USA on 16th April 2016.
Abstract
A quantitative structure-activity relationship (QSAR) model is a mathematical function of molecular descriptors. The parameters of this function are found by maximizing the fit of this function to the observed activities of a training set of compounds, using a statistical or machine learning method. Following validation of the resulting model, most methods for estimating the uncertainty in a prediction focus on measures of the ‘domain of applicability’ or ‘distance to model’ to identify new compounds that differ significantly from the training set and hence for which the confidence in a prediction will be low.
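The two ideas in this abstract, fitting a function of descriptors to training activities and flagging compounds far from the training set, can be illustrated with a minimal sketch. It assumes a single descriptor, an ordinary least-squares fit and a nearest-neighbour Euclidean distance as the 'distance to model'; the function names and data are hypothetical and are not taken from the presentation.

```python
from math import sqrt

def fit_linear_qsar(x, y):
    """Least-squares fit of activity = slope * descriptor + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def distance_to_model(query, training):
    """Euclidean distance from a query descriptor vector to the nearest training compound."""
    return min(
        sqrt(sum((q - t) ** 2 for q, t in zip(query, row)))
        for row in training
    )

# Hypothetical training set: one descriptor value and one observed activity per compound.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.1, 3.9, 6.1, 7.9]
slope, intercept = fit_linear_qsar(train_x, train_y)

# A compound inside the training range and one far outside it.
training_vectors = [[x] for x in train_x]
near = distance_to_model([2.5], training_vectors)
far = distance_to_model([10.0], training_vectors)
print(f"slope={slope:.2f} intercept={intercept:.2f} near={near} far={far}")
```

A large distance, such as the second query here, would signal a compound outside the domain of applicability, for which the prediction deserves less confidence.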
Matt Segall gave this presentation at the ACS Fall 2015 National Meeting & Exposition held in Boston, USA on 16th August 2015.
Abstract
Predicting the interaction of compounds with targets associated with toxicity can provide inputs to hierarchical models integrating systems toxicology, physiologically-based pharmacokinetic (PBPK) models and organ simulations to predict compound interactions with adverse outcome pathways (AOP).
For example, MRP4 (Multi-drug resistance-associated protein 4 or ABCC4) mediates the transport of signalling molecules (such as cAMP and cGMP), prostaglandins and leukotrienes (PGE1, PGE2, LTB4) and can be inhibited by drugs such as Celecoxib, Probenecid, MK-571 and Sulfinpyrazone. BSEP (Bile salt export pump or ABCB11) is localised in the cholesterol-rich canalicular membranes of hepatocytes and its function is to eliminate unconjugated/conjugated steroidal acids from the hepatocyte into the bile. The loss of this transporter function is seen in the genetic disease progressive familial intrahepatic cholestasis type 2. Inhibition of both of these transporters, MRP4 and BSEP, has been identified as a risk factor in the development of cholestatic DILI (drug-induced liver injury).
We have used the publicly available data from ChEMBL to build categorical and continuous quantitative structure-activity relationship (QSAR) models in order to determine the molecular properties which contribute to activity at these transporters and compare these features with known hepatotoxic compounds. We have compared the results from these models with predictions from the Derek Nexus approach for knowledge-based prediction of hepatotoxicity. The resulting QSAR models, along with models of other toxicity-related targets, will form part of a hierarchy of molecular-, systems- and physiologically-based models to identify compounds with an increased risk of toxicity as part of the HeCaToS project.
You can download this presentation as a PDF.
 Russel, F.G. et al. Trends Pharmacol. Sci. 29(4) pp. 200-7 (2008)
 Kis, E. et al. Toxicol. in Vitro. 26(8), pp. 1294-9 (2012)
 Greene, A. et al. SAR and QSAR in Environmental Research 10(2-3), pp. 299-314 (1999)
Read the presentation "Quantum mechanical models of P450 metabolism to guide optimization of metabolic stability" from the Webinar on June 17, 2015. In this webinar, Jon Tyzack described the methodology underlying the StarDrop P450 models and presented two case studies to demonstrate their applicability to drug discovery projects.
The computer-based prediction of metabolites based upon structure has a wide number of applications – from a chemist’s desire to tune the metabolic profile of a lead, or a biologist’s requirement to predict likely toxic metabolites, to an analyst’s need to assign a peak in a bio-sample. Expert systems can provide transparent predictions with commentary and support based upon human knowledge whereas machine learning approaches are able to absorb new data more quickly but frequently show poor interpretability through the choice of descriptors and/or the model building methodology. However, these approaches are not mutually exclusive and the combination of both offers the potential for a new range of powerful predictive systems.
This talk will describe the science and results behind our work to apply machine learning approaches in order to enhance predictions made from the extensive biotransformation rule-base found within Meteor Nexus.
You can download this presentation as a PDF.
One of the major concerns in modern drug discovery and development is chemical and physical stability of small molecule pharmaceuticals. Chemical stability is crucial for compounds at all stages of pharmaceutical R&D, from early drug discovery to formulation of liquid or solid dosage forms. Physical stability is typically related to stability of the pharmaceutical solid form.
QSPR models of oxidative chemical stability were built based on a large data set of electrochemical measurements. In addition, a quantum chemical approach was proposed for oxidative chemical stability ranking of small organic molecules. Examples of the models' application to pharmaceutical compounds will be discussed.
A typical physical instability issue of solid pharmaceuticals is related to hydrate formation during formulation or product shelf life. Transformation from an anhydrate to a hydrate solid form can have a significant impact on product performance and may also lead to chemical instability. A model describing the propensity of an API solid form for hydrate formation will be presented. In addition, rational coformer selection to enhance the hydration stability of a co-crystalline form of an API at high relative humidity will be discussed.
You can download this presentation as a PDF.
Matt gave this presentation at "Drug Discovery USA 2015 - Advances in Drug Discovery and Design"
In this presentation we will describe recent developments to a method for predicting Cytochrome P450 metabolism that combines quantum mechanical (QM) simulations to estimate the reactivity of potential sites of metabolism on a compound with a ligand-based approach to account for the effects of orientation and steric constraints due to the binding pockets of different P450 isoforms. The resulting models achieve accuracies of 85-90% on independent test sets across multiple P450 isoforms. While valuable, predicting the relative proportion of metabolite formation at different sites on a compound is only a partial solution to designing more stable compounds. The advantage of a QM approach is that it provides a quantitative estimate of the reactivity of each site, from which additional information can be derived regarding the vulnerability of each site to metabolism in absolute terms. One such measurement is the site lability, which is a measure of the efficiency of the product formation step and an important factor influencing the rate of metabolism. We will illustrate how this provides valuable guidance to redesign compounds and overcome issues due to rapid P450 metabolism, using practical examples from lead optimisation projects.
Jon Tyzack presented this poster at the joint ISSX/JSSX North America meeting in October 2014.
Optibrium™, as part of the European HeCaToS project, has developed enhancements to its P450 module within StarDrop™. These include the modelling of epoxidation pathways and the capability to model the xenobiotic metabolism of an additional 4 P450 isoforms: 1A2, 2C19, 2E1 and 2C8.
The goal of the HeCaToS project is to develop integrative approaches towards highly predictive human safety assessment. The prediction of xenobiotic metabolism is an important step in this process, giving the ability to identify potential toxic metabolites. The new capability to model the formation of reactive epoxide metabolites is a vital part of this process and, coupled with our new isoform-specific models, it enables a more complete picture of xenobiotic metabolism to be developed.
More than half a century has passed since Drs. Hansch and Fujita proposed a general approach to the formulation of QSAR in 1961. Their approach (Hansch-Fujita analysis) has provided a new perspective for chemical-biological interactions as well as a number of successes in drug discovery. Now it is time to develop a new and promising approach based on their QSAR with the aid of modern, powerful molecular calculations.
We have proposed a novel QSAR procedure called Linear Expression by Representative Energy terms (LERE)-QSAR, which involves molecular calculations such as ab initio fragment molecular orbital (FMO) and QM/MM (ONIOM) calculations. The first assumption made in formulating the LERE-QSAR equation is that the free-energy terms comprising the overall free-energy change (ΔGobs) associated with complex formation are all additive (ΔGobs = ΔGbind + ΔGsol + ΔGothers). ΔGbind and ΔGsol are the intrinsic binding interaction free energy of a ligand with a protein and the solvation free-energy change associated with complex formation, respectively. The second assumption is that ΔGothers, the sum of free-energy terms other than the representative terms ΔGbind and ΔGsol, is linear in the representative terms (ΔGothers = β (ΔGbind + ΔGsol) + const, β < 0). The third assumption is an empirical relation between the entropic and enthalpic energy changes accompanying complex formation (TΔS = α ΔH + const, α > 0). ΔGsol is replaceable by its dominant polar contribution ΔGsolpol, and most of ΔGsolpol comes from the enthalpic contribution. Combining the above three equations yields the following concise expression,
ΔGobs = γ (ΔEbind + ΔGsolpol) + const [γ = (1 – α) (1 + β)]
where ΔEbind is computable using ab initio MO calculations such as FMO and ONIOM, and ΔGsolpol is computable with continuum solvation models such as GB (generalized Born), PB (Poisson-Boltzmann), and PCM (polarizable continuum model).
We have demonstrated that the LERE-QSAR procedure can excellently reproduce ΔGobs associated with complex formation of a series of ligands with a protein (carbonic anhydrase, MMP, influenza and human neuraminidases).
We will also discuss two newly introduced approaches for estimating the representative energy terms: (1) hybrid estimation of PCM and GB/PB for ΔGsolpol and (2) the dispersion-corrected Hartree-Fock method (HF-D) for ΔEbind.
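One way to see where the slope in the LERE-QSAR expression comes from is to chain the three assumptions, as sketched below. Two steps are readings of the abstract rather than statements it makes explicitly: that the entropy-enthalpy relation is applied to the combined binding and solvation terms, and that the corresponding enthalpy is approximated by the binding energy plus the polar solvation term. All additive constants are absorbed into a single "const".

```latex
\begin{align}
\Delta G_{\mathrm{obs}}
  &= \Delta G_{\mathrm{bind}} + \Delta G_{\mathrm{sol}} + \Delta G_{\mathrm{others}}
  && \text{(assumption 1: additivity)} \\
  &= (1+\beta)\,\bigl(\Delta G_{\mathrm{bind}} + \Delta G_{\mathrm{sol}}\bigr) + \mathrm{const}
  && \text{(assumption 2, } \beta < 0\text{)} \\
\Delta G_{\mathrm{bind}} + \Delta G_{\mathrm{sol}}
  &= \Delta H - T\Delta S
   = (1-\alpha)\,\Delta H + \mathrm{const}
  && \text{(assumption 3, } \alpha > 0\text{)} \\
\Delta H
  &\approx \Delta E_{\mathrm{bind}} + \Delta G_{\mathrm{sol}}^{\mathrm{pol}}
  && \text{(assumed approximation)} \\
\Delta G_{\mathrm{obs}}
  &= \gamma\,\bigl(\Delta E_{\mathrm{bind}} + \Delta G_{\mathrm{sol}}^{\mathrm{pol}}\bigr) + \mathrm{const},
  \qquad \gamma = (1-\alpha)(1+\beta)
\end{align}
```

Under this reading, the slope is smaller than one for 0 < α < 1 and −1 < β < 0, so the observed free-energy change is a scaled-down version of the computed representative energy terms.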
Matched Molecular Pair (MMP) analysis has become popular as a data driven idea generator for lead optimisation. Existing SAR data is mined for single point changes in structure and their effects on activity: changes that consistently have little effect on activity indicate potential bioisosteric replacements. An adjunct to this approach is to examine the existing SAR on a project to find ‘activity cliffs’, regions where large changes in activity are observed for relatively small changes in structure. However, these methods almost exclusively rely on studying the 2D structures of the molecules concerned rather than the 3D conformation that is involved in binding.
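The notion of an activity cliff described above can be made concrete with a small sketch. It uses the conventional 2D view (Tanimoto similarity on fingerprints held as sets of bit indices) rather than the 3D shape and electrostatic comparison that this paper goes on to propose; the compound names, fingerprints, pIC50 values and thresholds are all hypothetical.

```python
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints stored as sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def activity_cliffs(compounds, min_similarity=0.75, min_delta=2.0):
    """Return (name_i, name_j, similarity, |delta activity|) for similar pairs
    whose activities differ by at least min_delta log units."""
    cliffs = []
    for (n1, fp1, p1), (n2, fp2, p2) in combinations(compounds, 2):
        similarity = tanimoto(fp1, fp2)
        delta = abs(p1 - p2)
        if similarity >= min_similarity and delta >= min_delta:
            cliffs.append((n1, n2, similarity, delta))
    return cliffs

# Hypothetical data: (name, fingerprint bits, pIC50).
compounds = [
    ("A", {1, 2, 3, 4, 5, 6, 7, 8}, 8.5),
    ("B", {1, 2, 3, 4, 5, 6, 7, 9}, 5.9),
    ("C", {1, 2, 3, 4, 5, 6, 7, 8, 10}, 8.3),
    ("D", {20, 21, 22}, 4.0),
]
cliffs = activity_cliffs(compounds)
for name_i, name_j, similarity, delta in cliffs:
    print(f"{name_i}-{name_j}: similarity={similarity:.2f}, delta={delta:.1f}")
```

Pairs that are similar and have similar activity (A and C here) are candidates for bioisosteric replacements, while similar pairs with a large activity gap (A and B) are the cliffs worth examining more closely.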
In this paper we will present our research into using 3D methods to detect and interpret activity cliffs. We will show that considering the shape and especially the electrostatic environment around a pair of molecules results in a richer, more informed view of the factors causing changes in activity and a hypothesis-driven understanding of existing SAR.
David Watson gave this presentation at the "Addressing toxicity in drug discovery" workshop during the ACS National Spring Meeting 2013
Toxicity of drug candidates is a major cause of expensive, late stage failure in pre-clinical and clinical development. In this presentation, David introduced Lhasa's world-leading technology for knowledge-based prediction of key toxicities. Using data from published and donated (unpublished) sources, their Derek Nexus tool identifies structure-toxicity relationships that alert users to the potential of compounds causing toxicity.
A copy of David's slides is available as a PDF file.
This talk was presented by Dr Alexey Zakharov at the American Chemical Society 2012 Fall Meeting in Philadelphia.
The most important factor affecting metabolic excretion of compounds from the body is their half-life time. This provides an indication of compound stability of, e.g., drug molecules. We report on our efforts to develop QSAR models for metabolic stability of compounds, based on in vitro half-life assay data measured in human liver microsomes (HLM), taken from literature and several commercial or free databases. A variety of QSAR models generated using different statistical methods and descriptor sets implemented in both open-source and commercial programs were analyzed. The models obtained were compared using several external validation sets from public and commercial data sources. We also report on our use of the most predictive ones among the models for calculating the HLM half-life time as a predictor of the metabolic stability for about 250,000 compounds from the publicly available Open NCI database. These predictions are being made available freely to the scientific community.
You can download the slides as a PDF.
Matt gave this presentation at the ACS National Spring Meeting 2012.
Automatic QSAR model building methods are now readily available and have been successfully applied to a range of compound properties (solubility, logP, blood-brain barrier penetration etc.). Good ligand-based models of target potency are less common, however, and have traditionally depended on 3D and structure-based approaches. The availability of good public-domain data sources and the latest machine learning techniques in an automated framework has now enabled us to carry out a comprehensive study of the potential for building 2D ligand-based models of target potency. We will discuss the results of automatically applying multiple QSAR model building techniques (PLS, Radial Basis Functions, Gaussian Processes, Random Forests) to over 70 data sets across a range of target classes using data from the ChEMBL database. We will explore the effects of the quality and quantity of data, modelling method and model domain of applicability on the accuracy of prediction.
Matt presented this poster at ISSX in October 2011.
Whether compounds are intended as drugs, cosmetics, agrochemicals or for other industrial application, it is essential to understand their potential to cause toxic effects. This can guide the prioritisation of compounds for further research or consideration of the most appropriate downstream experiments to confirm their safety. The ability to predict toxicities based on chemical structure alone would allow these factors to be considered prior to synthesis, allowing the safest options to be pursued and saving time and resources wasted on synthesis and testing of unsuitable compounds.
Matt presented this poster at ISSX in October 2011.
Many computational methods have been developed that predict the regioselectivity of metabolism by drug metabolising isoforms of the Cytochrome P450 class of enzymes (P450) [1-5]. Here we describe recent developments to a method for predicting P450 metabolism that combines quantum mechanical (QM) simulations to estimate the reactivity of potential sites of metabolism on a compound with a ligand-based approach to account for the effects of orientation and steric constraints due to the binding pockets of different P450 isoforms.
Dr Terry Stouch, Consulting in Drug Discovery and Design Practice, Technologies, Process at Princeton, NJ and Duquesne University gave this presentation on "In silico ADME/Tox: Why models fail: Why models work", in June 2010.
By way of example, we discuss the apparent "failure" of in silico ADME/Tox models and attempt to understand the causes. Often, the interpretation of the success of models lies in their use and the expectations of the user. Other times, models are, in fact, of little value. Disappointing results can be linked to the key aspects of the model and modeling procedure, many of these related to the original data and its interpretation. We make recommendations to providers of models regarding the development, description, and use of models as well as the data and information
Terry gave this presentation at the 32nd National Medicinal Chemistry Symposium, June 6-9th, 2010, Minneapolis, MN, USA.
A copy of Terry's slides is available as a PDF file.
This paper was published by Olga Obrezanova and Matthew D. Segall, Journal of Chemical Information and Modeling, 2010, 50 (6), pp 1053–1061.
In this article, we extend the application of the Gaussian processes technique to classification quantitative structure−activity relationship modeling problems. We explore two approaches, an intrinsic Gaussian processes classification technique and a probit treatment of the Gaussian processes regression method. Here, we describe the basic concepts of the methods and apply these techniques to building category models of absorption, distribution, metabolism, excretion, toxicity and target activity data. We also compare the performance of Gaussian processes for classification to other known computational methods, namely decision trees, random forest, support vector machines, and probit partial least squares. The results indicate that, while no method consistently generates the best model, the Gaussian processes classifier often produces more predictive models than those of the random forest or support vector machines and was rarely significantly outperformed.
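As a rough illustration of what a probit treatment of a regression output can look like, the sketch below converts a Gaussian prediction (mean and standard deviation) into the probability of falling above a class threshold using the normal cumulative distribution function. This is a generic construction under stated assumptions, not necessarily the exact formulation used in the article; the function names and numbers are hypothetical.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def class_probability(mean, std, threshold=0.0):
    """Probability that a value distributed as N(mean, std^2) exceeds threshold."""
    return normal_cdf((mean - threshold) / std)

# Hypothetical regression outputs for three compounds: (predicted mean, predicted std).
predictions = [(1.0, 1.0), (0.0, 1.0), (-0.5, 0.25)]
probabilities = [class_probability(m, s) for m, s in predictions]
for (m, s), p in zip(predictions, probabilities):
    print(f"mean={m:+.2f} std={s:.2f} -> P(class 1)={p:.3f}")
```

A confident prediction far from the threshold gives a probability near 0 or 1, while a prediction with a large uncertainty relative to its distance from the threshold stays close to 0.5.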
You can download it via the ACS "Articles on Request" e-print service using the following link:
Please note: To access the Articles on Request link, please log in to the Publications website using your ACS ID. If you do not have an ACS ID, you will need to Register for one for free by clicking on 'Register' near the top right corner of the website.
Young Shin and his colleagues at Genentech presented this poster at the ISSX North American Regional meeting in Baltimore, MD in October 2009.
COMPARISON OF METASITE AND STARDROP PREDICTION OF CYP3A4, CYP2C9 AND CYP2D6
V. Sashi Gopaul, Young Shin, Hoa Le, Matthew Baumgardner, Cornelis Hop and Cyrus Khojasteh
Drug Metabolism & Pharmacokinetics, Genentech, Inc., South San Francisco, CA, USA, 94080
Metabolite identification studies play an important role in determining the sites of metabolic liability of new chemical entities (NCEs) in drug discovery. However, generating these complex and detailed studies in a high-throughput environment is often a challenge. Therefore, the use of in silico tools that can predict the sites of metabolism of an NCE could enhance the drug design process. In this study we compare the utility of MetaSite and StarDrop, two predictive software packages available for this purpose...
This article was published in QSAR & Combinatorial Science, Volume 25, Issue 12, Pages 1172 - 1180 (DOI
In this article, we review recent developments in the prediction of Absorption, Distribution, Metabolism, Excretion and Toxicity (ADMET) properties by Quantitative Structure – Activity Relationships (QSAR). We consider advances in statistical modelling techniques, molecular descriptors and the sets of data used for model building and changes in the way in which predictive ADMET models are being applied in drug discovery. We also discuss the current challenges that remain to be addressed. While there has been progress in the adoption of non-linear modelling techniques such as Support Vector Machines (SVM) and Bayesian Neural Networks (BNNs), the full advantages of these "machine learning" techniques cannot be realised without further developments in molecular descriptors and availability of large, high-quality datasets. The largest pharmaceutical companies have developed large in-house databases containing consistently measured compound properties. However, these data are not yet available in the public domain and many models are still based on small "historical" datasets taken from the literature. Probably, the largest remaining challenge is the full integration of predictive ADMET modelling in the drug discovery process. Until in silico models are applied to make effective decisions in a multi-parameter optimisation process, the full value they could bring will not be realised.
This is a preprint of the article published in J Comput Aided Mol Des. 2008 Jun-Jul;22(6-7):431-40. Epub 2008 Feb 14.
In this article, we present an automatic model generation process for building QSAR models combined with Gaussian Processes, a powerful machine learning modeling method. We describe the stages of the process that ensure models are built and validated within a rigorous framework: descriptor calculation, splitting data into training, validation and test sets, descriptor filtering, application of modeling techniques and selection of the best model. We apply this automatic process to data sets of blood-brain barrier penetration and aqueous solubility and compare the resulting automatically generated models with ‘manually’ built models using external test sets. The results demonstrate the effectiveness of the automatic model generation process for two types of data sets commonly encountered in building ADME QSAR models, a small set of in vivo data and a large set of physico-chemical data.
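The stages listed in this abstract can be sketched in miniature as follows. This is an illustrative toy under loud assumptions, not the published process: single-descriptor linear fits stand in for Gaussian Processes, descriptor filtering is reduced to dropping near-constant descriptors, the split is a fixed 70/15/15, and all names and data are hypothetical.

```python
import random

def split(rows, seed=0):
    """Shuffle and split rows into training (70%), validation (15%) and test sets."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_train = (len(rows) * 7) // 10
    n_val = (len(rows) * 15) // 100
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def filter_descriptors(train, min_variance=1e-6):
    """Keep the indices of descriptors that actually vary across the training set."""
    n_desc = len(train[0][0])
    return [j for j in range(n_desc)
            if variance([x[j] for x, _ in train]) > min_variance]

def fit_single_descriptor(train, j):
    """Least-squares fit of y against descriptor j; returns a prediction function."""
    xs = [x[j] for x, _ in train]
    ys = [y for _, y in train]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x[j] + intercept

def rmse(model, rows):
    return (sum((model(x) - y) ** 2 for x, y in rows) / len(rows)) ** 0.5

def auto_model(rows):
    """Build one candidate per retained descriptor, select on the validation set
    and report the error of the selected model on the untouched test set."""
    train, val, test = split(rows)
    kept = filter_descriptors(train)
    candidates = {j: fit_single_descriptor(train, j) for j in kept}
    best = min(candidates, key=lambda j: rmse(candidates[j], val))
    return best, rmse(candidates[best], test)

# Hypothetical data: descriptor 0 is informative, 1 is constant, 2 is unrelated.
rows = [((float(i), 1.0, float((i * 7) % 5)), 2.0 * i + 1.0) for i in range(20)]
best, test_rmse = auto_model(rows)
print(f"selected descriptor {best}, test RMSE = {test_rmse:.3g}")
```

Keeping the test set out of both fitting and model selection is what lets its error serve as an honest estimate of performance on new compounds.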
The rapid design-test-redesign cycles of modern drug discovery and the demand for fast model (re)building whenever data becomes available have given rise to a trend to develop computational algorithms for automatic model generation. Automatic modelling processes allow computational scientists to explore large numbers of modelling approaches very efficiently and make QSAR/QSPR model building accessible to non-experts.
This poster was displayed at MedChem ADMET Europe, 2008.
In silico predictive models are now widely used to predict a range of molecular properties and help prioritise molecules for synthesis. However, a common criticism often levelled at predictive models is that they offer few clues regarding why a molecule is predicted to have a certain property. By definition, models encode relationships between molecular structure and properties, but interpreting and visualising this information to design better molecules has been almost impossible. This is particularly true of models built with modern ‘machine learning’ techniques such as artificial neural networks (ANN), Gaussian processes (GP) or support-vector machines (SVM). The models that these techniques create have commonly been described as ‘black box.’
This poster was presented at the MedChem Europe meeting, 2007.
In this presentation Olga Obrezanova describes an automated process for building QSAR models (now available as part of StarDrop as the Auto-Modeller!). Olga goes on to demonstrate the effectiveness of the process by comparing this technique with traditional "hands-on" modelling approaches for blood-brain barrier penetration and aqueous solubility.
This presentation was given at the Zing Computational Chemistry Conference in March 2009.
In this article Olga describes how we extend the application of the Gaussian Processes technique to classification problems. We explore two approaches, an intrinsic Gaussian Processes classification technique and a probit treatment of the Gaussian Processes regression method. Here we describe the basic concepts of the methods and apply these techniques to building category models of blood-brain barrier penetration and hERG inhibition. We also compare the performance of Gaussian Processes for classification to other known computational methods, namely decision trees, bagging and probit PLS.
In this presentation Olga Obrezanova talks about Gaussian Processes - a powerful computational method for QSAR modelling. Olga starts by describing the main ideas of this technique.
We are mostly interested in the application of this technique to predictive modelling of ADME properties. The importance of optimising the ADME properties of potential drug molecules is now widely recognised. Considering ADME properties early in the drug discovery process can reduce the cost of drug development and decrease the attrition rate of drug candidates. We have developed new techniques for finding the parameters of the Gaussian Processes method, which I will present. I will also show examples of the application of these techniques to ADME and QSAR datasets and compare Gaussian Processes methods with other known techniques. The demand of modern drug discovery for fast model (re)building whenever new data becomes available has given rise to a trend to develop computational algorithms for automatic model generation. I will demonstrate how we use Gaussian Processes in an automatic modelling process. (The purpose of such algorithms is to save scientists' time, explore more modelling possibilities and make the process of QSAR model building accessible to non-experts.)
This presentation was given at the American Chemical Society conference in Boston, 2007.