Published in J. Med. Chem.: Experimental Validation of Predictive Models in a Series of Novel Antimalarials.
Hit identification and lead optimization are key steps in the development of any new drug, and continued advances in machine learning and artificial intelligence hold significant promise for streamlining this process and making medicinal chemistry campaigns more efficient.
Edwin G. Tse, Laksh Aithani, Mark Anderson, Jonathan Cardoso-Silva, Giovanni Cincilla, Gareth J. Conduit, Mykola Galushka, Davy Guan, Irene Hallyburton, Benedict W. J. Irwin, Kiaran Kirk, Adele M. Lehane, Julia C. R. Lindblom, Raymond Lui, Slade Matthews, James McCulloch, Alice Motion, Ho Leung Ng, Mario Öeren, Murray N. Robertson, Vito Spadavecchio, Vasileios A. Tatsis, Willem P. van Hoorn, Alexander D. Wade, Thomas M. Whitehead, Paul Willis, and Matthew H. Todd*
Abstract – The Open Source Malaria (OSM) consortium is developing compounds that kill the human malaria parasite, Plasmodium falciparum, by targeting PfATP4, an essential ion pump on the parasite surface. The structure of PfATP4 has not been determined. Here, we describe a public competition created to develop a predictive model for the identification of PfATP4 inhibitors, thereby reducing project costs associated with the synthesis of inactive compounds. Competition participants could see all entries as they were submitted. In the final round, featuring private sector entrants specializing in machine learning methods, the best-performing models were used to predict novel inhibitors, of which several were synthesized and evaluated against the parasite. Half possessed biological activity, with one featuring a motif that the human chemists familiar with this series would have dismissed as “ill-advised”. Since all data and participant interactions remain in the public domain, this research project “lives” and may be improved by others.
Introduction – Efficiency in the early stages of the drug discovery pipeline, from hit identification to lead optimization, is key to the development of new drugs. The initial identification of a hit compound is typically carried out using one of two approaches. In target-based drug discovery, the molecular target of interest is known. (1) With this knowledge, libraries containing many compounds are screened (experimentally or computationally) against the known target to identify promising candidates or chemical scaffolds for further development. Through testing these chemicals, the key binding interactions may be identified and more directed structure–activity relationship (SAR) studies can be conducted to optimize activity.
Alternatively, if the biological target is not known, phenotypic drug discovery may be undertaken. (2) This process involves the initial identification of potent compounds that give rise to the desired effect (e.g., inhibition of cell growth), with target determination performed thereafter. The lead-optimization phase in this type of drug discovery is less streamlined than in the target-based approach because it is conducted without guidance from target binding interactions and often relies on the intuition of the medicinal chemist to design and synthesize compounds that explore the SAR. This approach has clear limitations, including the personal bias and imagination of the scientist and the availability and cost of resources. As a result, good hypotheses or key insights may be overlooked, which can lengthen the time taken to identify a lead candidate and increase costs associated with synthesizing complex molecules that are later revealed to be inactive. Nevertheless, the advantage of phenotypic drug discovery, which underpins its popularity, is that hit or lead compounds are already known to be effective in their overall role (e.g., the killing of a pathogen).