Virtual screening is often powerful because it is fast (i.e., cheap) and can give new insights into your project much more quickly than traditional experimental assays. Virtual screening can examine orders of magnitude more compounds than traditional screening, so you can explore more ideas and avoid missed opportunities. In what follows, we’ll explore some strategies to help you get the most value from your compound screening experiments.
Step 1: Define success before you start
Before you start a virtual screening experiment, take a few minutes (or more!) to plan your work. It’s all too easy in our world of overwhelming data to “calculate first, think second.” Unfortunately, that approach often leaves you with mountains of data that are only marginally useful and then you have to decide how to sort, filter, and interpret it, or else you end up going back to re-do the experiment. My undergraduate research advisor used to say, “Many students find that three months in the lab can save them three days in the library.” His point is that we often neglect, to our detriment, to spend a little bit of time planning in order to avoid long, expensive delays in implementation.
Here are a few questions to guide your reflection before you start the experiment:
What is the goal of your experiment? Do you want to identify new functionality of interest around an existing template or scaffold structure, or are you looking to identify entirely new chemotypes and move into a different area of chemical space?
What will you do with the results of your experiment? How many compounds are you willing to progress to experimental testing? How will you prioritise compounds identified from the screen? What does your ideal successful drug look like? What properties will it have?
Knowing what you want to do with your results will inform the design of your experiment.
Step 2: Plan your virtual screening experiment(s) around your available data and your objectives for success
Methods for virtual screening fall broadly into two categories: ligand-based and structure-based. Structure-based screening experiments take advantage of available protein structural information and often involve docking ligands into a known binding pocket. Ligand-based approaches typically build a pharmacophore or binding hypothesis from the 3D structures or binding poses of multiple compounds known to show good activity at the target and then quantify how well the virtual compounds align with the pharmacophore or binding hypothesis.
For ligand-based screening, build your virtual screening template using a handful of ligands with different shapes and structures. Depending on the desired objectives, you will choose compounds that are more or less similar to one another. If you want to explore functionality around an existing scaffold or move to a well-defined new scaffold, you will choose compounds that contain that scaffold as the core and which are consequently somewhat similar to each other. If your goal is to explore new scaffolds or move into areas of chemistry space that have not yet been explored in your project, you will choose compounds with a wider variety of shapes and structures.
For structure-based screening, you will need to characterise the protein pocket and make sure that the structure(s) you choose provide an appropriate representation of the pocket. Often, there are no structures available for a target of interest or cases where there is only a single structure with a bound ligand. You will want to plan how to appropriately account for the uncertainty in your docking results that arises from the incomplete representation of the binding pocket, particularly if your goal is to move to a larger or smaller scaffold than the known ligand(s).
The best method for your experiment will depend on what data you have available about your target and how much computational time or power you want to invest in screening. As a general rule, ligand-based approaches are faster than structure-based (particularly molecular docking) experiments; on the other hand, by incorporating explicit information about the shape and volume of the binding pocket, structure-based methods can often provide better enrichment for your library. Using a combination of both ligand- and structure-based methods often gives the most reliable results.
Step 3: Perform your experiment(s)
At this point, you’ve thought about the desired outcome of your experiment, considered carefully the data available to guide your screen, and decided how you want to use the results. You’ve designed your experimental protocol to identify the results of greatest interest. Now, you’re ready to take advantage of the power of the virtual screening experiment. It is important to remember that virtual screening is an assay done on a computer. Like any other assay, you should expect to see false positives and false negatives. Your experimental protocol will determine the nature and quality of the result you get. Investing time to prepare the experiment well (e.g. choosing or sampling appropriate conformations of the binding site before a structure-based screen) can help you improve the quality of your results and avoid missed opportunities. Like any experiment, you will likely need to run multiple assays to get a good understanding of the system.
When possible, using a combination of ligand- and structure-based approaches will typically yield the best results. The consensus between the different approaches often gives better enrichment than either approach on its own. This paper provides a detailed comparison of the performance of different methods against a variety of different targets from the DUD-E database.
After your virtual screening experiment, you will narrow your compound library down to several hundred compounds that will be synthesised or purchased and tested experimentally. Be aware that most methods for virtual screening will have significant uncertainty associated with the predictions, and virtual screening results must not be interpreted as providing a quantitative prediction of affinity. The key metric for interpreting and assessing the results of a screening experiment is enrichment: does the virtual screen successfully prioritise compounds that are most likely to be experimentally active?
At Optibrium, we emphasise the importance of working purposefully on multi-parameter optimisation (MPO) to predict which compounds are most likely to succeed on your project. MPO methods play a critical role in identifying your best candidates for synthesis, and we urge you to incorporate measurement uncertainty as part of your MPO. Effective MPO approaches enable you to prioritise compounds with the best chance of success on your project and reject compounds that are confidently predicted to fail to meet one or more of your objectives.
Appendix: Available resources
Many resources exist that can help you with the design and execution of virtual screening experiments. In addition to pre-generated conformations of vendor libraries, you might also explore the ZINC database, the Distributed Structure-Searchable Toxicity (DSSTox) Database, and/or the NCI Diversity Set.
We also provide some examples to help you learn how to use StarDrop to perform a virtual screen and analyse the results. To explore the considerations around virtual screening in more detail, watch our CEO and our former Director of Computational Chemistry discuss this topic in our webinar.
Think StarDrop could be right for you?
We would love to show you how StarDrop can help your drug discovery projects. Complete the form and a member of the team will get in touch to arrange a personalised demo for you.
Complete the form to request a personlised demo
Daniel Barr, PhD
Daniel is a Senior Application Support Scientist at Optibrium, where he applies his expertise in computational chemistry and statistics to support advancements in medicinal chemistry. With a passion for utilising data effectively and accounting for measurement uncertainty, Daniel fosters meaningful insights to drive innovation in drug discovery.
He holds a Ph.D. in Chemistry from Arizona State University and a B.S. in Biochemistry from the Barrett Honors College, graduating with honors and Phi Beta Kappa recognition.
Linkedin
We introduce a new method for rapid computation of 3D molecular similarity that combines electrostatic field comparison with comparison of molecular surface-shape and directional hydrogen-bonding preferences (called “eSim”).
What comprises large molecules? When we talk about “large molecules,” we often think of biologics like monoclonal antibodies, proteins, and…
StarDrop — A Swiss Army knife for drug discovery It’s designed to fit right in with the other tools you…