How should I prepare and store my data for cheminformatics applications?
Structuring your cheminformatics data
First, the easiest format to work with is a simple table of data, where each row…
Using the DUD-E+ benchmark, we explore the impact of using a single protein pocket or ligand for virtual screening compared with using ensembles of alternative pockets, ligands, and sets thereof. For both structure-based and ligand-based approaches, the precise characterization of the binding site in question had a significant impact on screening performance. Using the single original DUD-E protein, Surflex-Dock yielded a mean ROC area of 0.81 ± 0.11. Using the cognate ligand instead, with the eSim method for screening, yielded 0.77 ± 0.14. Moving to ensembles of five protein pocket variants increased docking performance to 0.84 ± 0.09. The result for the analogous ligand-based approach (using the five crystallographically aligned cognate ligands) was 0.83 ± 0.11. Using the same ligands, but with an automatically generated mutual alignment, yielded a mean ROC area nearly as good as that from single-structure docking: 0.80 ± 0.12. Detailed results and statistical analyses show that structure- and ligand-based methods are complementary and can be fruitfully combined to enhance screening efficiency. A hybrid approach combining ensemble docking with eSim-based screening produced the best and most consistent performance (mean ROC area of 0.89 ± 0.08 and 1% early enrichment of 46-fold). Based on results from both the docking and ligand-similarity approaches, it is clearly unwise to rely on a single arbitrarily chosen protein structure for docking or a single ligand query for similarity-based screening.
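For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how a ROC area and an early enrichment factor can be computed from a ranked screening list. It is not the authors' evaluation code. The function names, the toy scores, and the toy labels are all hypothetical. The enrichment definition used here (hit rate in the top fraction of the ranked list divided by the overall hit rate) is one common convention and is an assumption, since the abstract does not say which definition was used.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC area as the probability that a random active outscores a random decoy.

    Ties between an active and a decoy count as half a win.
    Labels are 1 for actives and 0 for decoys; higher scores mean "more likely active".
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def enrichment_factor(scores: Sequence[float], labels: Sequence[int],
                      fraction: float = 0.01) -> float:
    """Hit rate in the top `fraction` of the ranked list divided by the overall hit rate.

    This is an assumed definition; other conventions exist (for example,
    enrichment measured at a fixed false-positive rate).
    """
    n_total = len(scores)
    n_actives = sum(labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    n_top = max(1, round(n_total * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(y for _, y in ranked[:n_top])
    return (hits_in_top / n_top) / (n_actives / n_total)


# Toy data, invented purely for illustration: 3 actives among 10 compounds.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

print("ROC area:", roc_auc(toy_scores, toy_labels))
print("Enrichment in top 20%:", enrichment_factor(toy_scores, toy_labels, fraction=0.2))
```

With only ten toy compounds, a 1% cutoff would cover a single molecule, so the usage lines above use a 20% cutoff instead. On a benchmark-sized decoy set the default `fraction=0.01` corresponds to the "1% early enrichment" wording, under the assumed definition.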
My hope is that these posts will be of interest to people who want to understand more of the nuts…
What are StarDrop and Semeta?
Semeta is a tailored platform for DMPK scientists. It enables users to address key challenges…