Software downloads and documentation
StarDrop desktop client
StarDrop and the optional ADME QSAR, Nova, BIOSTER, Inspyra, SeeSAR, Surflex eSim3D and MPO Explorer modules will all run on a desktop or laptop computer on both Windows and macOS. Simply download and run the appropriate installer for your computer.
StarDrop for Windows
StarDrop for macOS
StarDrop servers
A StarDrop server is only required to support the optional Metabolism and Auto-Modeller modules or if you have opted for a floating license. A Derek Nexus web service is only required to support the optional Derek Nexus module. An optional model server can also be installed to run the ADME QSAR models, models built with the Auto-Modeller or custom models. The StarDrop servers run on a 64-bit Linux OS (Red Hat Enterprise Linux 8, Ubuntu 22.04 LTS and CentOS version 7.x).
If you are an existing customer migrating from a legacy server, it is essential that you connect with a member of our support team who will supply all the information you need to ensure a smooth transition.
Red Hat Enterprise Linux servers
Ubuntu servers
Legacy Derek Nexus web service
Legacy CentOS servers
StarDrop licensing and documentation
When the installation is complete, run StarDrop and it will start in ‘viewer’ mode. To fully enable StarDrop, please select Manage Licenses from the Help menu and click the Request License Key button to send the Machine Identifier displayed (an 8-character string) to Optibrium Support. We will return a license key to enable StarDrop as soon as possible, with instructions on how to install it and start using StarDrop.
StarDrop user guide
StarDrop reference guide
StarDrop system guide
StarDrop scripting and customisation guide
You can verify the integrity of the StarDrop installer files by checking the MD5, SHA-1 or SHA-256 hashes.
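As a minimal sketch of such a check, the following Python computes a file's digest with the standard `hashlib` module and compares it with a published value. The function names are illustrative assumptions and are not part of StarDrop; the published hash must come from the hash lists referred to above.

```python
import hashlib
import hmac


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`, reading it in chunks."""
    digest = hashlib.new(algorithm)  # accepts "md5", "sha1" or "sha256"
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex, algorithm="sha256"):
    """Compare the computed digest with a published one, ignoring case and surrounding whitespace."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(file_digest(path, algorithm), expected)
```

If the comparison returns `False`, the download is incomplete or has been altered and should be fetched again before it is run.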
Cerella APIs and documentation
Here you can find the Cerella system guide and data preparation information.
Cerella API (main interface)
Cerella API (authentication, authorisation and accounting interface)
BioPharmics software distribution and documentation
Licensed users may download software distributions and updates below. Software distributions include binary executables for Windows, Linux, and Mac platforms.
The modules of the BioPharmics Platform are being integrated with multiple modeling front-ends to enhance ease-of-use.
Software distributions are gzip compressed tar archives. Installation involves unpacking the archive and ensuring that your licensing file is in the right place. The BioPharmics manual has detailed instructions for installation and extensive examples of use of all four modules.
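Unpacking a gzip compressed tar archive can be sketched with Python's standard `tarfile` module, as below. The function name is an illustrative assumption; where the licensing file belongs is described in the BioPharmics manual and is not assumed here.

```python
import tarfile
from pathlib import Path


def unpack_distribution(archive_path, destination):
    """Unpack a gzip-compressed tar archive into `destination` and return the member names."""
    root = Path(destination)
    root.mkdir(parents=True, exist_ok=True)
    root = root.resolve()
    with tarfile.open(archive_path, "r:gz") as archive:
        members = archive.getmembers()
        for member in members:
            # Refuse entries that would be written outside the destination folder.
            target = (root / member.name).resolve()
            if target != root and root not in target.parents:
                raise ValueError(f"Unsafe path in archive: {member.name}")
        archive.extractall(root)
    return [member.name for member in members]
```

The returned list makes it easy to confirm which files were extracted before continuing with the installation steps in the manual.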
Please refer to BioPharmics Release Notes for a summary of changes for major/minor version updates.
BioPharmics v5.111
BioPharmics v5.1
StarDrop release notes
Version 7.6.1 (July 2024)
New Features
Hosted StarDrop Servers:
- Support for Optibrium’s hosted services (Auto-Modeller™, Model, and Licence servers and the Derek Nexus™ web services).
Cerella™:
- Support for Cerella Single Sign-On (SSO) when accessed from the StarDrop desktop.
Derek Nexus:
- Support for HTTPS communication with the Derek Nexus web service.
Changes
Core Application:
- When importing StarDrop-generated CSV files, StarDrop now automatically associates any standard deviations with the corresponding numerical data column.
- New StarDrop logo.
Cerella:
- Cerella models now load into StarDrop faster, improving support for loading large numbers of endpoints.
- Outlier flags are now retrieved from Cerella and stored as part of StarDrop data sets.
Idea Tracker:
- Idea Tracker searching has been optimised to improve the speed of molecule searches of large registries containing up to one billion ideas.
Metabolism:
- The algorithms used for predicting metabolism by aldehyde oxidase (AOX), flavin-containing monooxygenase (FMO) and uridine diphosphate glucuronosyltransferase (UGT) have been optimised to improve performance. On average, the metabolism models for these routes are now between two and three times faster.
- The metabolism server installer script now also works with docker images hosted on ECR.
Query Interface:
- The Query Interface Extension now reads outlier flags from Cerella query results so that the corresponding flags for Cerella data entries are created in StarDrop.
Bug Fixes
Core Application:
- Fixed an issue where Idea Tracker columns were duplicated after exporting/reading data sets.
- Fixed an R-Group Enumeration issue that would occasionally produce corrupted structures.
- Improved the performance when calculating Probabilistic Scoring functions with data containing factor uncertainties.
- Fixed an issue in which merging large datasets caused StarDrop to crash.
Servers:
- Improved the security of the Optibrium AWS-hosted metabolism server by ensuring authentication details are not captured in the log files.
- Removed an errant runtime security warning from the Metabolism server installer.
Query Interface:
- Fixed the QueryToolPlugin bindings for two functions where the new bindings did not correctly match the API used by the Python script.
Version 7.6 (February 2024)
New Features
Added the Idea Tracker Extension:
- A central registration system for all new compound structures recorded in StarDrop, including key property and structural information.
- Allows users to trace compound designs to their origins and identify project optimisation decisions.
Core application:
- R-Group Decomposition Tool:
  - Added ability to define a Linker within a scaffold.
  - Improved flexibility in R-Group definitions now allows more structures to be found during an R-Group Decomposition.
Card View®:
- Tables (such as an MMP or Activity Neighbourhood Table) that are left open now persist when a StarDrop project file is closed and reopened.
- Added ability to select a compound’s immediate neighbours in a Card View network.
- Provided a button in Card View to ‘Clear Annotations, Clear Links, and Reset All Card Positions’ in a single action.
Nova™:
- Multiple scaffolds can now be defined when running a Scaffold-based library enumeration using the Add button.
- Multiple related R-Groups are automatically suggested and can be selected when running a Scaffold-based library enumeration. Sketching a fragment instantly triggers similar R-Groups to be suggested.
Changes
Core Application:
- Updated the R-Group Decomposition tool to place an R-Group in its correct column in cases of geminal disubstitution.
- When running multiple R-Group Decompositions within a dataset, the R-Group fragment IDs are maintained.
- The ‘Contact StarDrop Support’ link in StarDrop’s Help menu now launches a form on the Optibrium website.
- Updated the Query Interface (QI) to work with Idea Tracker.
- Added a ‘widgetplugins’ folder for StarDrop users in the hosted AWS environment to support StarDrop’s Custom Scripts.
- Updated to Python 3.9.18 for PC and Python 3.11.7 for macOS.
- Mopac7 was rebuilt for pKa calculations in the ADME QSAR module.
Cerella™:
- Fixed the sorting of Cerella columns to use the data type that is currently selected.
- Scoring functions that use Cerella column data will update automatically when the data type is changed by the user.
Derek Nexus:
- Derek Nexus Library updated to version 6.3.0.
Nova™:
- Updated the Matched Series Analysis knowledgebase to ChEMBL 33.
- The Sketch button in the Transformation Manager now uses the same options as for the Reaction Manager.
Servers:
- Updated both the CentOS (7.4 and above, excluding 8.x and 9.x) and Ubuntu (22.04 LTS) server installers for StarDrop 7.6.
Surflex eSim3D™:
- Integrated the ‘grindpdb’ tool into StarDrop from BioPharmics.
Bug Fixes
Core Application:
- In cases where R-Group Decompositions are performed on multiple scaffolds, now all scaffolds are correctly shown in the Scaffold column.
- Fixed the stereochemical designations when R-Group Fragments are created from a chiral centre.
- Fixed a bug in Heat Maps where the colour map was not being updated following the editing of a molecule.
- Fixed bug where fragments were coloured when Heat Map was selected in an MMP or Activity Table.
- Fixed a bug in the Scoring tab to make the Available Properties Search not case sensitive.
- Fixed the automatic refresh of the Model Display hierarchy when changed in the Preferences.
- Fixed a bug where the Preferences for “Use wildcards in text searches” was not persistent.
Cerella™:
- Fixed an issue where the Cerella login screen would appear on the wrong screen when using multiple monitors.
License Server:
- Fixed a bug that sometimes reported a valid license as outdated.
Metabolism:
- Fixed a bug where the Metabolite column sometimes disappeared when exporting the metabolite data.
- Fixed a bug with StarDrop not remembering the Metabolism server password in the Preferences.
MPO Explorer™:
- Fixed a bug that would cause a crash when customising a graph.
Nova™:
- Fixed an issue where duplicate products were enumerated from compounds containing nitro groups.
Surflex eSim3D™:
- Corrected the naming of new dataset tabs created after running a Surflex eSim3D Virtual Screen.
Visualisation:
- Fixed a bug that made editing the Fonts in a Visualisation legend difficult.
Version 7.5 (September 2023)
New Features
Metabolism – Replaces P450:
- New WhichEnzyme™ model to calculate the relative likelihood of a structure being a substrate of cytochrome P450 (P450), aldehyde oxidase (AOX), flavin-containing monooxygenase (FMO), uridine diphosphate glucuronosyltransferase (UGT) or sulfotransferase (SULT).
- New Phase I and Phase II enzyme regioselectivity models for AOX1, FMO1/FMO3, UGT1A1/UGT1A4/UGT1A9/UGT2B7 and SULT.
- New pre-clinical models (mouse/rat/dog) that predict the importance of each potential site of P450 metabolism in animal models.
- The P450 area in StarDrop has been replaced with a new, streamlined Metabolism area with easy-to-access calculation modes:
  - Predict Metabolism – predicts one generation of metabolism for all enzyme models.
  - Predict P450 Metabolism – predicts one generation of metabolism for all P450 models.
  - Generate Metabolic Pathway – predicts a 2-generation metabolic pathway based on a structure. A heuristics-based model determines which first-generation metabolites are progressed to further metabolism prediction.
  - Custom – the user can decide which models to run.
Surflex eSim3D™ – Desktop Virtual Screening:
- From the 3D tab, StarDrop users can now run fast, ligand-based virtual screening against prepared screening libraries.
- Commercially available screening collections from Enamine and MolPort™ have been prepared for virtual screening (conformations pre-computed) and are available for download for use in desktop virtual screening. The screening libraries will be regularly updated.
Core Application:
- Added the ability to edit and replace/overwrite the chemical structure in a data set row.
- The 2D chemical diagram in the data set can now be set to a preferred orientation.
- Added a context menu option to copy a structure from a data set to the clipboard within Table View and Card View®.
- Added the ability to zoom the display in Table View using the mouse wheel with a CTRL-key modifier (CMD-key on Mac).
- Added context menu options to Table View column headers and row numbers to set column widths and row heights.
- Added the ability to select a compound in Card View and then automatically expand the selection to include all compounds linked to the selected card.
- Added the ability to clean Card View (remove links, annotations, snap to grid, etc.) with a single action.
- Added a property search bar to the Function Editor dialogue to make it easy to locate columns.
Visualisation:
- Added an option in the Visualisation Preferences to toggle the selection colour applied to selected/unselected chart data.
SeeSAR™ View:
- Added the ability for users to reset to a standard orientation in the SeeSAR display.
- The camera orientation in the SeeSAR display is now saved in the StarDrop project file.
- Added the ability to export protein-ligand complexes from SeeSAR to a single PDB file.
Nova™:
- Added the ability to sketch or edit a transformation directly in the Nova Transformation Manager panel.
Changes
Core Application:
- 2D chemical diagrams in the data set can now be aligned based on structure/substructure, including wildcard matching in a new Align 2D Diagrams tool under the Data Set menu.
- The label layout for 2D chemical diagrams has been updated to avoid overlapping labels and lines crossing structures, improving readability.
- Implicit Hydrogens without 2D coordinates are now shown in grey in the chemical diagram.
- The size of frozen rows/columns in Table View is now limited so that the main table is always visible.
- When editing a text cell, the dialogue box is now multi-line with wrapped text.
- Data set tab order in the Tabbed View and the Cascade View layout are now saved in the StarDrop project file.
- Improved user preferences interface to control font type, size, and colour in summaries and charts.
- Improvements have been made to logging preferences, including making it easy to copy the path to the log file and adding the option for StarDrop to write detailed messages to the log.
- Memory performance has been improved by limiting the undo stack to the most recent 40 actions.
- PythonQt has been replaced, meaning previously installed StarDrop scripts must be upgraded. A Python Script upgrader has been built to ensure scripts are upgraded when first run.
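The bounded undo stack mentioned in the list above (most recent 40 actions) can be illustrated with a fixed-length deque. This is a generic sketch of the idea, not StarDrop's implementation; the class and method names are assumptions.

```python
from collections import deque


class BoundedUndoStack:
    """Keep only the most recent `limit` actions; older ones are discarded."""

    def __init__(self, limit=40):
        self._actions = deque(maxlen=limit)

    def push(self, action):
        # When the deque is full, appending silently drops the oldest action.
        self._actions.append(action)

    def undo(self):
        """Remove and return the most recent action, or None if nothing is left."""
        return self._actions.pop() if self._actions else None

    def __len__(self):
        return len(self._actions)
```

Capping the stack bounds memory use at the cost of being unable to undo anything older than the limit.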
Nova:
- Reinstated saving of the Matched Molecular Pair results table to the StarDrop project file so that the analysis does not need to be repeated when the file is reopened.
- Added the option to include ring changes in fused rings as a matched pair.
Servers:
- In addition to CentOS, Ubuntu installers are now available for all StarDrop servers.
Bug Fixes
Core Application:
- Fixed an issue where a new file path to a folder of models set in the File Locations area of the Preferences caused StarDrop to crash.
- Fixed an issue where StarDrop failed to start after users added a new file path in the File Locations area of the Preferences to access shared custom models (e.g. *.aim files from Auto-Modeller).
- Fixed an issue where column headers that contained special characters (e.g. /) were not imported from *.csv files.
- Fixed an issue where users could not select a new heat map colouring scheme on macOS.
- Fixed an issue where heteroatoms in structures displayed on cards with coloured backgrounds were hard to read.
- Fixed an issue where heat map colouring was lost in Card View® if the user clicked on white space between cards.
- Fixed an issue where Card View performance was decreased whilst removing links between selected cards.
- Fixed an issue where applying the undo action in the Nova layout in Card View caused a crash.
- Fixed an issue where hovering the cursor over a card in Card View resulted in pixelation on Windows with >150% scaling.
- Fixed an issue with multi-scaffold R-group decomposition where only the first scaffold was recognised for subsequent enumeration.
Visualisation:
- Fixed an issue where applying a numerical filter to a scatter plot resulted in the unintended exclusion of compounds.
- Fixed an issue where radar plots appeared pixelated on Windows with >150% scaling.
- Fixed an issue where scatter plots were not automatically updating when categorical data was changed.
- Fixed an issue where column headers that contained special characters (e.g. /) were unavailable for formatting in a visualisation.
Metabolism – Formerly P450:
- Fixed an issue where P450 metabolites were not generated for some sites that were predicted to be labile.
- Fixed an issue where the orientation of a molecule in the P450 summary area was inconsistent with its orientation in Table View.
Nova:
- Fixed an issue where R-group enumeration on a molecule with many R-groups caused a crash.
- Fixed an issue where a chemistry transformation in Nova resulted in an incorrect chemical structure.
- Fixed an issue in multi-scaffold R-group analysis where View Scaffold only showed the first scaffold available.
Surflex eSim3D:
- Fixed an issue where the eSim3D prepared reference structure was not used in the calculation of eSim3D surfaces.
- Fixed an issue where the C-N bond order in the pyrrolidone moiety of staurosporine-like ligands (PDB HET IDs: STU, 4ST, 1ST, and ITQ) was assigned incorrectly.
Model Server/Auto-Modeller™:
- Fixed an issue where model metadata in *.aim files encoded in Latin-1 were not read.
- Fixed an issue where the REST service to Model Server failed when accessed from multiple clients or processes.
Version 7.4 (March 2023)
New Features
Core Application:
- Added the ability to colour StarDrop data sets based on property values (“Heat Map”)
- Added the ability to specify the size and position of structure in chart pop-ups
- Added the ability to specify default properties to be displayed in chart pop-ups
- Added the ability to synchronise the display of chart pop-ups and labels
- Added the ability to convert chart pop-ups into labels
- Added the ability to show chart pop-ups whenever compounds are selected
Scripting:
- Added the ability to edit data set entries from a script
- Added the ability to refresh a data set from a script
Query Interface:
- Added the ability to query on pre-defined SMARTS and TEXT values
- Added support for categorical data types
Changes
Core Application:
- Enhanced behaviour in charts to retain colour formatting of selected data (dimming unselected data points)
- Histograms now automatically update when a categorical value is changed
- Tooltips associated with numerical and categorical data now display standard deviations and probabilities respectively
- Added the ability to export only the best pose when saving a data set that has 3D conformations
- Card View display performance has been improved when loading and clustering larger (100k compounds) data sets
- Selection tool now only provides Maximin as the diversity metric
- Molecule View has been removed
- GLC-lib has been replaced by Qt3D for improved graphics in 3D charts
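For readers unfamiliar with the Maximin diversity metric named in the list above, a greedy maximin selection can be sketched as follows. This is a generic illustration rather than StarDrop's implementation; the function name, the seed choice and the distance function are all assumptions.

```python
def maximin_select(items, count, distance, seed_index=0):
    """Greedily pick `count` items, each maximising its minimum distance to those already picked."""
    if not items or count <= 0:
        return []
    chosen = [seed_index]
    # For every item, track the distance to its nearest chosen item.
    nearest = [distance(items[seed_index], item) for item in items]
    while len(chosen) < min(count, len(items)):
        remaining = [i for i in range(len(items)) if i not in chosen]
        candidate = max(remaining, key=lambda i: nearest[i])
        chosen.append(candidate)
        for i, item in enumerate(items):
            nearest[i] = min(nearest[i], distance(items[candidate], item))
    return [items[i] for i in chosen]
```

With numbers and absolute difference as the distance, `maximin_select([0, 1, 2, 10], 3, lambda a, b: abs(a - b))` picks the seed `0`, then the farthest value `10`, then `2`, which is the remaining value farthest from both.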
Nova:
- Improved Nova transformation rendering now avoids chemical diagrams where all atoms are red
- Matched series analysis wizard now automatically ticks the first knowledgebase entry only
SeeSAR:
- Updated to the latest BioSolveIT library
Scripting:
- StarDrop now automatically adds a ‘site-packages’ folder to sys.path on startup, which assists the deployment of complex custom scripts
- Pose Generation Interface now supports Python 3.10
- Hypertext links and selectable text in warning dialogues are now supported
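The effect of placing a ‘site-packages’ folder on sys.path, as described in the Scripting changes above, can be sketched with the standard library. The function name and any folder passed to it are hypothetical; the location StarDrop itself uses is not stated here.

```python
import site
import sys
from pathlib import Path


def add_packages_folder(folder):
    """Make modules in `folder` importable, creating the folder if needed."""
    folder = Path(folder).resolve()
    folder.mkdir(parents=True, exist_ok=True)
    # addsitedir appends the folder to sys.path and processes any .pth files in it.
    site.addsitedir(str(folder))
    return str(folder) in sys.path
```

Once the folder is on sys.path, a custom script can import helper modules stored there with an ordinary `import` statement.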
Servers:
- Model server extensions can now be customised to specify an error message when calculations fail
Bug Fixes
Core Application:
- Fixed an issue where SD files generated by third-party software would not open correctly
- Fixed an issue where selecting data points on a 3D chart was difficult on a high-resolution screen
- Fixed an issue where certain 3D scatter plots used excessive memory
- Fixed an issue where snake plot labels for selected items were created incorrectly
- Fixed an issue where minor tick marks for 3D charts were not available
- Fixed an issue where points and error bars didn’t line up on jittered and zoomed 3D charts
- Fixed an issue where text rendering was unreadable due to rendering artefacts in 3D charts on macOS
- Fixed an issue where the selection tool stalled when running large numbers of rapid iterations for a diverse selection
Nova:
- Fixed an issue where “Add to Fragment Library” and “Assign to Plate” options were greyed out in the ‘Data Set’ menu on macOS
- Fixed an issue where macrocycle templates were not being used in the Nova sketcher or when displaying reagents
- Fixed an issue where incorrect models were available in the list of properties
Servers:
- Fixed an issue that stopped licenses for the Model Server REST interface working due to time zone differences
- Fixed an issue that occasionally caused an Auto-Modeller worker process to crash
- Fixed an issue where StarDrop became periodically unresponsive when the Derek Nexus Server and License Server were configured but both unreachable
- Fixed an issue where an error message was repeated, resulting in large log files
Scripting:
- Fixed an issue in a third-party integration script where an empty string was returned for certain data types
- Fixed an issue where Python models loaded into StarDrop from a file location were not available without an ADME QSAR license
Version 7.3.2 (October 2022)
Bug Fixes
macOS:
- Fixed an issue where the bundled Python was dependent on system Python libraries
- Fixed an issue where pyodbc was unable to obtain ODBC driver details
- Fixed an issue where StarDrop crashed when working with invalid dates
Servers:
- Fixed an issue where the model server failed to load models when models generated by the Auto-Modeller were present
Version 7.3.1 (July 2022)
Bug Fixes
Chemical Structure Handling:
- Fixed an issue where stereochemistry inversions were occurring in the StarDrop Designer and when a structure was added to a dataset
- Fixed an issue around rendering of hydrogens at stereocentres
- Fixed an issue where the Show hydrogen on stereocentres preference was not being maintained
- Fixed an issue related to incorrect behaviour when sketching hydrogen
- Fixed an issue where stereochemistry was not being detected correctly for some structures, resulting in misassignment of Nitrogen
- Fixed an issue where an uncharged Nitrogen could have more than four bonds without showing a valence error
- Fixed an issue where Nova removed relative stereochemistry from result structures and parent IDs in the Chemistry Transformations workflow
Usability:
- Fixed an issue to disambiguate duplicate data in a third-party platform integration
- Fixed an issue where longer running models caused the models server to become unresponsive
- Fixed an issue where creating a custom line on a scatter plot caused StarDrop to crash
- Fixed an issue in Nova where loading some Matched Series Knowledge Bases caused StarDrop to become unresponsive
Changes
Core Application:
- EULA updated to reflect date, new products, trademarks, etc. Further, the text formatting in the StarDrop start-up was cleaned up to improve readability
SeeSAR:
- BioSolveIT libraries were upgraded to the most recent version
Version 7.3 (May 2022)
Python 2 to Python 3 upgrade
Core Application:
- Upgraded StarDrop client to Python 3
- Embedded Python 3 into the StarDrop dmg to ensure continued macOS support
- Python scripts have been ported to Python 3
- StarDrop servers (P450, Model, Auto-Modeller, Pose Generation) have been upgraded to Python 3
- Python scripts directory has been changed to C:\Users\<Username>\AppData\Roaming\StarDrop\py3 on Windows and /Users/<Username>/StarDrop/py3 on macOS. This change is included to ensure that existing Python 2 scripts are not overwritten during the installation process
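A script that needs to locate the Python 3 scripts directory quoted above could build the path per platform, as in this sketch. The function name is hypothetical, and it assumes that the APPDATA environment variable on Windows points at the user's AppData\Roaming folder.

```python
import os
import sys
from pathlib import Path, PurePosixPath, PureWindowsPath


def stardrop_py3_dir(platform=None, home=None, appdata=None):
    """Return the Python 3 scripts folder quoted in the release note for Windows or macOS."""
    platform = sys.platform if platform is None else platform
    if platform.startswith("win"):
        # Assumption: APPDATA resolves to the user's AppData/Roaming folder.
        roaming = appdata or os.environ.get("APPDATA")
        if not roaming:
            raise ValueError("APPDATA is not set; pass appdata explicitly")
        return PureWindowsPath(roaming) / "StarDrop" / "py3"
    if platform == "darwin":
        return PurePosixPath(home or str(Path.home())) / "StarDrop" / "py3"
    raise ValueError(f"No scripts folder is documented for platform {platform!r}")
```

The `platform`, `home` and `appdata` arguments exist only so that the function can be exercised on any machine; in normal use they can be left at their defaults.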
Cerella Integration
Core Application:
- Suggested measurements functionality is disabled if no structure column exists in the data set
- Issue causing unexpected white boxes on Cerella value distribution histograms has been fixed
- An error message is now returned when incorrect credentials are entered
- Query Interface improvements:
  - Fixed an issue where the query plugin failed to retrieve some columns from PostgreSQL databases
  - Fixed an issue where the query plugin made an unnecessary number of Cerella login requests
  - Fixed an issue where query results were not read from a NUMERIC column in PostgreSQL databases
  - Fixed an issue where the data source configuration for PostgreSQL databases showed no preview columns
Interface/Usability
Core Application:
- Long path awareness has been added to handle files with long filenames
- Users must now specify a fragment name when sketching fragments in Nova
- Users can now specify a StarDrop model server, but run pKa calculations on a local machine
- Users can now open multiple model (.aim) files at the same time
- A data set refresh button has been added to the right-hand toolbar
- In Card View Design Preferences, users can now resize the width of the panel so that long column names are not truncated
- The initial (default) similarity threshold has been lowered for Matched Molecular Pairs
- Fixed a crash triggered by manipulating cards and stacks of cards in Card View
- Fixed an issue where 3D coordinates were not saved to SD files in cases where rows contained a single conformer
- Fixed an issue where Nova’s Matched Series Analysis misidentified a scaffold versus a substituent
- Fixed an issue where categorical columns were not available in the Select Properties dialogue of MPO Explorer
- Fixed an issue where a persistent “read-only” message remained in the project title bar
- Fixed an issue where the 2D structure (molecular diagram) was not displayed in the data set for results returned from the pose generation interface
- Fixed errors encountered when opening an SD file:
  - Files containing blank cells are now imported correctly
  - V3000 files containing SMILES property values in the data block are now imported correctly
  - Files that contain blank lines between M End and the first property are now imported correctly
- Fixed an issue where the “remove salts” option added rather than removed a hydrogen when neutralising a charge
BioPharmics release notes
Version 5.191 (June 2024)
Tools
- Bug fixes in fgen3d to eliminate crashes when trying to parse various badly formed SMILES (e.g. atoms with 11 bonds).
Docking
- Fixes to the PSIM command parsing to reduce brittleness in protein/ligand list specification.
Similarity
- More rigorous checks on well-formed ligand target structures in the targprep command.
Affinity
- Additional example of cross-scaffold QuanSA affinity prediction added to the manual.
xGen
- Updated the PyMOL GUI to accommodate changes in PyMOL v2.6. Added instructions for fixing PyMOL MTZ loading.
Mac Platform
- Inclusion of Mac binaries for statistical utility commands. Additional instructions for Mac silicon installation.
Version 5.190 (May 2024)
Docking
- Improved RMSD calculations for evaluation of docking quality (rms_list allows specification of max number of poses to consider). Fixed a minor bug in protomol generation.
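For reference, the RMSD between two poses of the same ligand is the square root of the mean squared distance between corresponding atoms. The sketch below shows that basic calculation only. It is not the BioPharmics implementation: it assumes both poses list the same atoms in the same order, and it performs no superposition or symmetry correction.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("Both poses need the same, non-zero number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

A value of zero means the two poses coincide exactly; larger values indicate a greater average atomic displacement.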
Version 5.189 (April 2024)
Tools
- Improved recognition of certain types of ring chirality that are not detectable by topology (affects the regen3d command).
- Added finer control of conformer pool size for ForceGen (-nfinal) to allow for conformer pool size control following full conformer search.
Docking
- Improved RMSD calculations for evaluation of docking quality (rms_list and rms_fam commands).
- Added a method (bbox command) to automatically define bounding boxes given collections of aligned ligands.
- Implemented a multi-core approach to pose family generation when using eSim for exploiting known poses.
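The idea of deriving a bounding box from a collection of aligned ligands, as mentioned for the bbox command above, can be sketched as an axis-aligned box around all atom coordinates. This is an illustration under stated assumptions, not the bbox algorithm itself; the function name and the `padding` parameter are hypothetical.

```python
def bounding_box(ligands, padding=0.0):
    """Axis-aligned box enclosing every atom of every ligand, expanded by `padding` on each side."""
    points = [atom for ligand in ligands for atom in ligand]
    if not points:
        raise ValueError("No atom coordinates supplied")
    lower = tuple(min(p[axis] for p in points) - padding for axis in range(3))
    upper = tuple(max(p[axis] for p in points) + padding for axis in range(3))
    return lower, upper
```

Each ligand is given as a list of (x, y, z) tuples in a shared coordinate frame, which is what alignment provides.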
Version 5.186 (April 2024)
General
- We have added a new chapter to the manual (and associated data to distribution examples) to cover Advanced Applications, which include usage of the newly released xGen PyMOL interface, docking of conformationally restrained PROTACs, and restraint-based docking and analysis of large macrocyclic peptides.
- If a user has set OMP_THREAD_LIMIT, that number of threads will be used by default.
- Added a user-settable environment variable SURFLEX_ROOT to aid in license file location and automatic finding of data such as the verifypdb.smi file.
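When the tools are launched from a script, the two environment variables named above can be set for the child process only. The sketch below just prepares the environment; the function name is hypothetical, and the command to run and the folder that SURFLEX_ROOT should point to are left to the manual.

```python
import os


def child_environment(thread_limit=None, surflex_root=None, base=None):
    """Return a copy of the environment with the two variables set when values are given."""
    env = dict(os.environ if base is None else base)
    if thread_limit is not None:
        if int(thread_limit) < 1:
            raise ValueError("thread_limit must be a positive integer")
        env["OMP_THREAD_LIMIT"] = str(int(thread_limit))
    if surflex_root is not None:
        env["SURFLEX_ROOT"] = os.fspath(surflex_root)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run`, which leaves the parent process's own environment unchanged.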
Tools
- Added termination line for ForceGen and FGen3D log output to indicate successful completion and number of mols processed. The combine_sfdb command now accepts a single SFDB file as input in order to apply an energy threshold.
- Added +3dfast option for faster 3D structure generation.
- Added commands to aid in the determination of ligand strain (see bound_energy and unbound_energy) for xGen ensembles and pose families from docking or other operations. For macrocycles, additional focus on identifying and considering different triplet H-bond sets. Improved make_sfdb to recalculate conformer energies when needed.
Similarity
- Minor changes in memory management for mult_esim. More careful checking of user-specified target molecule for esim_list.
PDBGrind
- Added option for faster grinding when the ligand details matter more than the whole complex (+pdbquick). This incorporates a simple estimate of tautomer strain in choosing which is best for a binding site.
Docking
- Minor changes to protomol core voxel method to eliminate disconnected areas. Added new methods to define binding sites during protomol generation.
Version 5.173 (August 2023)
One substantive change
- Multiple ligand alignment has been improved with respect to cross-platform behavior using a normalization within pose-clique scoring. To turn off this new default behavior, in the Similarity module use -me_norm and in the QuanSA module use the -clnorm option.
One bug fix
- The docking pose family generation implementation has been patched to use the correct score values. This affects pose family results especially in cases where prior ligand knowledge is not used.
Version 5.164 (June 2023)
eSim
- Fixed memory leak when using -poscon
Tools
- Implemented new behavior for torsional restraints. Given multiple fragments that match to separate parts of a molecule to be searched, the individual fragments create positional restraints so that a linker between the two parts will keep them in geometrically correct positions. Adherence to the geometry restraint is done through the -pospen and -pwiggle parameters.
- Provide the ability to turn off inclusion of torsional restraints in SFDB conformer files. Useful if one wants to restrain conformer search but not restrain deviations in downstream calculations like docking and similarity optimization.
Version 5.162 (June 2023)
Tools Module
- Improvement in enforcement of torsional restraints during fgen3d (and regen3d) procedures.
- Fixed issue with parsing variant SDF files with non-standard tags.
- Added tautomer recognition for imide/amide proton shift.
- Increased max SDF tag line width to 2000 characters.
Docking Module
- Changed default PDB ligand parsing size to 100 heavy atoms.
- Fixed bug in reporting of polar component of docking scores.
QuanSA Module
- Fixed issue with reading named molecules where mols are repeated
ESim and xGen Modules
- No significant user-facing changes.
Version 5.142 (May 2022)
Sim and Dock Modules
- Positional constraints: Specification of a positional constraint argument (-poscon) will now cause filtering of input to similarity and docking operations. Only those input molecules that contain the given positional constraint substructure will be processed. Behavior can be suppressed with -skipnonmatch.
- SDF output: Specification of +sdf will produce a tagged SDF file as output to similarity and docking operations, where the information present in the log file (corresponding to the mol2 output) is offered in the standard MDL tag/value syntax.
- Similarity display: esim_disp will now produce a more refined display of similarity between two molecules. Load the <outprefix>-disp.pml file in PyMOL to visualize.
- Molecular imprinting: molecular imprinting has been updated with eSim similarity calculations. The commands imprint, choose_ref, and iscreen_list are described in the manual.
- PSIM (see manual for details): increased thoroughness of pocket alignment, added a molecular overlap criterion to prevent marginally overlapping binding-site ligands from forming tree edges (controlled by -psim_overlap), removed brittle grid-caching behavior, added re-engineered psim_findcav command (experimental).
- PDBGrind: Better handling of peptidic ligands by identifying them as non-connected components.
- Docking fingerprints: added dock_fp command to produce protein interaction fingerprints (suitable as input to iscreen_list).
Tools Module
- Changed behavior of pretty_sdf command to also flatten molecules. Suitable for visualization of aligned sets of ligands in pseudo-2D as well as import into ChemDraw (chirality is preserved).
- Added -molid option to specify the SDF tag name to use for molecule ID in SDF files.
- Added parse_sdf command to grab tag/value data from SDF files and produce tab-delimited text suitable for input into Excel.
- Profile command will not center molecules by default (can turn on centering with +pcenter).
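The kind of conversion that parse_sdf performs can be illustrated with a minimal Python sketch. This is not the Tools implementation: it only assumes the standard MDL layout, in which a data header line such as "> <TAG>" is followed by value lines up to a blank line and records are separated by "$$$$". The function name sdf_tags_to_tsv is hypothetical.

```python
import csv
import io


def sdf_tags_to_tsv(sdf_text: str) -> str:
    """Collect tag/value data from SDF text and return tab-delimited text."""
    records = []
    for block in sdf_text.split("$$$$"):
        if not block.strip():
            continue
        lines = block.strip("\n").splitlines()
        tags = {}
        i = 0
        while i < len(lines):
            line = lines[i]
            # A data header looks like "> <TAG_NAME>" in standard MDL syntax.
            if line.startswith(">") and "<" in line and ">" in line[1:]:
                name = line[line.index("<") + 1 : line.rindex(">")]
                i += 1
                values = []
                # Value lines run until the first blank line.
                while i < len(lines) and lines[i].strip():
                    values.append(lines[i].strip())
                    i += 1
                tags[name] = " ".join(values)
            i += 1
        records.append(tags)

    # Column order follows the order in which tags are first seen.
    columns = []
    for tags in records:
        for name in tags:
            if name not in columns:
                columns.append(name)

    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for tags in records:
        writer.writerow([tags.get(name, "") for name in columns])
    return out.getvalue()
```

Molecules that lack a tag get an empty cell, so every row has the same number of columns when the text is opened in a spreadsheet.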
Version 5.125 (January 2022)
All Modules
The default number of desired threads is 36, but the machine architecture may specify a different value, and the environment variable OMP_THREAD_LIMIT may specify yet another. The default thread utilization is now set to the minimum of these three values. If -nthreads is specified, that value will be used. However, if that value is higher than the machine architecture limit or the environment thread limit, thread utilization will depend on the operating system.
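The selection rule above can be sketched in Python as follows. This is an illustration of the stated logic, not the actual implementation; the function names are hypothetical, and using os.cpu_count() as the machine limit is an assumption.

```python
import os
from typing import Mapping, Optional

DEFAULT_DESIRED_THREADS = 36  # default stated in the release note


def read_omp_thread_limit(environ: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return OMP_THREAD_LIMIT as an int, or None if unset or not a number."""
    raw = environ.get("OMP_THREAD_LIMIT")
    return int(raw) if raw and raw.isdigit() else None


def default_thread_count(machine_limit: int, omp_thread_limit: Optional[int]) -> int:
    """Minimum of the desired default, the machine limit and the environment limit."""
    candidates = [DEFAULT_DESIRED_THREADS, machine_limit]
    if omp_thread_limit is not None:
        candidates.append(omp_thread_limit)
    return min(candidates)


def requested_thread_count(
    nthreads: Optional[int], machine_limit: int, omp_thread_limit: Optional[int]
) -> int:
    """An explicit -nthreads value is used as given; otherwise fall back to the default."""
    if nthreads is not None:
        # The request is passed through even if it exceeds a limit; what happens
        # then depends on the operating system, as the note says.
        return nthreads
    return default_thread_count(machine_limit, omp_thread_limit)
```

For example, requested_thread_count(None, os.cpu_count() or 1, read_omp_thread_limit()) gives the default for the current machine under these assumptions.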
Tools Module
- Addition of +qmin option for approximating r-6 averaging in NMR restraints on ambiguous protons.
- Minor change in macrocycle search in order to increase search breadth of twist motions.
- The fgen3d command now allows for the -multiproc option.
- Minor changes to core utilization strategy for complex peptidic macrocycles.
- Addition of fgen_deep_list command and automatic file cleanup for fgen_deep.
- Fixed case where nominally chiral protonated N was being counted as a chiral center to be enumerated if -enum_chiral was specified. Now, Tools correctly perceives such N-H cases as being treated properly with amine inversion mechanics.
Docking Module
- Reduced memory footprint for grindpdblist and increased robustness. Windows multi-core now runs with high utilization. NOTE: with very large numbers of PDB files, the approximately 1% chance of a fatal error on any particular PDB file may manifest in a parallel run. In such cases, use the serial grinding script.
- Fixed odd case of incorrect bond order assignment to certain nitro groups in grindpdb functions.
- Protomol generation now checks whether proteins have been assigned partial charges. Docking commands automatically charge proteins that are detected as uncharged. It is better to charge proteins ahead of time so that this work is not repeated on every run.
Similarity Module
- The -vrange option now scales the min volume against the smallest ligand of a multi-ligand query and the max volume against the largest.
- Fixed bug with -pfast option leading to incorrect MMFF94sf calculation.
- Added targprep command to validate/prepare molecular queries.
QuanSA Module
- Minor changes in multiple ligand alignment when known poses are provided via -clknown to improve utilization of the information.
- Default values for -clkthresh and -addthresh are now both 6.5.
- Fixed a bug with the add command when employing a new SFDB for the mols to be added.
xGen Module
No significant changes.
Version 5.125 (August 2021)
Tools Module
- Added fgen_deep command for extremely thorough conformer search, especially for macrocycles.
- Addition of dihedral restraint type to NMR specification to automatically determine dihedral angle sign, rather than expecting the user to make an informed guess.
- Added commands for conformer pool compression and clustering (comp_rms and comp_macrms).
- Minor changes to profile command to improve characterization of macrocyclic ring systems.
Similarity Module
- Exposed control of the joint similarity switch +-joint for multi-ligand targets. Default is +joint. If -joint is specified, then eSim will seek to optimally match a single ligand within a multi-ligand target rather than optimizing against the best matching parts of each ligand.
- Minor changes in multiple ligand alignment when known poses are provided via -me_known to improve utilization of the information.
QuanSA Module
- Minor changes in multiple ligand alignment when known poses are provided via -clknown to improve utilization of the information.
- Default values for -clkthresh and -addthresh are now both 6.5.
- Fixed a bug with the add command when employing a new SFDB for the mols to be added.
Docking Module
No significant changes.
xGen Module
No significant changes.
Version 5.114 (April 2021)
- Fixed bug in +findbeta macrocycle search that limited the number of conformations produced. A new command: comp_macrms allows for compression and clustering of macrocycle conformer pools with the clusters being driven by macrocyclic backbone geometry.
- Fixed bug in output of two-way eSim values to the log file. Slight changes to calculation of two-way eSim values to increase accuracy in higher-throughput modes (with a time cost for the +two_way options).
- Observer points are now placed around a canonically oriented target ligand prior to transformation back to the original coordinate frame. This reduces the relatively minor effects of coordinate frame changes on eSim calculations.
- eSim reference ligands that have zero partial charges will be automatically charged. Note that reference ligands that have (non-SF) partial charges should be explicitly assigned SF charges (e.g. by using the “charge” command of sf-tools). Similarly, molecules provided as known poses will also be automatically charged if existing charges are zero.
- The default value for -me_kthresh has been decreased to 4.0. This controls the level of similarity required for a molecule that is part of a mult_esim operation to be retained as part of the multiple alignment when making use of the -me_known option. The -me_known option allows a user to specify, for example, a set of crystallographic poses to drive pose generation. Final pose selection optimizes a multi-objective function that seeks high mutual similarity and low ligand strain. Molecules that are less similar than -me_kthresh to the -me_given mols are omitted from the output. Users can align them to the molecules that “survived” through an esim_list operation if desired.
Cerella release notes
Version 1.1.17 (October 2024)
New features
Web UI
- Added the option for a user to choose between bespoke and global hyperparameter optimisation
Changes
Data upload and model building
- Increased the default limit on the number of input columns expected
- Decoupled the creation of database indices from the input columns configuration settings
- Increased the default memory limits for the query interface pods to reduce the likelihood of bottlenecks
- Added checks to ensure that the number of endpoints being uploaded is within the maximum limits set
Web UI
- The query interface file upload logs are now available within the System Logs
- Increased the level of feedback provided when a user has problems loading data
- Removed misleading warnings indicating problems with model building that appear during deployment before data have been uploaded
- Modified the UI and added help information to improve the ease of use of the Model Comparison functionality
- Updated to the latest version of Node.js
Bug fixes
Data upload and model building
- The data source ID value in a data source JSON configuration file was not being retained when a manual upload was run, resulting in a new data source ID being added automatically to the uploaded data source
API
- Fixed the handling of qualified outlier calculations where there is no associated measured error value which caused the suggested measurements functionality to fail occasionally
- Added support for “%” in column names so that the symbol doesn’t break SQL-parameterised queries when the “%” character is used as a placeholder
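The general problem behind the “%” fix can be illustrated with a minimal Python sketch. In Python database drivers that use %s placeholders, a literal % in the query text has to be written as %%. Whether Cerella uses such a driver is not stated, so this is an assumption, and the helper names quote_column and build_query are hypothetical.

```python
def quote_column(name: str) -> str:
    """Quote an identifier for use in a query that has %s placeholders."""
    escaped = name.replace('"', '""')     # standard SQL identifier quoting
    escaped = escaped.replace("%", "%%")  # stop "%" being read as a placeholder
    return f'"{escaped}"'


def build_query(column: str, table: str) -> str:
    """Build a query whose only placeholder is the trailing %s for the id value."""
    return f"SELECT {quote_column(column)} FROM {quote_column(table)} WHERE id = %s"


# Python's % operator stands in here for the driver's substitution step only;
# a real driver would also quote the parameter value itself.
query = build_query("% inhibition", "assays")
rendered = query % ("'C-1'",)
```

Without the doubling, the "% i" in the column name would itself be interpreted as a placeholder and consume the parameter intended for the id.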
Version 1.1.16 (August 2024)
New Features
Web UI
- You can indicate a column in your data source to be used to define which compounds are part of the test set. In this column, the word “Test” against a compound will indicate that it should be used as part of the independent test set
- Where an endpoint is listed in one of the data source review reports, it is now clickable so that it becomes selected in the data source editor
- A Cerella enabled column has been added to the data source summary panel
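The test set column described in the first item above can be pictured with a short Python sketch. It assumes rows are dictionaries keyed by column name and that the match on the word “Test” is exact after trimming whitespace; neither detail is confirmed by the release note, and the function name is hypothetical.

```python
def split_by_test_flag(rows, flag_column):
    """Split rows into (training, test) using a column that marks test compounds."""
    training, test = [], []
    for row in rows:
        # Assumption: only the exact word "Test" marks a test-set compound.
        if (row.get(flag_column) or "").strip() == "Test":
            test.append(row)
        else:
            training.append(row)
    return training, test


rows = [
    {"ID": "C1", "Set": ""},
    {"ID": "C2", "Set": "Test"},
    {"ID": "C3", "Set": None},
]
training, test = split_by_test_flag(rows, "Set")
```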
Changes
Data upload and model building
- The outlier detection method has been improved. Outlier detection involves imputing endpoint values for data points where measured values exist. To ensure that the data points for a compound are not used to generate the model being used to carry out outlier detection, multiple subsets are created from the training data set such that each compound is absent from one of the subsets. These subsets are used to train sub-models, which are then used to impute endpoint values, distributions and outliers for the compounds that are not included.
- Changed default retention policy to ensure that ingest cleanup jobs and their corresponding pods are not retained after successful completion
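The subset scheme used by the improved outlier detection can be sketched in Python. This is only an illustration of the hold-out idea: each compound is absent from exactly one training subset, and the sub-model trained on that subset is the one used to judge it. The mean and standard deviation below are a toy stand-in for the real sub-models, and the function names and the z_cutoff parameter are hypothetical.

```python
from statistics import mean, pstdev


def make_subsets(compound_ids, n_subsets):
    """Return (training_ids, held_out_ids) pairs; each compound is held out once."""
    ids = list(compound_ids)
    folds = [ids[i::n_subsets] for i in range(n_subsets)]
    return [([c for c in ids if c not in fold], fold) for fold in folds]


def flag_outliers(measured, n_subsets=3, z_cutoff=3.0):
    """measured maps compound id to the measured value of one endpoint."""
    outliers = []
    for training_ids, held_out_ids in make_subsets(sorted(measured), n_subsets):
        # Toy sub-model: the "imputed" value is the mean of the training subset.
        train_values = [measured[c] for c in training_ids]
        centre, spread = mean(train_values), pstdev(train_values)
        for c in held_out_ids:
            # A held-out compound never contributed to the sub-model judging it.
            if spread > 0 and abs(measured[c] - centre) > z_cutoff * spread:
                outliers.append(c)
    return sorted(outliers)


measured = {"c1": 5.0, "c2": 5.1, "c3": 4.9, "c4": 5.2, "c5": 4.8, "c6": 9.0}
flagged = flag_outliers(measured)
```

Because c6 is judged by a sub-model trained without it, its own extreme value cannot mask the discrepancy.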
Web UI
- The endpoint name is now included in the rollback plot title
API
- Added new timestamp keys to the Cerella Statistics server for Sub-Model Training, Sub-Model Prediction, Outliers Sub-Model Outliers, and Sub-Model Probability Distributions
Bug fixes
Data upload and model building
- Fixed the virtual data matrix cache initialisation to ensure it has sufficient capacity
- Fixed a bug that resulted in the Web UI not reporting any hyperparameter optimisation progress until the process had finished
Web UI
- Fixed the filename of the rollback plot image so that it now includes the endpoint name
API
- Fixed the ordering of virtual predictions in the Cerella Engine
- Fixed a problem with inconsistent probability distributions for outliers by storing the distributions in the data matrix
- Addressed connection issues between the Cerella Statistics server and the Structure Tracker by adding retry capabilities with back-off functionality
- Addressed connection issues between the Cerella Statistics server and the Endpoint Tracker by adding retry capabilities with back-off functionality
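Retrying with back-off, as mentioned in the two items above, generally follows the pattern in this minimal Python sketch. The attempt count, the delays and the choice of ConnectionError as the retryable failure are illustrative assumptions, not Cerella's settings.

```python
import time


def call_with_retry(operation, max_attempts=5, base_delay=0.5, max_delay=8.0,
                    sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential back-off."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Delay doubles after each failure, capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
```

The sleep argument is injectable so that the schedule can be checked without actually waiting.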
Version 1.1.15 (January 2024)
New Features
Web UI
- The name of any uploaded test set identifier file for a data source is displayed
- All visible users and data sources can be selected or deselected with a single click
Changes
Data upload and model building
- The storage server has been replaced by an AWS EFS volume mount
- Updated to Alchemite v0.82.0
Web UI
- Added a vertical scroll bar to the Model Comparison table
Bug Fixes
Data upload and model building
- Fixed an issue with missing permissions for a model comparison log query
- Fixed an issue with permissions to allow obsolete endpoints to be deleted during a scheduled data upload
- Fixed a model server timeout on startup
- Fixed model server blocking requests
- Fixed an issue causing the hyperparameter optimisation to miss cycles
Web UI
- Fixed a failure to show transformations configured as None correctly when a data source was created from a JSON data source configuration file
Version 1.1.14 (December 2023)
New Features
Data upload and model-building
- Added the ability to upload a CSV file of compounds and endpoint data in order to make predictions using the impute, virtual and selected models for comparison. The endpoints in the CSV file should match those in the current Cerella models
Data source configuration
- Added the option to ignore qualifiers on values for an individual endpoint in data upload and model building
Web UI
- Added a password reveal feature
- Added the ability to see the status of custom data sources
Changes
Data upload and model-building
- Changed the default p-value for rejecting low-confidence predictions to 0.3
Web UI
- Added an inactivity timeout of 15 minutes
Bug Fixes
Web UI
- Fixed the Update button in the System Management pane that was not visible when browser scaling was applied
- Fixed a copy to clipboard bug on Firefox
Version 1.1.13 (October 2023)
New Features
Data upload and model-building
- Endpoints with low variance are listed in the data upload report
- Highly correlated endpoints are listed in the data upload report
- Multiple scheduled data upload and model-building jobs can be defined with different model-building options
Web UI
- Rollback data for all endpoints can be downloaded in a single file
Changes
Data upload and model-building
- If a chemical structure represents more than one discrete component (e.g., a mixture), only the largest is used in the descriptor calculator
Web UI
- In the data source configuration, the Use factor checkbox has been replaced with an Error type selector
- The performance of endpoint rollback display and download for individual endpoints has been improved
Bug Fixes
Data upload and model-building
- Fixed a bug that resulted in an out-of-bounds error in endpoint rollback calculation
- Fixed a bug that resulted in a JSON parse failure when retrieving query results in the data upload pipeline
Web UI
- Removed unnecessary error messages that were shown when the user did not have permission to access certain controls
REST API
- Fixed a bug that resulted in model services returning error status in a clean Cerella installation
- Fixed a bug that could result in Cerella values not being retrieved in a query
- Fixed a bug that stopped suggested measurements being returned if there was a duplicate endpoint in the request
Version 1.1.12 (September 2023)
New Features
Data upload and model-building
- Endpoint data for duplicate compounds are now merged using a default or user-defined merge rule (previously, data was taken from one compound)
- The minimum and maximum number of iteration layers may be specified for the impute and virtual models in hyperparameter optimisation
Data source configuration
- A merge rule for duplicate compounds can now be specified for each endpoint
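Per-endpoint merging of duplicate compounds can be sketched in Python as follows. The rule names (mean, median, min, max), the default and the row layout are hypothetical; the release note does not list the rules Cerella offers, and duplicates are assumed here to have already been mapped to one compound identifier.

```python
from statistics import mean, median

# Hypothetical rule names; not taken from the Cerella documentation.
MERGE_RULES = {"mean": mean, "median": median, "min": min, "max": max}


def merge_duplicates(rows, rules, default_rule="mean"):
    """Merge repeated (compound, endpoint) values with a per-endpoint rule.

    rows is an iterable of (compound_id, endpoint, value) tuples and
    rules maps an endpoint name to a key of MERGE_RULES.
    """
    grouped = {}
    for compound_id, endpoint, value in rows:
        grouped.setdefault((compound_id, endpoint), []).append(value)
    return {
        key: MERGE_RULES[rules.get(key[1], default_rule)](values)
        for key, values in grouped.items()
    }


rows = [
    ("C1", "logD", 2.0),
    ("C1", "logD", 3.0),
    ("C1", "IC50", 10.0),
    ("C1", "IC50", 40.0),
    ("C2", "logD", 1.5),
]
merged = merge_duplicates(rows, {"IC50": "min"})
```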
Changes
Data upload and model-building
- Updated to Alchemite v0.71.1 and binary version 20230728
- Synonyms are now extracted from CDD Vault when creating a data source
- The CDD Vault integration has been updated to use the get readout rows API endpoint
- The data upload process is now run in a Kubernetes job
- Predicted values from previous model-building runs are removed from the data matrix if the compound is not included in the uploaded data source(s)
- The Upload data, training and prediction option now uses the current user-defined test set (if this has been changed since hyperparameter optimisation was last run)
Web UI
- Compounds and endpoints are now reported for each data source in the data upload report
REST API
- Queries with the unsupported NOT EXISTS operator result in a clear error message
- The performance of query interface data file upload has been improved
- The query interface data source status endpoint now returns the data source name and type
Bug Fixes
Data upload and model-building
- Fixed a bug that resulted in the data upload report generation failing on an invalid structure
- Fixed a bug that resulted in the structure tracker failing to retrieve the correct compounds if client IDs contained hyphens
Web UI
- Fixed a bug that resulted in the data upload report view failing to be updated if data upload was re-run
REST API
- Fixed a problem with inconsistent factor errors of x0 and x1 for a Cerella-transformed endpoint
Version 1.1.11 (July 2023)
New Features
Data upload and model-building
- Added a new data upload report, summarising the data sources, compounds and endpoints uploaded for model building
Web UI
- Added a new System Logs panel, allowing logs for data upload and model-building processes to be viewed and downloaded for a specified date and time range
- Added the ability for the full matrix of measured and predicted values from the impute, virtual and selected models to be downloaded for the training and test set
Changes
Data upload and model-building
- The data upload pipeline is now more resilient and retries REST API requests that fail
- The data set split for hyperparameter optimisation is now run as a Kubernetes job rather than on the ingest-upload server to increase available memory
Web UI
- Removed old log download links from the System Status panel
Bug Fixes
Data upload and model-building
- Fixed a bug stopping the data source status API endpoint from including custom data sources
Web UI
- Fixed a bug that stopped data source checkboxes from appearing on the Add User panel
- Fixed a bug that resulted in values being read from the wrong file for virtual and selected individual endpoint results
REST API
- Fixed a bug that stopped Cerella prediction requests from returning results for valid compounds if there was a single invalid structure within the batch
- Fixed a bug that resulted in missing predictions when measured input values were 0
Version 1.1.9 (April 2023)
New Features
Data upload and model building:
- Updated to Alchemite v0.61.1
Cerella API:
- Updated to Alchemite v0.61.1
Web UI:
- Added a mailto link for Cerella support
Changes
Data upload and model building:
- Refactored the virtual model validation process to use Alchemite’s analyse_validate job with the virtualExperimentValidation option
- Tidied information reported in data upload and model building logging
- Modified rules to apply log10 transformations to certain bounded percentages by default
Data source configuration:
- Added molecule name and cdd_registry_number to CDD Vault data source properties
Cerella API:
- Tidied information reported in server logging
- Added optimistic concurrency control for custom descriptors endpoints
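Optimistic concurrency control, mentioned in the last item above, typically means that an update is accepted only if the caller still holds the latest version of the record. The in-memory Python sketch below shows the idea; the class and method names are hypothetical and say nothing about how the Cerella API exposes versions. HTTP APIs often carry the version in an ETag header, which is a general convention rather than something the release note states.

```python
class VersionConflict(Exception):
    """Raised when an update is based on a stale version."""


class DescriptorStore:
    """Toy in-memory store keeping a version number per item."""

    def __init__(self):
        self._items = {}  # name -> (version, value)

    def get(self, name):
        return self._items[name]

    def put(self, name, value, expected_version):
        """Write value only if expected_version matches the stored version."""
        current_version = self._items.get(name, (0, None))[0]
        if expected_version != current_version:
            raise VersionConflict(
                f"{name}: expected version {expected_version}, found {current_version}"
            )
        self._items[name] = (current_version + 1, value)
        return current_version + 1
```

Two clients that read the same version can both attempt an update, but only the first succeeds; the second has to re-read and retry rather than silently overwriting the first change.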
Web UI:
- Updated copyright statements
Bug Fixes
Data upload and model building:
- Fixed a data upload and prediction failure encountered when a new data set had additional columns not included in the original model training
- Fixed an inconsistency in median R² values from internal validation (hyperparameter optimisation). Alchemite now combines the predictions for each fold and then computes R², instead of computing the R² for each fold individually and then averaging
- Tidied logging rules to ensure that identifiers of invalid structures are not written to the data upload log
- Tidied logging rules to ensure that endpoint group and display group information is not written to the data upload log
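The difference between the two ways of computing R² described in the second item above can be seen in a small Python sketch. The fold data are made up for illustration, and the function names are hypothetical; this is not Alchemite code.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_of_fold_r2(folds):
    """Older approach: compute R² within each fold, then average."""
    return sum(r_squared(obs, pred) for obs, pred in folds) / len(folds)


def pooled_r2(folds):
    """Newer approach: combine predictions from all folds, then compute one R²."""
    observed = [v for obs, _ in folds for v in obs]
    predicted = [v for _, pred in folds for v in pred]
    return r_squared(observed, predicted)


# Illustrative folds as (observed, predicted); the two folds cover different ranges.
folds = [
    ([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]),
    ([7.0, 8.0, 9.0], [7.5, 8.0, 8.5]),
]
per_fold = mean_of_fold_r2(folds)  # 0.75 for this data
pooled = pooled_r2(folds)          # 1 - 1/58 for this data
```

The two numbers differ because each fold's R² is measured against that fold's own narrow spread of values, whereas the pooled R² is measured against the spread of the whole set.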
Web UI:
- Removed unnecessary whitespace displayed in the importance matrix plot
Version 1.1.10 (May 2023)
New Features
Data upload and model building:
- Added support for input-only endpoints
- Enabled the hyperparameter validation combination method to be set to mean rather than median, which remains the default
Data source configuration:
- Added support for input-only endpoints
Changes
Web UI:
- Users can no longer delete their own UserAdmin permissions or change their own Active status
- The data source file name is now displayed for flat-file data sources
Bug Fixes
Web UI:
- The endpoint rollback axis label is now included in any saved image
- The axis label wording on the endpoint rollback plot has been improved
Version 1.1.8 (March 2023)
New Features
Data upload and model building:
- Added support for specifying priority endpoints in hyperparameter optimisation
- Enabled administrators to specify and store data upload and model building configuration
- Enabled imputation and virtual model hyperparameter optimisation processes to run concurrently
Changes
Data upload and model building:
- Modified the wording for the data upload and model building steps in the administration interface
- Improved exception traceback logging in the data upload and model building pipeline
- Provided access to separate imputation and virtual hyperparameter optimisation logs
- Added timings for model training, validation and prediction in the administration interface
- Updated the presentation of data upload and model building logs and timings in the administration interface
- Excluded endpoints without any validation results from validation reports
- Added help text for model building and prediction options in the administration interface
Data source configuration:
- Added help text for the ‘Hidden’ checkbox in the data source configuration editor
- Added details of transformations to the units report for data source endpoints
Cerella API:
- Changed the location from which the predicted value rejection threshold is loaded to be the configuration server rather than the environment
Bug Fixes
Data upload and model building:
- Fixed the behaviour of endpoints with an empty measurement group; each endpoint is now placed in a separate endpoint group
- If the data upload step fails, the pipeline now terminates, rather than running any subsequent model building steps
- Fixed model validation failures that occurred due to the virtual test values file persisting between model building runs
- Ensured that Cerella no longer treats an empty string as a category value in data upload
- A meaningful error message is returned instead of ‘internal server error’ when returning model statistics if the Cerella engine is unavailable
Data source configuration:
- Fixed a problem with transformation editing for numeric data source endpoints
- Fixed an issue that meant the endpoint group could not be changed from ‘Mixed’ when multiple endpoints were selected
Cerella API:
- Fixed a failure to return predicted Cerella values (with ‘internal server error’) if the request did not include measured values