What are the methods to cluster my compounds in StarDrop?

In StarDrop, you can cluster your data by common substructure, chemical structure or property values. Structure-based and property-based clustering both use the ‘dbclus’ algorithm developed by Butina (J. Chem. Inf. Comput. Sci. 1999, 39, 4, 747–750) and differ only in how the similarity between compounds is measured.
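As a rough illustration of the general idea behind Butina-style clustering, here is a minimal Python sketch. It is not StarDrop's implementation: the function name `butina_cluster`, the `similarity` callable and the `threshold` parameter are all assumptions made for this example. Each item's neighbours (items at or above the similarity threshold) are listed, items are ranked by neighbour count, and each unassigned item in that order becomes a cluster centroid that claims its still-unassigned neighbours.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def butina_cluster(
    items: Sequence[T],
    similarity: Callable[[T, T], float],  # hypothetical: any symmetric similarity
    threshold: float,                     # hypothetical cut-off for "neighbour"
) -> List[List[int]]:
    """Return clusters as lists of item indices, centroid first."""
    n = len(items)

    # Build neighbour lists: j is a neighbour of i if similarity >= threshold.
    neighbours: List[List[int]] = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if similarity(items[i], items[j]) >= threshold:
                neighbours[i].append(j)
                neighbours[j].append(i)

    # Rank items by number of neighbours (most first, ties by index).
    order = sorted(range(n), key=lambda i: (-len(neighbours[i]), i))

    assigned = set()
    clusters: List[List[int]] = []
    for centroid in order:
        if centroid in assigned:
            continue
        # The centroid claims every neighbour not already in a cluster.
        members = [centroid] + [j for j in neighbours[centroid] if j not in assigned]
        assigned.update(members)
        clusters.append(members)
    return clusters


# Toy usage on plain numbers, with similarity falling as values move apart.
values = [1.0, 1.1, 1.2, 5.0, 5.1, 9.0]
clusters = butina_cluster(values, lambda a, b: 1.0 / (1.0 + abs(a - b)), 0.8)
print(clusters)
```

In this sketch, items with no neighbours end up as single-member clusters. Swapping the `similarity` callable is all that is needed to move between different ways of comparing compounds.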

When you cluster by structure, molecules are compared using the Tanimoto index; when you cluster by properties, the comparison is based on the Euclidean distance between compounds, calculated from the properties that you have selected.
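The two measures can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: a fingerprint is represented here as a set of "on" bit positions, a compound's properties as a tuple of numbers, and the function names `tanimoto` and `euclidean` are chosen for this example rather than taken from StarDrop.

```python
import math
from typing import Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto index: shared on-bits divided by on-bits present in either."""
    union = a | b
    if not union:
        # Assumption for this sketch: two empty fingerprints score 0.0.
        return 0.0
    return len(a & b) / len(union)


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two equal-length property vectors."""
    if len(x) != len(y):
        raise ValueError("property vectors must have the same length")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


# Two made-up fingerprints sharing 2 of 6 distinct bits.
print(tanimoto({1, 2, 3, 4}, {3, 4, 5, 6}))

# Two made-up compounds described by two properties each.
print(euclidean((1.0, 2.0), (4.0, 6.0)))
```

Note that the Tanimoto index is a similarity (higher means more alike), whereas the Euclidean distance is a dissimilarity (lower means more alike). With a raw Euclidean distance, a property with a large numerical range will generally dominate one with a small range.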

Clustering by common substructure uses a maximum common substructure algorithm to group compounds that share a significant common substructure. To speed this up, Tanimoto similarity is first used to rule out compounds that will not have a substantial common substructure, so that the full comparison based on the chemical graph is performed only where it is needed.
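The pre-filtering idea can be sketched as follows. This is a hedged illustration, not StarDrop's code: `candidate_pairs`, `compare_with_prefilter`, the `cutoff` value and the `expensive_compare` callable are hypothetical names, and the stand-in comparison used in the demo simply counts shared features rather than computing a real maximum common substructure.

```python
from itertools import combinations
from typing import Callable, Dict, Iterator, List, Sequence, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto index on sets of on-bit positions (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_pairs(
    fingerprints: Sequence[Set[int]], cutoff: float
) -> Iterator[Tuple[int, int]]:
    """Yield index pairs similar enough to justify a full comparison."""
    for i, j in combinations(range(len(fingerprints)), 2):
        if tanimoto(fingerprints[i], fingerprints[j]) >= cutoff:
            yield i, j


def compare_with_prefilter(
    fingerprints: Sequence[Set[int]],
    expensive_compare: Callable[[int, int], int],  # hypothetical slow step
    cutoff: float,
) -> Dict[Tuple[int, int], int]:
    """Run the expensive comparison only on pairs that pass the cheap filter."""
    return {
        (i, j): expensive_compare(i, j)
        for i, j in candidate_pairs(fingerprints, cutoff)
    }


# Demo with made-up fingerprints; the stand-in below is NOT a real MCS search.
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9}]
calls: List[Tuple[int, int]] = []


def stand_in_compare(i: int, j: int) -> int:
    calls.append((i, j))          # record how often the slow step runs
    return len(fps[i] & fps[j])   # placeholder score: shared feature count


results = compare_with_prefilter(fps, stand_in_compare, cutoff=0.5)
print(results, "expensive calls:", len(calls))
```

In the demo, only one of the three possible pairs passes the cheap Tanimoto check, so the slow comparison runs once instead of three times.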

More details about the clustering methods can be found in Section 5.1 of the StarDrop Reference Guide, which you can access from the Help menu.
