Background: Many techniques to design chemical libraries for screening have been put forward over time. General-purpose libraries are still important when screening against novel targets, and their design has relied on the use of molecular descriptors. In contrast, chemotype or scaffold analysis has been used less often.
Objective: We describe a simple method to assess chemical diversity based on chemotype counts, as an alternative to modeling chemical diversity with computed molecular properties. We show how chemotype counts can be used to evaluate the diversity of a library and to compare diversity selection algorithms. We demonstrate an efficient compound selection algorithm based on chemotype analysis.
Methods: We use automated chemotype perception algorithms and compare them to traditional techniques for diversity analysis to check their effectiveness in designing diverse libraries for screening.
Results: In our analysis, the best type of molecular fingerprint for diversity selection is the extended circular fingerprint, but it can be outperformed by a chemotype diversity algorithm, which can also be more intuitive than traditional techniques based on molecular descriptors. Chemotype-based algorithms retrieve a larger share of the chemotypes contained in a library when picking a subset of the chemicals in a collection.
Conclusions: Chemotype analysis offers an alternative for the generation of a general-purpose screening library, as it maximizes the number of chemotypes present in a subset with the smallest number of compounds. Applications of methods based on chemotype analysis that do not resort to molecular descriptors are a very promising but seldom explored area of chemoinformatics.
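The idea of scoring a subset by its chemotype count can be illustrated with a minimal sketch. This is not the algorithm evaluated in this work: it assumes each compound already carries a chemotype label (for example, a scaffold string produced by some perception step), and the names `chemotype_coverage` and `select_by_chemotype` are hypothetical. The selection simply cycles over chemotypes so that every chemotype is represented once before any is represented twice.

```python
from itertools import zip_longest


def chemotype_coverage(subset, chemotype_of):
    """Return the number of distinct chemotypes present in a subset."""
    return len({chemotype_of[c] for c in subset})


def select_by_chemotype(compounds, chemotype_of, n):
    """Pick up to n compounds by cycling round-robin over chemotypes.

    compounds    : iterable of compound identifiers (assumed not None)
    chemotype_of : dict mapping compound identifier -> chemotype label
                   (labels are assumed to be precomputed elsewhere)
    """
    # Group compounds by chemotype, keeping input order within each group.
    groups = {}
    for c in compounds:
        groups.setdefault(chemotype_of[c], []).append(c)

    # Layer k holds the k-th member of every chemotype; exhausted
    # groups contribute None, which is filtered out.
    order = [
        c
        for layer in zip_longest(*groups.values())
        for c in layer
        if c is not None
    ]
    return order[:n]


# Toy library: three compounds share chemotype "A".
library = {"c1": "A", "c2": "A", "c3": "A", "c4": "B", "c5": "C"}
picked = select_by_chemotype(list(library), library, 3)
naive = list(library)[:3]
print(chemotype_coverage(picked, library), chemotype_coverage(naive, library))
```

In this toy library, the round-robin pick of three compounds covers all three chemotypes, whereas taking the first three compounds covers only one. The same coverage count could be used to compare subsets produced by any other selection method.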