One can distinguish between two kinds of virtual combinatorial libraries: "viable" and "accessible". Viable libraries are relatively small in size, are assembled from readily available reagents that have been filtered by the medicinal chemist, and often have a physical counterpart. Conversely, accessible libraries can encompass millions or billions of structures, typically include all possible reagents that are in principle compatible with a particular reaction scheme, and can never be physically synthesized in their entirety. Although the analysis of viable virtual libraries is relatively straightforward, the handling of large accessible libraries requires methods that scale well with respect to library size. In this work, we present novel, efficient, and scalable techniques for the construction, analysis, and in silico screening of massive virtual combinatorial libraries.
Keywords: Virtual combinatorial libraries, combinatorial chemistry, high-throughput screening, compound selection, library design, molecular diversity, molecular similarity, QSAR, nonlinear mapping, multidimensional scaling