This work reports a detailed study of the ability of linear and non-linear classification
methods to estimate the estrogenic activities of a series of 55 natural estrogen-like isoflavonoid and
diphenolic compounds. To this end, we examined linear discriminant analysis (LDA) and
nonlinear support vector machines (SVMs), each combined with feature selection algorithms. The
structural characteristics of each of the studied compounds were calculated from the optimized
molecular geometries. Both the LDA and SVM models contain four descriptors; however, the SVM model (total
accuracy 89.1%) was found to be superior to the LDA model (total accuracy 80.0%). The analysis of molecular
descriptors within our models provided insight into the estrogenic mechanisms of
natural phytoestrogens. Furthermore, the derived models can be applied to the future screening of other
natural estrogen-like compounds.
Keywords: Classification, QSAR, linear discriminant analysis, isoflavonoids and diphenolics, support vector machines.
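
As a worked illustration of the total accuracies quoted above, the short Python sketch below converts each reported percentage into the number of correctly classified compounds it would imply. It assumes that both figures were computed over the full set of 55 compounds, which the abstract does not state explicitly; the function names `total_accuracy` and `implied_correct` are hypothetical helpers introduced only for this example.

```python
# Assumption: the reported total accuracies refer to all 55 compounds.
N_COMPOUNDS = 55


def total_accuracy(n_correct: int, n_total: int) -> float:
    """Return the percentage of correctly classified compounds."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_correct / n_total


def implied_correct(accuracy_pct: float, n_total: int) -> int:
    """Return the whole number of correct classifications closest to a reported percentage."""
    return round(accuracy_pct * n_total / 100.0)


# Reported total accuracies from the abstract.
svm_correct = implied_correct(89.1, N_COMPOUNDS)
lda_correct = implied_correct(80.0, N_COMPOUNDS)

print(f"SVM: {svm_correct}/{N_COMPOUNDS} correct")
print(f"LDA: {lda_correct}/{N_COMPOUNDS} correct")
print(f"Difference: {svm_correct - lda_correct} compounds")
```

Under that assumption, the gap between the two models amounts to a handful of compounds, which is worth keeping in mind when comparing percentages on a data set of this size.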