N-Linear Algebraic Maps for Chemical Structure Codification: A Suitable Generalization for Atom-pair Approaches?
Cesar R. Garcia-Jacas,
Stephen J. Barigye,
Jose R. Valdes-Martini,
Oscar M. Rivera-Borroto,
The present manuscript introduces, for the first time, an alignment-free 3D-QSAR method (QuBiLS-MIDAS) based on
tensor concepts, using the three-linear and four-linear algebraic forms as specific cases of n-linear maps. To this end, the kth
three-tuple and four-tuple spatial-(dis)similarity matrices are defined as tensors of order 3 and 4, respectively, to represent 3D information
among three and four atoms of the molecular structures. Several measures (multi-metrics) for establishing (dis)similarity relations
among three and four atoms are discussed, as are normalization schemes for the n-tuple spatial-(dis)similarity
matrices based on the simple-stochastic and mutual probability algebraic transformations. To account for specific interactions among atoms,
n-tuple path and length cut-off constraints are introduced for both the global and local indices. This algebraic scaffold can also be seen as
a generalization of the vector-matrix-vector multiplication procedure (which is a matrix representation of the traditional linear, quadratic,
and bilinear forms) for the calculation of molecular descriptors, and is thus a new theoretical approach with a methodological contribution.
A variability analysis based on Shannon’s entropy reveals that the best distributions are achieved with the ternary and quaternary
measures corresponding to the bond and dihedral angles. In addition, the proposed indices show better entropy behavior than the descriptors
calculated by other programs used in chemo-informatics studies, such as DRAGON, PADEL, and Mold2. A principal
component analysis shows that the novel 3D n-tuple indices codify the information captured by the DRAGON 3D-indices as well
as information not codified by the latter. To further assess the contribution of the novel molecular parameters, a QSAR study
was performed on the binding affinity to the corticosteroid-binding globulin, using Cramer’s steroid database. The results reveal
superior statistical parameters for the Bond Angle and Dihedral Angle approaches, consistent with the results of the variability
analysis. Finally, the QuBiLS-MIDAS models outperform all 3D-QSAR methods reported in the literature
that use the 31 steroids as the training set; for the popular division of Cramer’s database into training (1-21) and test (22-31) sets, they yield comparable
to superior results in predicting the activity of the steroids. These results suggest that the proposed
QuBiLS-MIDAS N-tuples indices are a useful tool for chemo-informatics studies.
Keywords: 3D three-linear and four-linear indices, aggregation operator, Cramer’s steroid, N-tuple simple stochastic and mutual probability
matrices, N-tuple spatial-(dis)similarity matrix, principal component analysis, QuBiLS-MIDAS N-tuples, QSAR, Shannon entropy, TOMOCOMD-CARDD,
variability analysis.
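
The step from the vector-matrix-vector product to a three-linear form can be illustrated with a minimal Python sketch. It is not the QuBiLS-MIDAS implementation: the function names, the choice of the bond angle as the only ternary measure, the zero entries for repeated atom indices, and the atomic weight values are all assumptions made for illustration, and no normalization or cut-off constraint is applied.

```python
import math
from itertools import product


def bond_angle(a, b, c):
    """Angle in radians at vertex b formed by the points a-b-c."""
    u = [p - q for p, q in zip(a, b)]
    v = [p - q for p, q in zip(c, b)]
    norm_u = math.sqrt(sum(p * p for p in u))
    norm_v = math.sqrt(sum(p * p for p in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    cos_theta = sum(p * q for p, q in zip(u, v)) / (norm_u * norm_v)
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def three_tuple_tensor(coords):
    """Order-3 tensor whose (i, j, k) entry is the angle i-j-k.

    Assumption: entries with repeated atom indices are set to zero.
    """
    n = len(coords)
    tensor = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i, j, k in product(range(n), repeat=3):
        if len({i, j, k}) == 3:
            tensor[i][j][k] = bond_angle(coords[i], coords[j], coords[k])
    return tensor


def bilinear_form(matrix, x, y):
    """Classical vector-matrix-vector product: sum_ij x_i M_ij y_j."""
    n = len(x)
    return sum(x[i] * matrix[i][j] * y[j]
               for i, j in product(range(n), repeat=2))


def trilinear_form(tensor, x, y, z):
    """Three-linear generalization: sum_ijk T_ijk x_i y_j z_k."""
    n = len(x)
    return sum(tensor[i][j][k] * x[i] * y[j] * z[k]
               for i, j, k in product(range(n), repeat=3))


# Hypothetical three-atom geometry with a right angle at atom 1.
coords = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
# Hypothetical atomic property vector (illustrative values only).
weights = [12.0, 16.0, 1.0]

T = three_tuple_tensor(coords)
index_value = trilinear_form(T, weights, weights, weights)
```

Here `bilinear_form` is the two-index case that the abstract describes as the traditional procedure, and `trilinear_form` adds one more index and one more property vector; a four-linear form would extend the same pattern to an order-4 tensor.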