The Prediction of Human Intestinal Absorption Based on the Molecular Structure
J. Vicente de Julian-Ortiz, Riccardo Zanni, Maria Galvez-Llompart and Ramon Garcia-Domenech
Affiliation: Molecular Connectivity and Drug Design Research Unit, Department of Physical Chemistry, Faculty of Pharmacy, University of Valencia, Av. V. Andres Estelles 0, 46100 Burjassot, Valencia, Spain.
Keywords: Artificial neural networks, human intestinal absorption, molecular topology, pattern recognition, QSAR.
Human Intestinal Absorption (HIA) has been modeled many times by using classification models. However, regression models
are scarce. Here, Artificial Neural Networks (ANNs) are implemented for this purpose. A dataset of structurally diverse chemicals with
their respective experimental HIA values was used to design robust, truly predictive and widely applicable ANN models. The pool of input
variables was made up of structural invariants calculated by using either Dragon or our software Desmol 1. The selection of the best variables
was performed following three steps using the entire dataset of molecules. Firstly, variables poorly correlated with the experimental data
were eliminated. Secondly, input variable selection was performed by stepwise multilinear regression. Thirdly, the correlation matrix of the
selected variables was obtained to eliminate strongly intercorrelated variables. Backpropagation ANNs were then trained with
the finally selected variables as inputs and HIA as output. The training and selection procedure to find robust models consisted of randomly
partitioning the dataset into three sets: training set, with 50% of the population, test set with 25%, and validation set with the other
25%. For each partition, different numbers of hidden nodes were tested to optimize the predictive performance on the three
sets. Models with r² greater than 0.6 for all three sets were considered robust. A randomization test following all these steps was performed,
and the poor results obtained with it confirm the validity of the method presented in this paper to predict HIA for datasets of structurally
diverse organic compounds.
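
The partitioning and robustness criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of NumPy, the reading of r2 as the squared Pearson correlation between observed and predicted HIA, and the intercorrelation cutoff `max_abs_r` are all assumptions made for illustration, since the abstract does not specify them. The ANN itself is left out; any model that returns predictions could be plugged in.

```python
import numpy as np


def partition_indices(n_samples, rng):
    """Randomly split sample indices into 50% training, 25% test, 25% validation."""
    order = rng.permutation(n_samples)
    n_train = n_samples // 2
    n_test = n_samples // 4
    return {
        "training": order[:n_train],
        "test": order[n_train:n_train + n_test],
        "validation": order[n_train + n_test:],
    }


def r_squared(y_obs, y_pred):
    """Squared Pearson correlation (assumed definition of r2, not stated in the source)."""
    r = np.corrcoef(y_obs, y_pred)[0, 1]
    return float(r ** 2)


def is_robust(y_obs, y_pred, sets, threshold=0.6):
    """A model counts as robust only if r2 exceeds the threshold on all three sets."""
    return all(
        r_squared(y_obs[idx], y_pred[idx]) > threshold for idx in sets.values()
    )


def drop_intercorrelated(X, max_abs_r):
    """Greedy filter: keep a column only if |r| with every already kept column
    is at most max_abs_r (hypothetical cutoff; the source gives no value)."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= max_abs_r for k in kept):
            kept.append(j)
    return kept
```

In use, one would repeat `partition_indices` with different random seeds, train a network with a given number of hidden nodes on the training indices, and keep only the models for which `is_robust` returns true.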