Evolutionary Computation and QSAR Research
Juan R. Rabunal,
Cristian R. Munteanu.
Successful high-throughput screening of molecule libraries for a specific biological property is one of the
main advances in drug discovery. Virtual molecular filtering and screening rely heavily on quantitative
structure-activity relationship (QSAR) analysis, which builds a mathematical model that correlates the activity of a molecule with
its molecular descriptors. QSAR models have the potential to reduce the costly failure of drug candidates in advanced
(clinical) stages by filtering combinatorial libraries, eliminating candidates with a predicted toxic effect and poor
pharmacokinetic profiles, and reducing the number of experiments. To obtain a predictive and reliable QSAR model,
scientists use methods from various fields such as molecular modeling, pattern recognition, machine learning or artificial
intelligence. QSAR modeling relies on three main steps: molecular structure codification into molecular descriptors,
selection of relevant variables in the context of the analyzed activity, and search for the optimal mathematical model that
correlates the molecular descriptors with a specific activity. Since a variety of techniques from statistics and artificial
intelligence can aid the variable selection and model-building steps, this review focuses on the evolutionary computation
methods supporting these tasks. Thus, this review explains the basics of genetic algorithms and genetic programming as
evolutionary computation approaches, the selection methods for high-dimensional data in QSAR, the methods to build
QSAR models, the current evolutionary feature selection methods and applications in QSAR, and the future trend toward
joint or multi-task feature selection methods.
Keywords: Evolutionary computation, feature extraction, genetic algorithms, genetic programming, molecular descriptors,
quantitative structure-activity relationships, QSAR, variable selection.
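As a rough illustration of the variable selection step described in the abstract, the following is a minimal sketch of a genetic algorithm that evolves binary masks over molecular descriptors. It is not the method of any specific study covered by the review. Everything in it is an assumption made for illustration: the synthetic data set (make_toy_data), the filter-style fitness (sum of absolute Pearson correlations with the activity minus a per-descriptor penalty), and all parameter values. A real QSAR workflow would more likely score each mask by the cross-validated performance of a fitted model.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def make_toy_data(n_molecules=200, n_descriptors=8, seed=0):
    """Hypothetical data: activity depends only on descriptors 0 and 3."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_descriptors)]
         for _ in range(n_molecules)]
    y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0, 0.1) for row in X]
    return X, y


def fitness(mask, relevance, penalty=0.3):
    """Toy fitness (an assumption): reward relevance, penalize mask size."""
    return sum(r - penalty for bit, r in zip(mask, relevance) if bit)


def ga_select(X, y, pop_size=30, generations=40, p_mut=0.05, seed=1):
    """Evolve a 0/1 descriptor mask; returns (best_mask, best_score)."""
    rng = random.Random(seed)
    n = len(X[0])
    # Precompute |r| between each descriptor column and the activity.
    relevance = [abs(pearson([row[j] for row in X], y)) for j in range(n)]

    def score(mask):
        return fitness(mask, relevance)

    # Random initial population of bit masks.
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        new_pop = [max(pop, key=score)[:]]  # elitism: keep the best mask
        while len(new_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            a = max(rng.sample(pop, 3), key=score)
            b = max(rng.sample(pop, 3), key=score)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < p_mut else bit
                     for bit in child]
            new_pop.append(child)
        pop = new_pop
    best = max(pop, key=score)
    return best, score(best)


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, best_score = ga_select(X, y)
    print("selected descriptor indices:",
          [j for j, bit in enumerate(mask) if bit])
    print("fitness:", round(best_score, 3))
```

The bit-mask encoding, tournament selection, one-point crossover and bit-flip mutation are common textbook choices for this kind of search. Swapping the fitness function for a model-based score would turn this filter-style sketch into a wrapper-style selector without changing the evolutionary loop.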