Background: HIV-1 integrase (IN) is an important target for the development of
new anti-AIDS drugs. HIV-1 LEDGF/p75 inhibitors, which block the interaction between integrase and
LEDGF/p75, have been shown to reduce HIV-1 replicative capacity.
Methods: In this work, computational Quantitative Structure-Activity Relationship (QSAR) models
were developed for predicting the bioactivity of HIV-1 integrase LEDGF/p75 inhibitors. We collected
190 inhibitors with their bioactivities and grouped them into nine scaffolds
using t-distributed Stochastic Neighbor Embedding (t-SNE). These 190 inhibitors
were split into a training set and a test set either with a Kohonen self-organizing
map (SOM) or randomly. Multiple linear regression (MLR) models, support vector machine
(SVM) models and two consensus models were built on the training sets using 20 selected
CORINA Symphony descriptors.
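As a rough illustration of the modelling workflow described above, the following Python sketch fits an MLR model and an SVM regression model on a random training split and combines them into a consensus prediction. It is a minimal sketch under several assumptions: the data are synthetic stand-ins (not the 190 inhibitors or their CORINA Symphony descriptors), scikit-learn is used purely for convenience, the SVM hyperparameters are arbitrary, the SOM-based split is not reproduced, and the consensus rule (an unweighted mean of the two predictions) is assumed because the abstract does not state how the consensus models were formed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data: 190 "compounds" x 20 "descriptors".
# These are NOT the paper's inhibitors or descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(190, 20))
true_w = rng.normal(size=20)
y = 0.3 * (X @ true_w) + 6.0 + rng.normal(scale=0.3, size=190)  # pseudo pIC50

# Random split only; the SOM-based split is not reproduced here.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Multiple linear regression on the raw descriptors.
mlr = LinearRegression().fit(X_train, y_train)

# SVM regression on standardized descriptors (hyperparameters are arbitrary).
svm = make_pipeline(
    StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1)
).fit(X_train, y_train)


def consensus_predict(X_new):
    """Assumed consensus rule: unweighted mean of MLR and SVM predictions."""
    return (mlr.predict(X_new) + svm.predict(X_new)) / 2.0


y_pred = consensus_predict(X_test)
r_test = np.corrcoef(y_test, y_pred)[0, 1]  # Pearson r on the test split
print(f"consensus r on synthetic test set: {r_test:.3f}")
```

Averaging is only one possible way to combine models; weighted averages or stacking would be equally plausible readings of "consensus model".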
Results: All the models predicted pIC50 well, with correlation coefficients above 0.7
on the test set. Consensus Model C1, which performed
better than the other models, achieved a correlation coefficient (r) of 0.909 on the training set
and 0.804 on the test set.
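For reference, the correlation coefficient reported above can be computed as in the following Python sketch. It assumes that r denotes the Pearson correlation between observed and predicted pIC50 values, which the abstract does not state explicitly. The function name `pearson_r` and the example values are hypothetical and are not taken from the study.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


# Hypothetical observed and predicted pIC50 values, for illustration only.
observed = [5.2, 6.1, 6.8, 7.4, 8.0]
predicted = [5.5, 5.9, 7.0, 7.1, 8.3]
r = pearson_r(observed, predicted)
print(f"r = {r:.3f}")
```

Note that r measures linear association only; a model can show a high r while still being systematically biased, so error measures such as RMSE are often reported alongside it.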
Conclusion: The selected molecular descriptors show that hydrogen bond acceptors, atomic charges
and electronegativities (especially π atom) were important in predicting the activity of HIV-1 integrase