Introduction: Quantitative structure-property relationship (QSPR) modelling is one of the major
computational tools used to correlate molecular characteristics with the physicochemical properties of
molecules. In the present work, QSPR models are built from solubility data of Paclitaxel prodrugs, with
descriptors selected using the Akaike information criterion (AIC) and the variance inflation factor
(VIF), a multicollinearity indicator. The geometries of these Paclitaxel prodrugs were optimized at the
PM6 and AM1 levels using Gaussian software.
Methods: Four descriptor groups (2D Autocorrelation, CATS_3D, WHIM, and GETAWAY) provided
initial QSPR models with moderate accuracy for both optimized-geometry datasets. Descriptors
from two descriptor groups that showed reasonable correlation (Q2) were combined
to form improved models. Descriptor selection was performed in multiple steps to determine the
optimal models, which contain five and four descriptors for the PM6 and AM1 optimized-geometry datasets,
respectively. The R2 and Q2 values are 0.86 and 0.83 for the PM6 geometries and 0.87 and 0.86 for the AM1 geometries.
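The quantities named above (VIF, AIC, R2, and Q2) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes ordinary least-squares models, the common Gaussian form AIC = n ln(RSS/n) + 2k, leave-one-out cross-validation for Q2, and a hypothetical VIF cutoff of 5. All function names and the synthetic data are illustrative only.

```python
import numpy as np

def ols_fit(X, y):
    """Least-squares fit with intercept; returns coefficients and residual sum of squares."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return beta, float(resid @ resid)

def r_squared(X, y):
    """Coefficient of determination R2 of the fitted model."""
    _, rss = ols_fit(X, y)
    tss = float(((y - y.mean()) ** 2).sum())
    return 1.0 - rss / tss

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing descriptor j on the others."""
    values = []
    for j in range(X.shape[1]):
        r2 = r_squared(np.delete(X, j, axis=1), X[:, j])
        values.append(np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return np.array(values)

def vif_filter(X, threshold=5.0):
    """Drop the descriptor with the largest VIF until all are at or below the
    threshold (the cutoff of 5 is an assumption, not taken from the paper)."""
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        v = vif(X[:, keep])
        worst = int(np.argmax(v))
        if v[worst] <= threshold:
            break
        keep.pop(worst)
    return keep

def aic(X, y):
    """Gaussian AIC up to an additive constant: n*ln(RSS/n) + 2k, k = fitted parameters."""
    n = len(y)
    _, rss = ols_fit(X, y)
    return float(n * np.log(rss / n) + 2 * (X.shape[1] + 1))

def q2_loo(X, y):
    """Leave-one-out Q2 = 1 - PRESS / TSS (TSS taken about the full-sample mean)."""
    n = len(y)
    press = 0.0
    for i in range(n):
        mask = np.arange(n) != i
        beta, _ = ols_fit(X[mask], y[mask])
        press += float((y[i] - (beta[0] + X[i] @ beta[1:])) ** 2)
    return 1.0 - press / float(((y - y.mean()) ** 2).sum())

# Synthetic demonstration data (not the Paclitaxel prodrug dataset).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3))
X_demo = np.column_stack([X_demo, X_demo[:, 0] + 0.01 * rng.normal(size=40)])  # collinear copy
y_demo = 2.0 * X_demo[:, 0] - X_demo[:, 1] + 0.1 * rng.normal(size=40)

kept = vif_filter(X_demo)
print("descriptors kept:", kept)
print("AIC:", aic(X_demo[:, kept], y_demo))
print("R2:", r_squared(X_demo[:, kept], y_demo), "Q2:", q2_loo(X_demo[:, kept], y_demo))
```

In a workflow of this kind, candidate descriptor subsets that pass the VIF filter would be compared by AIC (lower is better), and the retained model would be reported with both R2 and Q2.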
Results: The resulting models give results comparable to those reported earlier. The proposed
protocol was also applied to the small Huuskonen dataset, where the final QSPR model contains only two
descriptors. On this smaller dataset, the QSPR model gives R2 and Q2 values of 0.87 and 0.85, respectively,
which are comparable to those obtained for the Paclitaxel prodrugs.
Conclusion: Our approach is applicable to different datasets and can assist the synthesis of molecules
with better solubility. These QSPR models can be used to predict the aqueous
solubility of unknown Paclitaxel prodrugs.