Introduction: Quantitative structure-property relationship (QSPR) modelling is one of the major computational tools for correlating molecular characteristics with the physicochemical properties of molecules. In the present work, QSPR models are built from solubility data for Paclitaxel prodrugs, with descriptors selected using the Akaike information criterion (AIC) and the variance inflation factor (VIF), a multicollinearity indicator. The geometries of these Paclitaxel prodrugs were optimized at the PM6 and AM1 levels using Gaussian software.
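For reference, the two selection criteria named above have the following standard forms. This is a sketch under assumptions: the abstract does not say which AIC variant was used, so the common least-squares form is shown, and the symbols n, k, RSS and R_j^2 are introduced here for illustration only.

```latex
\mathrm{AIC} = n \ln\!\left(\frac{\mathrm{RSS}}{n}\right) + 2k,
\qquad
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}
```

Here n is the number of compounds, k the number of fitted parameters, RSS the residual sum of squares of the regression, and R_j^2 the coefficient of determination obtained when descriptor j is regressed on the remaining descriptors. A lower AIC favours a model, and a VIF close to 1 indicates little collinearity.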
Methods: Four descriptor groups (2D Autocorrelation, CATS_3D, WHIM and GETAWAY) gave initial QSPR models with moderate accuracy for both optimized-geometry datasets. Descriptors from the two descriptor groups that showed reasonable correlation (Q2) were combined to form improved models. Descriptor selection was performed in multiple steps to arrive at optimal models containing five and four descriptors for the PM6 and AM1 optimized-geometry datasets, respectively. The R2 and Q2 values are 0.86 and 0.83 for the PM6 geometries and 0.87 and 0.86 for the AM1 geometries.
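A multi-step selection of this kind might be sketched as follows. This is an illustrative sketch, not the authors' protocol: it assumes a multiple linear regression model, a hypothetical VIF cut-off of 5, backward elimination by AIC, and leave-one-out cross-validation for Q2. The function names and the synthetic data are invented for the example and do not come from the Paclitaxel dataset.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with intercept; returns coefficients and RSS."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, float(resid @ resid)

def r2(X, y):
    """Coefficient of determination of the fitted model."""
    _, rss = fit_ols(X, y)
    return 1.0 - rss / float(((y - y.mean()) ** 2).sum())

def vif(X):
    """VIF of each column: 1 / (1 - R_j^2), regressing column j on the others."""
    out = []
    for j in range(X.shape[1]):
        r2_j = r2(np.delete(X, j, axis=1), X[:, j])
        out.append(1.0 / max(1.0 - r2_j, 1e-12))  # guard against exact collinearity
    return np.array(out)

def aic(X, y):
    """Least-squares AIC: n * ln(RSS / n) + 2k, with k counting the intercept."""
    n = len(y)
    _, rss = fit_ols(X, y)
    return n * np.log(rss / n) + 2 * (X.shape[1] + 1)

def q2_loo(X, y):
    """Leave-one-out Q2 = 1 - PRESS / TSS (TSS taken about the overall mean)."""
    n, press = len(y), 0.0
    for i in range(n):
        mask = np.arange(n) != i
        coef, _ = fit_ols(X[mask], y[mask])
        press += (y[i] - (coef[0] + X[i] @ coef[1:])) ** 2
    return 1.0 - press / float(((y - y.mean()) ** 2).sum())

def select_descriptors(X, y, names, vif_max=5.0):
    """Drop collinear descriptors by VIF, then prune by AIC (assumed cut-off)."""
    keep = list(range(X.shape[1]))
    # Step 1: remove the most collinear descriptor until all VIFs pass.
    while len(keep) > 1:
        v = vif(X[:, keep])
        worst = int(np.argmax(v))
        if v[worst] <= vif_max:
            break
        keep.pop(worst)
    # Step 2: backward elimination, dropping a descriptor while AIC decreases.
    while len(keep) > 1:
        base = aic(X[:, keep], y)
        trials = [(aic(X[:, [c for c in keep if c != d]], y), d) for d in keep]
        best_aic, drop = min(trials)
        if best_aic >= base:
            break
        keep.remove(drop)
    return [names[i] for i in keep]

# Synthetic demonstration data: d2 is almost a copy of d0, d3 is pure noise.
rng = np.random.default_rng(0)
n = 40
x0, x1, x3 = rng.normal(size=(3, n))
x2 = x0 + 0.01 * rng.normal(size=n)
X = np.column_stack([x0, x1, x2, x3])
y = 2.0 * x0 - 1.0 * x1 + 0.1 * rng.normal(size=n)
names = ["d0", "d1", "d2", "d3"]

selected = select_descriptors(X, y, names)
cols = [names.index(s) for s in selected]
print(selected, round(r2(X[:, cols], y), 3), round(q2_loo(X[:, cols], y), 3))
```

In this sketch the VIF step removes one of the two near-duplicate descriptors before any model comparison, so the AIC step only has to weigh descriptors that carry distinct information. For a least-squares model Q2 computed this way cannot exceed R2, which matches the pattern of the values reported above.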
Results: The resulting models give results comparable with those reported earlier. The proposed protocol was also applied to the small Huuskonen dataset, where the final QSPR model contains only two descriptors. On this smaller dataset, the QSPR model gives R2 and Q2 values of 0.87 and 0.85, respectively, which are comparable to the results for the Paclitaxel prodrugs.
Conclusion: Our approach can be applied to different datasets and can assist the synthesis of molecules with better solubility. These QSPR models can be used to predict the aqueous solubility of Paclitaxel prodrugs whose solubility has not yet been measured.