The present work deals with the generation of virtual samples by mathematical modeling of empirical data.
The generated samples were used to develop a QSAR model. The method extrapolates the sample vectors in such a
manner that the distribution of the empirical data is conserved. Conservation of the data distribution was judged
with statistical parameters. The method was applied to the anticancer activity of
gossypol acetic acid against the BCL2 target for colorectal cancer. When only the virtual samples were used for model
development, training with 66 virtual samples gave a leave-one-out cross-validation regression coefficient of 0.996,
and the external test set (51 samples) gave a regression coefficient of 0.993. External test set data that were
never used in the virtual sample generation gave a predicted regression coefficient of >0.61. On the basis of the QSAR
model, nine compounds were suggested as anti-BCL2 active compounds. The suggested compounds were further
validated by a docking study with gossypol acetic acid and ‘Tetrahydroisoquinoline amide substituted phenyl pyrazole’ cocrystallized
with the chimeric BCL2-XL protein (PDB ID: 2W3L).
Keywords: BCL2, cancer, QSAR, SVR, virtual screening.
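
The idea of generating virtual samples while conserving the empirical data distribution can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not specify the extrapolation rule, so the sketch assumes a simple stand-in (Gaussian jitter around randomly chosen empirical vectors, scaled by each descriptor's standard deviation) and judges conservation by comparing per-descriptor mean and standard deviation. The function names, the `noise_scale` parameter, and the toy descriptor values are all hypothetical; only the count of 66 virtual samples comes from the text.

```python
import random
import statistics


def column_stats(samples):
    """Return (mean, standard deviation) for every descriptor column."""
    return [(statistics.mean(col), statistics.stdev(col)) for col in zip(*samples)]


def generate_virtual_samples(samples, n_virtual, noise_scale=0.1, seed=0):
    """Create virtual sample vectors by jittering empirical ones.

    Assumption: small Gaussian noise, proportional to each descriptor's
    empirical standard deviation, stands in for the extrapolation step.
    """
    rng = random.Random(seed)
    stdevs = [sd for _, sd in column_stats(samples)]
    virtual = []
    for _ in range(n_virtual):
        base = rng.choice(samples)  # pick one empirical vector as the starting point
        virtual.append(
            [x + rng.gauss(0.0, noise_scale * sd) for x, sd in zip(base, stdevs)]
        )
    return virtual


# Toy empirical data (hypothetical descriptor values, not from the study).
empirical = [
    [1.2, 0.50, 310.0],
    [1.9, 0.42, 295.5],
    [0.8, 0.61, 330.2],
    [1.5, 0.55, 301.7],
    [2.1, 0.39, 288.9],
    [1.1, 0.58, 322.4],
]

virtual = generate_virtual_samples(empirical, n_virtual=66, seed=1)

# Compare the statistical parameters of the empirical and virtual sets.
for (e_mean, e_sd), (v_mean, v_sd) in zip(column_stats(empirical), column_stats(virtual)):
    print(f"empirical mean={e_mean:.3f} sd={e_sd:.3f} | virtual mean={v_mean:.3f} sd={v_sd:.3f}")
```

A small `noise_scale` keeps the virtual vectors close to the empirical ones, so the summary statistics stay similar. A real study would likely use a richer check of the distribution than mean and standard deviation alone.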