Epidermal Growth Factor Receptor (EGFR) is a high-priority target in anticancer drug research. Thousands of very effective EGFR inhibitors have been developed in the last decade. The known inhibitors originate from a very diverse chemical space but, without exception, all of them act at the Adenosine TriPhosphate (ATP) binding site of the enzyme. We collected all of the diverse inhibitor structures together with the relevant biological data obtained from comparable assays and built a prediction-oriented Quantitative Structure-Activity Relationship (QSAR) model that describes the interactive surface of the ATP binding pocket from the ligand side. We describe a QSAR method that combines automatic Variable Subset Selection (VSS) by Genetic Algorithm (GA) with goodness-of-prediction-driven model building, resulting in an externally validated EGFR inhibition model built from the pIC50 values of a structurally diverse set of 623 EGFR inhibitors. Repeated Trainings/Evaluations (RTE) were used to obtain model fitness values, and the effectiveness of VSS was amplified by using predictive ability scores of the descriptors. Numerous models were generated by different methods, and the viable ones were collected. Intensive RTE was then applied to identify the final candidate models for external validation. Finally, suitable models were validated by statistical tests. Because the models use only calculated molecular descriptors, they are suitable for virtual screening to identify novel potential EGFR inhibitors.
Keywords: EGFR inhibitors, Prediction oriented QSAR, Diverse compound set, External validation, Repeated Trainings/Evaluations, Virtual screening
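As an illustration of the workflow summarized in the abstract, the following is a minimal sketch of GA-based variable subset selection in which each candidate descriptor subset is scored by Repeated Trainings/Evaluations. It is not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`fit_linear`, `rte_fitness`, `ga_select`, `make_data`), the use of ordinary least squares as the learner, the predictive R^2 on random hold-out splits as the fitness value, the GA settings, and the synthetic data standing in for descriptors and pIC50 values. The descriptor predictive ability scores mentioned in the abstract are not modeled here.

```python
import random

def fit_linear(rows, targets):
    """Least squares with intercept via normal equations (assumed learner)."""
    A = [[1.0] + list(r) for r in rows]
    n = len(A[0])
    # Augmented matrix [A^T A | A^T y]; a tiny ridge term avoids singularity.
    M = [[sum(a[i] * a[j] for a in A) + (1e-9 if i == j else 0.0)
          for j in range(n)]
         + [sum(a[i] * t for a, t in zip(A, targets))] for i in range(n)]
    for col in range(n):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def predict(coefs, row):
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))

def rte_fitness(X, y, subset, n_repeats=20, test_fraction=0.3, seed=0):
    """Mean predictive R^2 over repeated random train/test splits."""
    rng = random.Random(seed)  # same splits for every subset: fair comparison
    idx = list(range(len(y)))
    n_test = max(1, int(len(y) * test_fraction))
    scores = []
    for _ in range(n_repeats):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        coefs = fit_linear([[X[i][j] for j in subset] for i in train],
                           [y[i] for i in train])
        mean_train = sum(y[i] for i in train) / len(train)
        press = sum((y[i] - predict(coefs, [X[i][j] for j in subset])) ** 2
                    for i in test)
        ss = sum((y[i] - mean_train) ** 2 for i in test)
        scores.append(1.0 - press / ss)
    return sum(scores) / len(scores)

def ga_select(X, y, n_generations=20, pop_size=20, seed=0):
    """Toy GA over descriptor bitmasks (needs at least two descriptors)."""
    rng = random.Random(seed)
    n_desc = len(X[0])

    def random_mask():
        mask = [rng.random() < 0.5 for _ in range(n_desc)]
        if not any(mask):
            mask[rng.randrange(n_desc)] = True
        return mask

    def fitness(mask):
        subset = [j for j, on in enumerate(mask) if on]
        return rte_fitness(X, y, subset, n_repeats=5)

    pop = [random_mask() for _ in range(pop_size)]
    for _ in range(n_generations):
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_desc)      # one-point crossover
            child = a[:cut] + b[cut:]
            k = rng.randrange(n_desc)
            if rng.random() < 0.2:              # single-bit mutation
                child[k] = not child[k]
            if not any(child):                  # never allow an empty subset
                child[k] = True
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return [j for j, on in enumerate(best) if on]

def make_data(n=80, seed=1):
    """Synthetic stand-in: the response depends on descriptors 0 and 2 only."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(n)]
    y = [6.0 + 1.5 * r[0] - 1.0 * r[2] + rng.gauss(0, 0.1) for r in X]
    return X, y

if __name__ == "__main__":
    X, y = make_data()
    print("selected descriptor indices:", ga_select(X, y))
```

In a setting like the one the abstract describes, the external validation compounds would be set aside before any of these steps, so that neither the GA nor the RTE scoring ever sees them.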