Background: The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made
it possible to develop targeted scoring functions for virtual screening aimed at identifying new inhibitors
of this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cell-cycle
progression and control. Such drugs have potential anticancer activity.
Objective: Our goal here is to review recent applications of machine learning methods to predict ligand-
binding affinity for protein targets. To assess the predictive performance of classical scoring
functions and targeted scoring functions, we focused our analysis on CDK2 structures.
Methods: We have experimental structural data for hundreds of binary complexes of CDK2 with
different ligands, many of them with inhibition constant information. We investigate here computational
methods to calculate binding affinity for CDK2 through classical scoring functions and machine learning.
Results: Analysis of the predictive performance of classical scoring functions available in docking
programs such as Molegro Virtual Docker, AutoDock4, and AutoDock Vina indicated that these
methods failed to predict binding affinity with significant correlation with experimental data. Targeted
scoring functions developed through supervised machine learning techniques showed a significant
correlation with experimental data.
Conclusion: Here, we described the application of supervised machine learning techniques to generate
a scoring function to predict binding affinity. Machine learning models showed superior predictive
performance when compared with classical scoring functions. Analysis of the computational models
obtained through machine learning could capture essential structural features responsible for binding
affinity against CDK2.
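
The idea of a targeted scoring function can be illustrated with a minimal sketch, which is not the method used in the reviewed studies. It assumes that each complex is described by a few energy terms taken from a classical scoring function and that a least-squares regression re-weights those terms against experimental affinity (for example, pKi). All data below are synthetic and generated only for illustration; the function names, the number of terms, and the weights are hypothetical.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_targeted_scoring_function(terms, affinities):
    """Least-squares fit of affinity = w0 + sum_i w_i * term_i."""
    rows = [[1.0] + list(t) for t in terms]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, affinities)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, term_row):
    """Apply the fitted weights to the energy terms of one complex."""
    return weights[0] + sum(w * t for w, t in zip(weights[1:], term_row))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic data set (NOT experimental): 60 hypothetical complexes,
# three hypothetical energy terms each, and an affinity generated from
# assumed weights plus Gaussian noise.
rng = random.Random(0)
true_weights = [6.0, -0.8, 0.3, 0.0]
terms = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(60)]
affinities = [predict(true_weights, t) + rng.gauss(0.0, 0.2) for t in terms]

# Fit on a training set and evaluate correlation on a held-out test set.
train_x, test_x = terms[:40], terms[40:]
train_y, test_y = affinities[:40], affinities[40:]
weights = fit_targeted_scoring_function(train_x, train_y)
r_test = pearson([predict(weights, t) for t in test_x], test_y)
print("fitted weights:", [round(w, 2) for w in weights])
print("test-set Pearson r:", round(r_test, 2))
```

Evaluating the correlation on complexes held out of the fit, rather than on the training set, is the design choice that matters here: it indicates whether the re-weighted terms generalize to unseen ligands of the same target.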