Machine Learning in Quantitative Protein–peptide Affinity Prediction: Implications for Therapeutic Peptide Design

Author(s): Zhongyan Li, Qingqing Miao, Fugang Yan, Yang Meng, Peng Zhou*.

Journal Name: Current Drug Metabolism

Volume 20, Issue 3, 2019

Background: Protein–peptide recognition plays an essential role in the orchestration and regulation of cell signaling networks. It is estimated to account for up to 40% of biological interaction events in the human interactome and has recently been recognized as a new and attractive class of druggable targets for drug development and disease intervention.

Methods: We present a systematic review of the application of machine learning techniques to the quantitative modeling and prediction of protein–peptide binding affinity, focusing in particular on the implications for therapeutic peptide design. We also briefly introduce the physical quantities used to characterize protein–peptide affinity and attempt to extend the content of generalized machine learning methods.

Results: Existing issues and future perspectives on the statistical modeling and regression prediction of protein–peptide binding affinity are discussed.

Conclusion: There is still a long way to go before the establishment of general, reliable and efficient machine learning-based protein–peptide affinity predictors.

Keywords: Protein-peptide affinity, therapeutic peptide design, machine learning, statistical regression, druggable target, molecular recognition, computational peptidology.
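
The abstract refers to physical quantities that characterize protein–peptide affinity without naming them. As a minimal sketch, assuming the quantities of interest are the dissociation constant Kd and the standard binding free energy ΔG (an assumption, not stated in the abstract), the standard thermodynamic relation ΔG = RT ln(Kd / 1 M) can be used to put measured affinities on a common scale before regression modeling. The function names below are hypothetical and chosen only for illustration.

```python
import math

# Gas constant in kcal/(mol*K); the unit choice is an assumption for illustration.
GAS_CONSTANT_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = R * T * ln(Kd / c0) with the standard concentration c0 = 1 M,
    so kd_molar must be expressed in mol/L.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


def pkd(kd_molar):
    """Negative base-10 logarithm of Kd, a common regression target."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)
```

Either value could serve as the response variable in a regression model; tighter binders have a more negative ΔG and a larger pKd.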

Article Details

Year: 2019
Page: [170 - 176]
Pages: 7
DOI: 10.2174/1389200219666181012151944