Identifying Ligand-receptor Interactions via an Integrated Fuzzy Model

Author(s): Chang Xu, Yijie Ding, Limin Jiang, Cong Shen, Gaoyan Zhang*, Xuyao Yu*

Journal Name: Current Proteomics

Volume 17, Issue 4, 2020

Background: Ligand-receptor interactions play an important role in the signal transduction required for cellular differentiation, proliferation, and immune response. Analyzing these interactions provides a deeper understanding of cellular proliferation, differentiation, and other cellular processes.

Methods: Computational techniques can advance research on ligand-receptor interactions in proteomics. In this paper, we propose a novel machine learning method that predicts ligand-receptor interactions from amino acid sequences. We extract features from ligand and receptor sequences using the Histogram of Oriented Gradients (HOG) and the Discrete Cosine Transform (DCT). These features are then clustered with the Fuzzy C-Means (FCM) algorithm, which yields multiple training subsets, and one sub-classifier is trained on each subset. To predict whether a ligand-receptor pair interacts, we select the optimal sub-classifier according to the similarity between the sample and the training subsets.
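The clustering and routing steps described in the Methods can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the HOG and DCT feature vectors have already been computed, it uses the standard FCM update rules with Euclidean distance and fuzzifier m = 2, and it assumes that "similarity" means the FCM membership of a sample to each cluster center. The function names (`fuzzy_c_means`, `split_training_subsets`, `select_subclassifier`) are hypothetical.

```python
import math
import random


def memberships(x, centers, m=2.0):
    """FCM membership of one sample to every cluster center (values sum to 1)."""
    exponent = 2.0 / (m - 1.0)
    # Guard against a zero distance when a sample coincides with a center.
    inv = [max(math.dist(x, c), 1e-12) ** -exponent for c in centers]
    total = sum(inv)
    return [v / total for v in inv]


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, membership matrix)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Each center is the mean of all samples weighted by membership**m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )
        new_u = [memberships(p, centers, m) for p in points]
        shift = max(abs(new_u[i][k] - u[i][k])
                    for i in range(n) for k in range(n_clusters))
        u = new_u
        if shift < tol:
            break
    return centers, u


def split_training_subsets(u):
    """Assumption: each sample joins the subset of its highest-membership cluster."""
    subsets = [[] for _ in u[0]]
    for i, row in enumerate(u):
        subsets[row.index(max(row))].append(i)
    return subsets


def select_subclassifier(x, centers, m=2.0):
    """Index of the sub-classifier whose training subset is most similar to x."""
    mu = memberships(x, centers, m)
    return mu.index(max(mu))


# Toy 2-D feature vectors standing in for real HOG/DCT features.
points = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (0.4, 0.4),
          (10.0, 10.0), (10.5, 9.8), (9.7, 10.3), (10.2, 10.1)]
centers, u = fuzzy_c_means(points, n_clusters=2)
subsets = split_training_subsets(u)
chosen = select_subclassifier((0.2, 0.1), centers)
```

In a full pipeline, one classifier (for example a support vector machine, which the keywords mention) would be trained on the samples indexed by each entry of `subsets`, and `chosen` would pick which of those classifiers scores a new ligand-receptor pair.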

Observations: To verify the performance of the method, we perform five-fold cross-validation on a ligand-receptor interaction dataset and achieve 80.08% accuracy, 82.98% sensitivity, and 80.02% specificity. We then test our feature extraction method on two Protein-Protein Interaction (PPI) datasets and achieve accuracies of 93.79% and 87.46%, respectively.
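For readers unfamiliar with the reported metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from binary predictions, and how a five-fold split partitions a dataset. It assumes the usual definitions (label 1 for an interacting pair, 0 for a non-interacting pair) and a simple shuffled split; the abstract does not state how the folds were drawn, and the function names are hypothetical.

```python
import random


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


# Toy labels and predictions, for illustration only.
metrics = binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
splits = list(k_fold_indices(10, k=5))
```

In a cross-validation run, a model is trained on each `train` index set, evaluated on the matching `test` set with `binary_metrics`, and the per-fold values are averaged.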

Conclusion: Our proposed method can be a useful tool for identifying ligand-receptor interactions. Related data sets and source code are available at git.

Keywords: Ligand-receptor interactions, feature extraction, substitution matrix representation, discrete cosine transform, support vector machine, source code.


Article Details

Year: 2020
Page: [287 - 301]
Pages: 15
DOI: 10.2174/1570164616666190306151423