A Survey of Network Representation Learning Methods for Link Prediction in Biological Network

Author(s): Jiajie Peng, Guilin Lu, Xuequn Shang*

Journal Name: Current Pharmaceutical Design

Volume 26, Issue 26, 2020

Background: Networks are powerful resources for describing complex systems. Link prediction, the task of inferring missing or future edges from the observed network, is a central problem in network analysis with considerable practical value. Network representation learning has proven useful for network analysis, especially for link prediction tasks.

Objective: To review the application of network representation learning to link prediction in biological networks. We summarize recent link prediction methods for biological networks and discuss the application and significance of network representation learning in the link prediction task.

Method & Results: We first introduce widely used link prediction algorithms, then briefly trace the development of network representation learning methods, focusing on a few widely used methods and their application to link prediction in biological networks. Existing studies demonstrate that using network representation learning to predict links in biological networks can improve prediction performance. Finally, we discuss some possible future directions.

Keywords: Biological network, link prediction, network analysis, network representation learning, algorithms, development.

Article Details

Year: 2020
Published on: 11 August, 2020
Page: [3076 - 3084]
Pages: 9
DOI: 10.2174/1381612826666200116145057
