
Current Bioinformatics


ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

An Empirical Study of Features Fusion Techniques for Protein-Protein Interaction Prediction

Author(s): Jiancang Zeng, Dapeng Li, Yunfeng Wu, Quan Zou and Xiangrong Liu*

Volume 11, Issue 1, 2016

Page: [4 - 12] Pages: 9

DOI: 10.2174/1574893611666151119221435


Abstract

With the recent development of bioinformatics, the importance of understanding protein function has been widely acknowledged. Most proteins perform their functions by interacting with other proteins, so exploring protein-protein interactions (PPIs) is an urgent task. At present, the prediction of PPIs is still a tough problem. Although a variety of computational methods have been proposed to identify PPIs, most of them are complex and have low accuracy. Traditional methods extract features in two steps: first, they extract features from each of the two proteins of a PPI; second, they treat the two features as strings and apply the concatenation operator. Concatenation amounts to an addition operation on strings. The concatenation operator introduces redundant features, and its result depends on the order of concatenation. Motivated by this, in this paper we study features fusion and features selection. The presented framework consists of three stages. In the first stage, we obtain the negative data set from an off-the-shelf database. The reliability of the negative data sets of previous studies has not been of concern to us. In the second stage, the n-gram frequency method is used to preprocess the PPI sequences. In the third stage, the final feature is spliced together, and features selection is then applied to find the optimal features. Finally, an effective parameter for the Random Forest classifier is selected. Experiments carried out on a real data set showed that our features fusion method outperformed traditional methods in terms of protein-protein interaction prediction. These encouraging results can be helpful for future research on protein function. The web server for protein-protein interaction prediction is accessible at http://datamining.xmu.edu.cn/~zjcdm/Home.html.
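
The order dependence of concatenation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the n-gram length, the alphabet, or the fusion operator, so everything here is an assumption made for illustration: the 20-letter amino acid alphabet, the helper names `ngram_frequencies`, `concatenate`, and `symmetric_fusion`, the two made-up sequences, and in particular the element-wise sum and absolute difference, which stand in for an order-independent fusion and are not the authors' method.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acid letters form the alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def ngram_frequencies(sequence, n=1):
    """Return normalized n-gram frequencies in a fixed vocabulary order."""
    vocabulary = ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]
    counts = Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))
    total = sum(counts[g] for g in vocabulary)
    return [counts[g] / total if total else 0.0 for g in vocabulary]


def concatenate(a, b):
    """Traditional pairing: the result depends on which protein comes first."""
    return a + b


def symmetric_fusion(a, b):
    """Hypothetical order-independent pairing (not the paper's operator):
    element-wise sum followed by element-wise absolute difference."""
    sums = [x + y for x, y in zip(a, b)]
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return sums + diffs


# Made-up sequences, used only to exercise the functions.
protein_a = "MKTAYIAKQR"
protein_b = "GSHMLEDPVA"
fa = ngram_frequencies(protein_a, n=1)
fb = ngram_frequencies(protein_b, n=1)

order_matters = concatenate(fa, fb) != concatenate(fb, fa)
order_free = symmetric_fusion(fa, fb) == symmetric_fusion(fb, fa)
```

With concatenation, the pairs (A, B) and (B, A) yield different feature vectors for the same interaction, which is the order dependence the abstract points to; any symmetric operator removes that dependence by construction.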

Keywords: Features fusion, features selection, Random Forests, protein-protein interaction.

