Protein & Peptide Letters

ISSN (Print): 0929-8665
ISSN (Online): 1875-5305

Prediction of Protein-protein Interactions Based on Feature Selection and Data Balancing

Author(s): Liang Liu, Wen-Cong Lu, Yu-Dong Cai, Kai-Yan Feng, Chunrong Peng and Yubei Zhu

Volume 20, Issue 3, 2013

Pages: 336-345 (10 pages)

DOI: 10.2174/0929866511320030012

Abstract

Computational approaches complement experimental methods by analyzing protein-protein interactions (PPIs) from a different angle, and they are efficient at determining whether two proteins can interact. In this paper, the K-nearest neighbors (KNN) algorithm is applied to predict PPIs, with each protein encoded by the physical and chemical properties of its residues, its predicted secondary structure, and its amino acid composition. Minimum-redundancy maximum-relevance (mRMR) feature selection is used to select a compact set of features considered important for determining PPI-ness. Because the negative dataset (non-interacting protein pairs) is much larger than the positive dataset (interacting protein pairs), the negative dataset is divided into 5 portions, and each portion is combined with the positive dataset for one prediction. Five predictions are thus performed, and the final result is obtained by voting. The prediction achieves an overall accuracy of 0.8369 with a sensitivity of 0.7356. The predictor developed in this research for fruit fly PPI-ness is available for public use at http://chemdata.shu.edu.cn/ppip.
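The balancing scheme described in the abstract (split the negative set into 5 portions, pair each portion with the full positive set, then vote) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean distance, the value of k, the random split, and the two-dimensional toy vectors are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def split_negatives(negatives, n_parts=5, seed=0):
    """Shuffle the negative pairs and deal them into n_parts disjoint portions."""
    shuffled = negatives[:]
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]


def ensemble_predict(positives, negatives, query, n_parts=5, k=3):
    """Run one KNN per (positives + one negative portion); return the majority label."""
    ballots = []
    for portion in split_negatives(negatives, n_parts):
        train_x = positives + portion
        train_y = [1] * len(positives) + [0] * len(portion)
        ballots.append(knn_predict(train_x, train_y, query, k))
    # With an odd number of portions a strict majority always exists.
    return 1 if sum(ballots) > n_parts / 2 else 0


# Toy data (hypothetical): 4 interacting pairs near (1, 1),
# 20 non-interacting pairs near the origin.
positives = [(1.0, 1.0), (0.9, 1.1), (1.1, 0.9), (1.0, 0.8)]
negatives = [(0.02 * i, 0.01 * i) for i in range(20)]

near_positive = ensemble_predict(positives, negatives, (0.95, 1.0))
near_negative = ensemble_predict(positives, negatives, (0.1, 0.1))
```

Each of the 5 classifiers in this sketch sees 4 positives and 4 negatives, so no single model is trained on the skewed 4:20 ratio, while every negative example still contributes to one of the votes.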

Keywords: Bioinformatics, feature selection, KNNs, protein-protein interactions, unbalanced data, mRMR (minimum-redundancy maximum-relevance), PPI-nesses, negative dataset
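The mRMR criterion named above can be sketched as a greedy loop that repeatedly adds the feature with the highest relevance to the class label minus its mean redundancy with the features already chosen. The sketch below assumes the difference form of the criterion and discrete (already binned) feature values; neither detail is stated in the abstract, and the feature names and toy values are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr(features, target, n_select):
    """Greedy mRMR: pick the feature maximizing relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, target) for name, col in features.items()}
    selected = []

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while len(selected) < min(n_select, len(features)):
        candidates = [name for name in features if name not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Toy data (hypothetical): f_b is nearly a copy of f_a, f_c is independent of f_a.
target = [0, 0, 0, 1, 1, 1, 1, 1]
features = {
    "f_a": [0, 0, 0, 0, 1, 1, 1, 1],
    "f_b": [0, 0, 0, 0, 1, 1, 1, 0],
    "f_c": [0, 0, 1, 1, 0, 0, 1, 1],
}
chosen = mrmr(features, target, 2)
```

On this toy data a plain relevance ranking would pick `f_a` and then `f_b`, whereas the redundancy penalty makes the greedy loop prefer `f_c` as the second feature, because `f_b` carries almost the same information as `f_a`.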


© 2024 Bentham Science Publishers