Predicting Viral Protein Subcellular Localization with Chou's Pseudo Amino Acid Composition and Imbalance-Weighted Multi-Label K-Nearest Neighbor Algorithm

Author(s): Jun-Zhe Cao, Wen-Qi Liu, Hong Gu

Journal Name: Protein & Peptide Letters

Volume 19, Issue 11, 2012


Machine learning is a reliable approach to the automated subcellular localization of viral proteins within a host cell or virus-infected cell. One challenge is that viral protein samples not only can have multiple location sites but are also class-imbalanced, and an imbalanced dataset often degrades prediction performance. To address this challenge, this paper proposes a novel approach, the imbalance-weighted multi-label K-nearest neighbor algorithm, to predict viral protein subcellular locations with multiple sites. Jackknife test results indicate that the presented algorithm performs better than the existing methods and has great potential in protein science.
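The abstract does not give the weighting formula or the feature construction, so the following is only a minimal illustrative sketch of the general idea, not the authors' algorithm. It assumes inverse label frequency as a stand-in imbalance weight, uses plain 20-dimensional amino acid composition (the base that pseudo amino acid composition extends, with the sequence-order terms omitted), and scores with a leave-one-out loop in the spirit of the jackknife test. All function names, the `threshold` parameter, and the "at least one correct site" hit criterion are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(seq):
    """Return the 20-D amino acid composition of a protein sequence.

    Assumption: the sequence-order terms of pseudo amino acid
    composition are left out to keep the sketch short.
    """
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def label_weights(label_sets):
    """Hypothetical imbalance weight: inverse label frequency.

    Rare location sites receive larger weights. This is an assumed
    stand-in, not the weighting defined in the paper.
    """
    freq = Counter(label for labels in label_sets for label in labels)
    n = len(label_sets)
    return {label: n / count for label, count in freq.items()}


def predict(x, train_X, train_Y, weights, k=3, threshold=0.5):
    """Predict a set of labels from the k nearest training samples.

    Each neighbor votes for every label it carries, scaled by that
    label's weight. Labels scoring at least `threshold` times the
    best score are returned, so several sites can be predicted.
    """
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(x, train_X[i]))[:k]
    score = Counter()
    for i in nearest:
        for label in train_Y[i]:
            score[label] += weights[label]
    if not score:
        return set()
    best = max(score.values())
    return {label for label, s in score.items() if s >= threshold * best}


def jackknife_accuracy(X, Y, k=3):
    """Leave-one-out estimate: hold out each sample once.

    A prediction counts as a hit if it shares at least one label with
    the true label set (an assumed, lenient criterion).
    """
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_Y = Y[:i] + Y[i + 1:]
        weights = label_weights(rest_Y)
        predicted = predict(X[i], rest_X, rest_Y, weights, k)
        hits += bool(predicted & Y[i])
    return hits / len(X)
```

To try it, build `X` as a list of `aa_composition(seq)` vectors and `Y` as a list of label sets (one set of location sites per protein), then call `jackknife_accuracy(X, Y, k=3)`. Recomputing the weights inside each fold keeps the held-out sample from influencing its own prediction.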

Keywords: Class-imbalance, K-nearest neighbor, multi-label learning, pseudo amino acid composition, subcellular localization

open access plus

Article Details

Year: 2012
Published on: 16 September, 2012
Page: [1163 - 1169]
Pages: 7
DOI: 10.2174/092986612803216999
