Protein & Peptide Letters


ISSN (Print): 0929-8665
ISSN (Online): 1875-5305

Discriminating Outer Membrane Proteins with Fuzzy K-Nearest Neighbor Algorithms Based on the General Form of Chou’s PseAAC

Author(s): Maqsood Hayat and Asifullah Khan

Volume 19, Issue 4, 2012

Pages: 411-421 (11 pages)

DOI: 10.2174/092986612799789387


Outer membrane proteins (OMPs) play important roles in cell biology and are targeted by multiple drugs. Identifying OMPs from genomic sequences and predicting their secondary and tertiary structures is challenging because their membrane-spanning regions are short and highly variable in their properties. An effective and accurate in silico method for discriminating OMPs from their primary sequences is therefore needed. In this paper, we analyze the performance of several machine learning methods for discriminating OMPs, namely Genetic Programming, K-nearest Neighbor, and Fuzzy K-nearest Neighbor (Fuzzy K-NN), in conjunction with discrete feature extraction methods, namely Amino acid composition (AAC), Amphiphilic Pseudo amino acid composition (Am-PseAAC), Split amino acid composition (SAAC), and hybrid versions of these methods. The performance of the classifiers is evaluated on two datasets using 5-fold cross-validation. We observe that Fuzzy K-NN with SAAC-based features is highly effective in discriminating OMPs. On dataset1, Fuzzy K-NN achieves its highest success rates of 99.00% accuracy for discriminating OMPs from non-OMPs, and 98.77% and 98.28% accuracy for discriminating OMPs from α-helix membrane proteins and globular proteins, respectively. On dataset2, Fuzzy K-NN achieves 99.55%, 99.90%, and 99.81% accuracy for discriminating OMPs from non-OMPs, α-helix membrane proteins, and globular proteins, respectively. The classification performance of the proposed method is satisfactory and better than that of existing methods. Thus, it might be an effective tool for high-throughput innovation of OMPs.
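
As an illustration of the best-performing combination named in the abstract (SAAC features with a Fuzzy K-NN classifier), the following is a minimal Python sketch and not the authors' implementation. Several details are assumptions, since the abstract does not state them: the segment lengths `n_term` and `c_term`, the Euclidean distance, the fuzzifier `m`, and the use of a Keller-style membership rule with crisp training labels. The function names and the toy sequences are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(segment):
    """Amino acid composition: fraction of each standard residue in the segment."""
    if not segment:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(segment)
    # Non-standard residues count toward the length but get no feature of their own.
    return [counts[a] / len(segment) for a in AMINO_ACIDS]


def saac(sequence, n_term=25, c_term=25):
    """Split amino acid composition: AAC of the N-terminal, middle, and
    C-terminal segments, concatenated into a 60-dimensional vector.
    The segment lengths are assumed values, not taken from the abstract."""
    seq = sequence.upper()
    cut = max(n_term, len(seq) - c_term)  # keeps segments disjoint for short sequences
    return aac(seq[:n_term]) + aac(seq[n_term:cut]) + aac(seq[cut:])


def fuzzy_knn(train_x, train_y, query, k=3, m=2.0):
    """Return {label: membership} for `query`.
    Each of the k nearest neighbours votes for its own class with weight
    1 / d**(2 / (m - 1)); the weights are normalised to sum to 1
    (assumed Keller-style rule with crisp training labels)."""
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    exponent = 2.0 / (m - 1.0)
    # Clamp the distance so an exact match does not divide by zero.
    weights = [(1.0 / max(d, 1e-12) ** exponent, y) for d, y in nearest]
    total = sum(w for w, _ in weights)
    memberships = {label: 0.0 for label in set(train_y)}
    for w, y in weights:
        memberships[y] += w / total
    return memberships


if __name__ == "__main__":
    # Made-up toy sequences, for illustration only (not real proteins).
    train_seqs = ["AVGLTYNNAGSWKV" * 6, "LLLAIVFLLGAWLL" * 6]
    train_labels = ["OMP", "non-OMP"]
    train_x = [saac(s) for s in train_seqs]
    query = saac("AVGLTYNQAGSWRV" * 6)
    print(fuzzy_knn(train_x, train_labels, query, k=2))
```

The classifier returns a membership degree per class instead of a single hard label; taking the class with the largest membership gives the predicted label, and the memberships themselves indicate how confident that assignment is.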

Keywords: AAC, Am-PseAAC, SAAC, genetic programming, K-nearest neighbor, Fuzzy K-NN

© 2022 Bentham Science Publishers