Outer membrane proteins (OMPs) play important roles in cell biology. In addition, OMPs are targeted by multiple
drugs. The identification of OMPs from genomic sequences and successful prediction of their secondary and tertiary
structures are challenging tasks because the membrane-spanning regions are short and vary widely in their properties. Therefore, an
effective and accurate in silico method for discriminating OMPs from their primary sequences is needed. In this paper, we
have analyzed the performance of several machine learning methods for discriminating OMPs, namely Genetic
Programming, K-nearest Neighbor, and Fuzzy K-nearest Neighbor (Fuzzy K-NN), in conjunction with discrete methods,
namely amino acid composition, amphiphilic pseudo amino acid composition, split amino acid composition (SAAC),
and hybrid versions of these methods. The performance of the classifiers is evaluated on two datasets using 5-fold cross-validation.
The simulations show that Fuzzy K-NN with SAAC-based features is highly effective in
discriminating OMPs. On dataset1, Fuzzy K-NN achieves its highest accuracies of 99.00% for discriminating OMPs from
non-OMPs, and 98.77% and 98.28% for discriminating OMPs from α-helix membrane and globular proteins, respectively.
On dataset2, Fuzzy K-NN achieves accuracies of 99.55%, 99.90%, and 99.81% for discriminating OMPs from
non-OMPs, α-helix membrane proteins, and globular proteins, respectively. It is observed that the classification performance of our
proposed method is satisfactory and is better than that of the existing methods. Thus, it might be an effective tool for
high-throughput innovation of OMPs.
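
To make the combination of SAAC features and Fuzzy K-NN concrete, the following is a minimal Python sketch and not the implementation evaluated in this paper. It assumes that SAAC concatenates the amino acid compositions of an N-terminal, a middle, and a C-terminal segment, and that Fuzzy K-NN uses the standard inverse-distance membership weighting with crisp training labels. The segment lengths (`n_term`, `c_term`), the number of neighbors `k`, the fuzzifier `m`, and all function names are illustrative assumptions.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(segment):
    """Return the 20-dimensional amino acid composition of a segment."""
    counts = [segment.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


def saac_features(sequence, n_term=25, c_term=25):
    """Split amino acid composition: 3 x 20 = 60 values.

    The segment lengths are assumed defaults for illustration only.
    """
    sequence = sequence.upper()
    # Start of the C-terminal segment; never overlaps the N-terminal one.
    cut = max(len(sequence) - c_term, n_term)
    head, middle, tail = sequence[:n_term], sequence[n_term:cut], sequence[cut:]
    return composition(head) + composition(middle) + composition(tail)


def fuzzy_knn(train_x, train_y, query, k=3, m=2.0):
    """Return class memberships of `query` from its k nearest neighbors.

    Each neighbor votes with weight 1 / d^(2 / (m - 1)), where d is its
    Euclidean distance to the query; memberships are normalized to sum to 1.
    """
    neighbors = sorted(
        ((math.dist(x, query), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )[:k]
    exponent = 2.0 / (m - 1.0)
    # Guard against division by zero when the query equals a training point.
    weights = [(1.0 / max(d, 1e-12) ** exponent, y) for d, y in neighbors]
    total = sum(w for w, _ in weights)
    memberships = {}
    for w, y in weights:
        memberships[y] = memberships.get(y, 0.0) + w / total
    return memberships


def predict(train_x, train_y, query, k=3, m=2.0):
    """Assign the class with the highest fuzzy membership."""
    memberships = fuzzy_knn(train_x, train_y, query, k=k, m=m)
    return max(memberships, key=memberships.get)
```

In use, each training sequence would first be mapped to its feature vector with `saac_features`, and a query sequence would be classified by calling `predict` on its own feature vector. Unlike the conventional K-nearest Neighbor, the memberships returned by `fuzzy_knn` also indicate how strongly the query belongs to each class.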