Using the Chou’s Pseudo Component to Predict the ncRNA Locations Based on the Improved K-Nearest Neighbor (iKNN) Classifier

Author(s): Chengyan Wu*, Qianzhong Li, Ru Xing, Guo-Liang Fan

Journal Name: Current Bioinformatics

Volume 15, Issue 6, 2020

Background: Identifying non-coding RNA (ncRNA) at the organelle genome level is a challenging task. In our previous work, we built an ncRNA dataset with less than 80% sequence identity and proposed a method that combines the increment of diversity with a support vector machine.

Objective: Based on the ncRNA_361 dataset, a novel decision-making method, an improved KNN (iKNN) classifier, is proposed.

Methods: In this paper, the physicochemical features of nucleotides, the degeneracy of genetic codons, and the topology of the secondary structure were selected as features to represent ncRNA for the iKNN algorithm. The incremental feature selection method was then used to optimize the feature set.
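The incremental feature selection step can be illustrated with a minimal sketch. The abstract does not state how features are ranked or scored, so everything below is an assumption made for illustration: the features are taken as already ranked, the hypothetical helper `loo_1nn_accuracy` scores a subset with a leave-one-out 1-nearest-neighbor test, and the four-sample dataset is a toy, not the ncRNA_361 data.

```python
import math

def loo_1nn_accuracy(X, y, columns):
    """Leave-one-out accuracy of a 1-NN classifier restricted to `columns`.

    Hypothetical scoring helper; the paper's own evaluator may differ.
    """
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, math.inf
        for j, xj in enumerate(X):
            if j == i:
                continue  # the held-out sample cannot be its own neighbor
            d = sum((xi[c] - xj[c]) ** 2 for c in columns)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)

def incremental_feature_selection(ranked_features, evaluate):
    """Add ranked features one at a time and keep the best-scoring prefix."""
    best_subset, best_score = [], -math.inf
    current = []
    for feature in ranked_features:
        current.append(feature)
        score = evaluate(current)
        if score > best_score:  # strict: ties keep the smaller subset
            best_subset, best_score = list(current), score
    return best_subset, best_score

# Toy data (assumed): column 0 separates the classes, column 1 is noise.
X = [[0.0, 0.0], [1.0, 100.0], [10.0, 1.0], [11.0, 101.0]]
y = ["a", "a", "b", "b"]

best_subset, best_score = incremental_feature_selection(
    ranked_features=[0, 1],
    evaluate=lambda cols: loo_1nn_accuracy(X, y, cols),
)
print(best_subset, best_score)
```

In this toy case the noisy second column degrades the nearest-neighbor test, so the selection stops at the first feature alone.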

Results: The results of iKNN indicated that the mean-value decision-making method is distinctly superior to the traditional majority-vote decision-making method and to the Increment of Diversity Combining Support Vector Machine (ID-SVM). The iKNN algorithm achieved an overall accuracy of 97.368% in the jackknife test with k = 3.
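To make the contrast with majority voting concrete, here is a minimal sketch of a mean-value decision rule evaluated with a jackknife (leave-one-out) test. The abstract does not define the rule, so this assumes one plausible reading: for each class, average the distances from the query to its k nearest members of that class, and assign the class with the smallest mean. The function names, the Euclidean distance, the class labels, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
import math
from collections import defaultdict

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_value_knn_predict(train_X, train_y, query, k=3):
    """Assumed mean-value rule: pick the class whose k nearest
    members have the smallest mean distance to the query."""
    distances_by_class = defaultdict(list)
    for x, label in zip(train_X, train_y):
        distances_by_class[label].append(euclidean(x, query))
    best_label, best_mean = None, math.inf
    for label in sorted(distances_by_class):
        nearest = sorted(distances_by_class[label])[:k]
        mean_dist = sum(nearest) / len(nearest)
        if mean_dist < best_mean:
            best_label, best_mean = label, mean_dist
    return best_label

def jackknife_accuracy(X, y, k=3):
    """Hold out each sample once, predict it from all the others."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if mean_value_knn_predict(train_X, train_y, X[i], k) == y[i]:
            correct += 1
    return correct / len(X)

# Toy feature vectors (assumed): two well-separated hypothetical classes.
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [0.0, 0.0],
           [5.0, 5.1], [5.2, 5.0], [5.1, 5.2], [5.0, 5.0]]
labels = ["A"] * 4 + ["B"] * 4

print(jackknife_accuracy(samples, labels, k=3))
```

Unlike a majority vote, this rule uses the distances themselves rather than only neighbor counts, so a class with a few very close members can win over a class with more, but farther, neighbors.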

Conclusion: The triplets of the structure-sequence mode under reading frames not only contain the entire sequence information but also reflect whether each base is paired, and the secondary structural topological parameters further describe the ncRNA secondary structure at the spatial level. The ncRNA dataset and the iKNN classifier are freely available at

Keywords: Organelle genome, non-coding RNA, open reading frame, spatial structure, feature selection, K-nearest neighbor method.


Article Details

Year: 2020
Published on: 11 November, 2020
Page: [563 - 573]
Pages: 11
DOI: 10.2174/1574893614666191003142406