Letters in Organic Chemistry

ISSN (Print): 1570-1786
ISSN (Online): 1875-6255

Research Article

BRAda: A Robust Method for Identification of Pre-microRNAs by Combining Adaboost Framework with BP and RF

Author(s): Ningyi Zhang, Ying Zhang, Tianyi Zhao, Jun Ren, Yangmei Cheng and Yang Hu*

Volume 14, Issue 9, 2017

Page: [690 - 695] Pages: 6

DOI: 10.2174/1570178614666170221144619

Abstract

Background: MicroRNAs (miRNAs) are short (approximately 21 nt), non-coding RNAs that act as important regulators of biological processes in cells. Identifying and discovering pre-miRNAs helps in understanding these regulatory processes, the functions of miRNAs and other genes, and biological evolution.

Methods: Machine learning has proven a powerful approach for distinguishing real pre-miRNAs from other hairpin-like sequences (pseudo pre-miRNAs). However, most commonly used classifiers perform poorly on independent testing data sets. To overcome this, we propose BRAda, a novel algorithm that integrates a BP (back-propagation) neural network and a random forest classifier trained on two balanced training sets. By assigning weights to these classifiers and to the proposed 98-dimensional features, we obtained a strong classifier with high accuracy and stability. Two independent testing sets (updated human and non-human pre-miRNAs) were then used to evaluate the prediction performance of the new classifier.
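
The abstract does not spell out how the weights are distributed between the classifiers, so the following is only a minimal sketch of the standard discrete AdaBoost weighting scheme, not the authors' implementation. The names `adaboost_combine`, `predict`, `bp_like` and `rf_like` are hypothetical: the two toy threshold rules merely stand in for an already trained BP neural network and random forest, and the two-dimensional toy samples stand in for the 98-dimensional feature vectors.

```python
import math

def adaboost_combine(learners, X, y):
    """Weight pre-trained base classifiers with discrete AdaBoost.

    learners: callables mapping a feature vector to +1 or -1
    X: list of feature vectors; y: list of labels in {+1, -1}
    Returns a list of (alpha, learner) pairs.
    """
    n = len(X)
    w = [1.0 / n] * n  # start from uniform sample weights
    ensemble = []
    for h in learners:
        preds = [h(x) for x in X]
        # Weighted error of this learner under the current sample weights.
        err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Increase the weight of misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * yi * p) for wi, p, yi in zip(w, preds, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted vote of the base classifiers: +1 or -1."""
    score = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical stand-ins for the two trained base classifiers.
def bp_like(x):
    return 1 if x[0] > 0.5 else -1

def rf_like(x):
    return 1 if x[1] > 0.5 else -1

# Toy data: +1 = real pre-miRNA, -1 = pseudo pre-miRNA (illustrative only).
X = [(0.9, 0.8), (0.7, 0.2), (0.6, 0.9), (0.2, 0.9),
     (0.1, 0.1), (0.4, 0.6), (0.8, 0.7), (0.3, 0.1)]
y = [1, 1, 1, -1, -1, 1, 1, -1]

ensemble = adaboost_combine([bp_like, rf_like], X, y)
alphas = [alpha for alpha, _ in ensemble]
predictions = [predict(ensemble, x) for x in X]
```

In this scheme a base classifier with a lower weighted error receives a larger vote, and samples that one classifier gets wrong count for more when the next classifier is weighted. A full AdaBoost run would also retrain each base learner on the reweighted samples, which this sketch leaves out.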

Results: The BRAda algorithm significantly outperformed the other methods in identifying both human and non-human pre-miRNAs.

Conclusion: The novel algorithm integrates a BP neural network and a random forest classifier trained on two balanced training sets. Compared with other state-of-the-art machine-learning methods, BRAda achieved near-perfect performance in validation (ACC above 99%). Moreover, although the algorithm was trained on human gene sets, its prediction performance on non-human testing sets was also excellent (average ACC above 97%), which indicates that the method is both stable and robust. These experiments and validations show that the method is an effective tool for pre-miRNA identification.

Keywords: Biological process, BRAda, BP neural network, genes, Pre-miRNA identification, random forest.

© 2024 Bentham Science Publishers