BRAda: A Robust Method for Identification of Pre-microRNAs by Combining Adaboost Framework with BP and RF

Author(s): Ningyi Zhang, Ying Zhang, Tianyi Zhao, Jun Ren, Yangmei Cheng, Yang Hu*

Journal Name: Letters in Organic Chemistry

Volume 14, Issue 9, 2017

Abstract:

Background: MicroRNAs (miRNAs) are short (approximately 21 nt), non-coding RNAs that play an important regulatory role in cellular processes. Identifying and discovering pre-miRNAs helps in understanding these regulatory processes, the functions of miRNAs and other genes, and biological evolution.

Methods: Machine learning has been a powerful approach for distinguishing real pre-miRNAs from other hairpin-like sequences (pseudo pre-miRNAs). However, most commonly used classifiers perform poorly on independent test sets. To overcome this, we propose BRAda, a novel algorithm that integrates a BP neural network and a random forest classifier within the AdaBoost framework, trained on two balanced training sets. By assigning weights to these classifiers and to the proposed 98-dimensional features, we obtained a strong classifier with high accuracy and stability. Two independent test sets (updated human and non-human pre-miRNAs) were then used to evaluate the prediction performance of the resulting classifier.
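The abstract does not say how the classifier weights are computed, so the following is only a minimal sketch of the general idea, assuming the standard discrete AdaBoost weighting rule. It is not the authors' implementation. The names `bp_like` and `rf_like` are hypothetical threshold functions standing in for a trained BP neural network and a random forest, and the two-feature toy samples stand in for the 98-dimensional feature vectors.

```python
import math

def adaboost_combine(classifiers, X, y):
    """Weight already-trained base classifiers with the discrete AdaBoost rule.

    classifiers: callables mapping a sample to +1 (real) or -1 (pseudo).
    Returns a list of (alpha, classifier) pairs.
    """
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for clf in classifiers:
        preds = [clf(x) for x in X]
        # weighted error of this classifier under the current sample weights
        err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, clf))
        # raise the weight of misclassified samples, lower the rest, renormalize
        w = [wi * math.exp(-alpha * t * p) for wi, p, t in zip(w, preds, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of the base classifiers."""
    score = sum(alpha * clf(x) for alpha, clf in ensemble)
    return 1 if score >= 0 else -1

def accuracy(ensemble, X, y):
    """ACC: fraction of samples whose predicted label matches the true label."""
    return sum(predict(ensemble, x) == t for x, t in zip(X, y)) / len(y)

# Hypothetical stand-ins for the trained BP network and random forest.
def bp_like(x):
    return 1 if x[0] > 0.5 else -1

def rf_like(x):
    return 1 if x[1] < 0.5 else -1

# Toy data (assumption): two features per sample, label +1 = real, -1 = pseudo.
X = [(0.1, 0.9), (0.2, 0.8), (0.8, 0.3), (0.9, 0.1),
     (0.6, 0.7), (0.4, 0.2), (0.45, 0.4), (0.3, 0.6)]
y = [-1, -1, 1, 1, 1, -1, 1, -1]

ensemble = adaboost_combine([bp_like, rf_like], X, y)
for alpha, clf in ensemble:
    print(clf.__name__, round(alpha, 3))
print("ACC:", accuracy(ensemble, X, y))
```

In this sketch, a base classifier with a lower weighted error receives a larger vote, and samples misclassified by one classifier count more heavily when the next one is scored. With only two base classifiers the vote reduces to following the stronger one whenever they disagree; a fuller version would retrain the base learners over several boosting rounds.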

Results: BRAda significantly outperformed the other methods in identifying both human and non-human pre-miRNAs.

Conclusion: BRAda integrates a BP neural network and a random forest classifier trained on two balanced training sets. Compared with other state-of-the-art machine-learning methods, BRAda performed very well in validation (ACC over 99%). Moreover, although the algorithm was trained on human gene sets, its prediction performance on the non-human test sets was also excellent (average ACC over 97%), which indicates that the method is both stable and robust. These experiments and validations show that the method is an effective tool for pre-miRNA identification.

Keywords: Biological process, BRAda, BP neural network, genes, Pre-miRNA identification, random forest.


Article Details

VOLUME: 14
ISSUE: 9
Year: 2017
Page: [690 - 695]
Pages: 6
DOI: 10.2174/1570178614666170221144619