DeepFusion-RBP: Using Deep Learning to Fuse Multiple Features to Identify RNA-binding Protein Sequences

Author(s): Xu Wang, Shunfang Wang*, Haoyi Fu, Xiaoli Ruan, Xianjun Tang

Journal Name: Current Bioinformatics

Volume 16, Issue 8, 2021

Background: RNA-binding proteins play an important role in regulating splicing, RNA transport, and other post-transcriptional processes, in recognizing specific RNA-binding domains, and in interacting with RNA.

Objective: This paper proposes a deep learning framework, DeepFusion-RBP, composed of three sub-models. A sliding window cuts each sequence into sub-sequences from which local features are extracted, and a model is then customized for each feature.

Methods: The main advantage of this approach is its use of a sliding window to cut the original sequence into sub-sequences. This expands the data set while avoiding excessive padding with meaningless data. A model is then customized for each feature, using components such as LSTM, Conv1D, and amino acid embedding, to classify RNA-binding proteins accurately.
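The sliding-window cut described in the Methods can be illustrated with a minimal sketch. The function name `sliding_windows`, the window length, the stride, and the padding character `X` are all hypothetical choices made for illustration; the abstract does not state the values or the exact procedure used in DeepFusion-RBP.

```python
def sliding_windows(sequence, window=4, stride=3, pad_char="X"):
    """Cut a protein sequence into fixed-length sub-sequences.

    Illustrative only: window, stride, and pad_char are assumed values,
    not parameters reported for DeepFusion-RBP.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")

    # Only sequences shorter than one window need padding, so long
    # sequences contribute no filler symbols at all.
    if len(sequence) <= window:
        return [sequence.ljust(window, pad_char)]

    last_start = len(sequence) - window
    starts = list(range(0, last_start + 1, stride))

    # Add a final window aligned to the end so the tail is never dropped.
    if starts[-1] != last_start:
        starts.append(last_start)

    return [sequence[s:s + window] for s in starts]


if __name__ == "__main__":
    for sub in sliding_windows("MKVLAAGIVG"):
        print(sub)
```

Under these assumptions, one long sequence yields several fixed-length training examples, which is how a window can enlarge the data set, and padding is confined to sequences shorter than a single window.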

Results: To test whether the customized models improve the final prediction performance, we evaluated different combinations of sub-models and test sets of different lengths. Under cross-validation, the accuracy (ACC), F1-score, and Matthews correlation coefficient (MCC) of DeepFusion-RBP are 92.62%, 91.29%, and 84.96%, respectively. DeepFusion-RBP also showed excellent performance on three independent verification sets.
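For readers unfamiliar with the three reported metrics, the following sketch computes ACC, F1-score, and MCC from binary labels using their standard definitions. The function name `binary_metrics` and the toy labels are hypothetical; the snippet does not reproduce the paper's evaluation code or its results.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return (accuracy, F1-score, MCC) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0

    # F1 is the harmonic mean of precision and recall.
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    # MCC uses all four confusion-matrix cells; it is defined as 0 here
    # when any marginal count is zero.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return acc, f1, mcc


if __name__ == "__main__":
    # Toy labels for illustration only (1 = RNA-binding, 0 = non-binding).
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    preds = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(truth, preds))
```

MCC ranges from -1 to 1 while ACC and F1 range from 0 to 1, which is why an MCC expressed as a percentage is typically lower than the other two scores for the same predictions.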

Conclusion: The results of 10-fold cross-validation and of the independent verification set tests both suggest that customizing models for different features and cutting sequences into sub-sequences yield some improvement in the model's prediction performance. The data supporting the findings of the article are available at

Keywords: RNA-binding protein, LSTM, deep learning, PSSM, protein sequence, word embedding.


Article Details

Year: 2021
Published on: 18 June, 2021
Page: [1089 - 1100]
Pages: 12
DOI: 10.2174/1574893616666210618145121
