Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

DeepFusion-RBP: Using Deep Learning to Fuse Multiple Features to Identify RNA-binding Protein Sequences

Author(s): Xu Wang, Shunfang Wang*, Haoyi Fu, Xiaoli Ruan and Xianjun Tang

Volume 16, Issue 8, 2021

Published on: 18 June, 2021

Page: [1089 - 1100] Pages: 12

DOI: 10.2174/1574893616666210618145121

Abstract

Background: RNA-binding proteins play important roles in regulating splicing, RNA transport, and other post-transcriptional processes, in recognizing specific RNA-binding domains, and in interacting with RNA.

Objective: This paper proposes a deep learning framework, DeepFusion-RBP, composed of three sub-models. A sliding window cuts each sequence into sub-sequences, local features are extracted from them, and a sub-model is customized for each feature.

Methods: The main advantage of this research is the use of a sliding window to cut the original sequence into sub-sequences. This expands the data set while avoiding padding with large amounts of meaningless data. A model is then customized for each feature to classify RNA-binding proteins accurately, using methods such as LSTM, Conv1D, and amino acid embedding.
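The sliding-window step described above can be illustrated with a minimal sketch. The abstract does not state the window length, the step size, or how the tail of a sequence is handled, so the `window` and `stride` defaults and the end-aligned final window below are assumptions chosen for illustration, not the authors' settings.

```python
def sliding_windows(sequence, window=100, stride=50):
    """Cut a protein sequence into fixed-length sub-sequences.

    `window` and `stride` are hypothetical defaults; the abstract does
    not report the values used in DeepFusion-RBP.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")

    # A sequence no longer than the window is returned unchanged,
    # so no padding is introduced here.
    if len(sequence) <= window:
        return [sequence]

    # Regularly spaced start positions for full-length windows.
    starts = list(range(0, len(sequence) - window + 1, stride))

    # Assumption: add one end-aligned window so the tail residues are
    # covered instead of being dropped or padded.
    last_start = len(sequence) - window
    if starts[-1] != last_start:
        starts.append(last_start)

    return [sequence[s:s + window] for s in starts]


if __name__ == "__main__":
    # Toy sequence, not taken from the paper's data set.
    for sub in sliding_windows("MKTAYIAKQRQISFVKSHFSRQ", window=8, stride=5):
        print(sub)
```

Each long sequence yields several overlapping sub-sequences of equal length, which is how a windowing step of this kind can enlarge a data set without filling short inputs with placeholder residues.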

Results: To test whether the customized models improve the final prediction performance, we evaluated different combinations of sub-models on test sets of different lengths. Under cross-validation, DeepFusion-RBP achieves an ACC, F1-score, and MCC of 92.62%, 91.29%, and 84.96%, respectively. DeepFusion-RBP also performed well on three independent verification sets.
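For readers unfamiliar with the three reported metrics, the sketch below computes ACC, F1-score, and MCC from the counts of a binary confusion matrix using their standard definitions. The function name and the example counts are hypothetical and are not taken from the paper; multiplying each value by 100 gives the percentage form used above.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F1-score, MCC) for a binary classifier.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives, and false negatives.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total

    # Precision and recall fall back to 0.0 when undefined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Matthews correlation coefficient; by convention 0.0 when the
    # denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return accuracy, f1, mcc


if __name__ == "__main__":
    # Made-up counts for illustration only.
    acc, f1, mcc = binary_metrics(tp=40, fp=10, tn=45, fn=5)
    print(f"ACC={acc:.4f}  F1={f1:.4f}  MCC={mcc:.4f}")
```

MCC takes all four cells of the confusion matrix into account, which is why it is usually lower than ACC and F1 for the same predictions.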

Conclusion: The results of both 10-fold cross-validation and the independent verification set tests suggest that customizing a model for each feature and cutting sequences into sub-sequences yield some improvement in prediction performance. The data supporting the findings of the article are available at https://github.com/mmwangxu/DeepFusion-RBP-tool.

Keywords: RNA-binding protein, LSTM, deep learning, PSSM, protein sequence, word embedding.

© 2024 Bentham Science Publishers