Current Genomics


ISSN (Print): 1389-2029
ISSN (Online): 1875-5488

Research Article

Splice Junction Identification using Long Short-Term Memory Neural Networks

Author(s): Kevin Regan, Abolfazl Saghafi * and Zhijun Li

Volume 22, Issue 5, 2021

Published on: 03 December, 2021

Pages: 384-390 (7 pages)

DOI: 10.2174/1389202922666211011143008


Background: Splice junctions are key to the transition from pre-messenger RNA to mature messenger RNA in many multi-exon genes that undergo alternative splicing. Because a very high percentage of multi-exon genes undergo alternative splicing, identifying splice junctions is an attractive research topic with important implications.

Objective: This paper aims to develop a deep learning model capable of identifying splice junctions in RNA sequences, using 13,666 unique primate RNA sequences.

Methods: A Long Short-Term Memory (LSTM) Neural Network model is developed that classifies a given sequence as EI (Exon-Intron splice), IE (Intron-Exon splice), or N (No splice). The model is trained on sequences grouped into trinucleotides, and its performance is evaluated on separate validation and test data to prevent bias.
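
As an illustration of the trinucleotide grouping described in the Methods, the following is a minimal sketch of how a sequence might be split into trinucleotides and mapped to integer tokens before being passed to a sequence model. The abstract does not state whether the groups overlap, how ambiguous bases are handled, or how labels are indexed, so the names `to_trinucleotides`, `encode`, `VOCAB`, and `LABELS`, the `stride` parameter, the A/C/G/T alphabet, and the reserved index 0 are all assumptions made here for illustration, not the authors' implementation.

```python
from itertools import product

# Hypothetical label indexing for the three classes named in the abstract.
LABELS = {"EI": 0, "IE": 1, "N": 2}

# All 64 trinucleotides over A, C, G, T, numbered from 1.
# Index 0 is reserved (an assumption) for tokens containing any other symbol.
VOCAB = {"".join(p): i + 1 for i, p in enumerate(product("ACGT", repeat=3))}


def to_trinucleotides(seq, stride=3):
    """Split a sequence into groups of three bases.

    stride=3 gives non-overlapping groups; stride=1 gives overlapping ones.
    Trailing bases that do not fill a complete group are dropped.
    """
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, stride)]


def encode(seq, stride=3):
    """Map a sequence to a list of integer trinucleotide tokens."""
    return [VOCAB.get(tok, 0) for tok in to_trinucleotides(seq, stride)]


if __name__ == "__main__":
    print(to_trinucleotides("ACGTAC"))
    print(encode("ACGTAC"))
```

The resulting integer lists could be fed to an embedding layer followed by an LSTM layer; the choice of stride changes the sequence length the network sees and would need to be checked against the full paper.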

Results: Model performance was measured using accuracy and F-score on the test data. The finalized model achieved an average accuracy of 91.34% with an average F-score of 91.36% over 50 runs.
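
To make the two reported metrics concrete, the sketch below computes accuracy and a per-class-averaged F-score for the three classes, then averages a metric across runs. The abstract does not say whether the F-score is macro-averaged or weighted, so macro averaging is an assumption here, and the function names `accuracy`, `macro_f_score`, and `mean_over_runs` are hypothetical. The toy labels are invented purely to exercise the functions and are not results from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f_score(y_true, y_pred, labels=("EI", "IE", "N")):
    """Unweighted mean of the per-class F1 scores (macro averaging, assumed)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mean_over_runs(values):
    """Average a metric over repeated training runs."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # Toy example only; not data from the paper.
    y_true = ["EI", "IE", "N", "N"]
    y_pred = ["EI", "N", "N", "N"]
    print(accuracy(y_true, y_pred))
    print(macro_f_score(y_true, y_pred))
```

Reporting the mean over many runs, as the abstract does with 50 runs, reduces the influence of random weight initialization and data shuffling on the quoted figures.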

Conclusion: Comparisons show that the model is highly competitive with recent Convolutional Neural Network structures. The proposed LSTM model achieves the highest accuracy and F-score among published alternative LSTM structures.

Keywords: Splice junction, deep learning, neural networks, LSTM, RNA-seq, classification.

© 2022 Bentham Science Publishers