Protein Secondary Structure Prediction Using Character Bi-gram Embedding and Bi-LSTM

Author(s): Ashish Kumar Sharma*, Rajeev Srivastava

Journal Name: Current Bioinformatics

Volume 16, Issue 2, 2021

Background: Protein secondary structure is vital to predicting tertiary structure, which in turn is essential for determining protein function and for drug design. Computational methods that predict secondary structure from the primary sequence are therefore in high demand. A protein primary sequence is represented as a linear sequence of characters drawn from the twenty amino acids and contains the contextual information needed for secondary structure prediction.

Objective and Methods: Protein secondary structure is predicted from the primary sequence using a deep recurrent neural network. Secondary structure depends on both local and long-range residues in the primary sequence. In the proposed work, the local contextual information of amino acid residues is captured with character n-grams, and a dense embedding vector represents this local context. A bidirectional long short-term memory (Bi-LSTM) model then captures long-range context by extracting information from both past and future residues in the primary sequence.
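The local-context step described above can be illustrated with a minimal Python sketch that splits a primary sequence into character bi-grams and maps each one to an integer id, the form in which it would be passed to an embedding layer ahead of the Bi-LSTM. The abstract does not say whether the bi-grams overlap, how the vocabulary is ordered, or how unknown residues are handled, so the overlapping window, the id assignment, the reserved id 0, and the names `char_bigrams` and `encode` are all assumptions made for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty standard residues

# Assumed vocabulary: id 0 is reserved for padding or unknown bi-grams,
# ids 1..400 cover every ordered pair of standard residues.
BIGRAM_TO_ID = {
    a + b: i
    for i, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2), start=1)
}


def char_bigrams(sequence):
    """Return the overlapping character bi-grams of a primary sequence."""
    return [sequence[i:i + 2] for i in range(len(sequence) - 1)]


def encode(sequence):
    """Map each bi-gram to its integer id (0 if it is outside the vocabulary)."""
    return [BIGRAM_TO_ID.get(bigram, 0) for bigram in char_bigrams(sequence)]


tokens = char_bigrams("MKVLA")  # ['MK', 'KV', 'VL', 'LA']
ids = encode("MKVLA")           # one integer id per bi-gram
```

With overlapping windows, a sequence of length n yields n - 1 tokens, so a real pipeline would also need a rule for aligning tokens with per-residue structure labels; that rule is not given in the abstract.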

Results: The proposed deep recurrent architecture is evaluated on three datasets: ss.txt, RS126, and CASP9. The model achieves Q3 accuracies of 88.45%, 83.48%, and 86.69% on ss.txt, RS126, and CASP9, respectively. Its performance is also compared with other state-of-the-art methods available in the literature.
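For readers unfamiliar with the metric, Q3 accuracy is the percentage of residues whose predicted three-state label (helix H, strand E, coil C) matches the reference label. The sketch below computes it for a single pair of label strings. The function name `q3_accuracy` and the example strings are hypothetical, and the abstract does not state how the authors reduce finer-grained labels to three states or whether they average per residue or per protein.

```python
def q3_accuracy(true_ss, pred_ss):
    """Percentage of positions where the predicted state equals the true state."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("label strings must have the same length")
    if not true_ss:
        raise ValueError("label strings must not be empty")
    correct = sum(t == p for t, p in zip(true_ss, pred_ss))
    return 100.0 * correct / len(true_ss)


# Made-up labels for illustration: 7 of 8 positions agree.
example = q3_accuracy("HHHEECCC", "HHHEEECC")  # 87.5
```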

Conclusion: The comparative analysis shows that the proposed model performs better than the state-of-the-art methods it was compared against.

Keywords: Proteomics, protein secondary structure, amino acid sequence, character n-gram embedding, deep learning, bidirectional long short-term memory.

Article Details

Year: 2021
Published on: 30 April, 2021
Page: [333 - 338]
Pages: 6
DOI: 10.2174/1574893615999200601122840