Deep Convolutional Neural Networks for Predicting Hydroxyproline in Proteins

Author(s): HaiXia Long, Mi Wang*, HaiYan Fu

Journal Name: Current Bioinformatics

Volume 12, Issue 3, 2017

Background: Proline hydroxylation, which yields hydroxyproline, is one type of post-translational modification (PTM). Because a protein sequence contains many uncharacterized proline (P) residues, the question that needs to be answered is which ones can be hydroxylated and which cannot. The answer would not only give a deeper understanding of the hydroxylation mechanism but could also support drug development. The ever-growing number of protein sequences in the post-genomic age presents new prediction challenges.

Objective: To address these challenges, we aim to develop computational methods that identify hydroxyproline sites quickly and accurately.

Method: We propose a new approach for predicting hydroxyproline with a deep learning model, the convolutional neural network (CNN). We employ pseudo amino acid composition (PseAAC) to identify these proteins and use the position-specific scoring matrix (PSSM) to represent samples as input to the CNN model.
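As an illustration of how PSSM rows might be turned into fixed-size CNN inputs, the following is a minimal sketch and not the authors' implementation. It assumes each candidate sample is a window of PSSM rows centred on a proline residue, padded with zero rows at the sequence ends. The function name `proline_windows`, the `flank` parameter, the column order in `AMINO_ACIDS`, and the one-hot stand-in for a real PSSM are all hypothetical choices made for the example; a real PSSM would normally come from a profile search tool.

```python
from typing import List, Tuple

# Assumed column order for the 20 standard amino acids (hypothetical choice).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def proline_windows(
    sequence: str, pssm: List[List[float]], flank: int = 6
) -> List[Tuple[int, List[List[float]]]]:
    """Return (position, window) for every proline (P) in `sequence`.

    `pssm` holds one row of 20 scores per residue. Each window has
    2 * flank + 1 rows; rows that fall outside the sequence are zeros.
    """
    if len(pssm) != len(sequence):
        raise ValueError("PSSM must have one row per residue")
    zero_row = [0.0] * len(AMINO_ACIDS)
    samples = []
    for i, residue in enumerate(sequence):
        if residue != "P":
            continue  # only proline residues are candidate sites
        window = []
        for j in range(i - flank, i + flank + 1):
            if 0 <= j < len(sequence):
                window.append(list(pssm[j]))
            else:
                window.append(list(zero_row))  # zero-pad past the ends
        samples.append((i, window))
    return samples


# Toy input: a one-hot matrix stands in for a real PSSM (assumption).
toy_sequence = "GPPGAPK"
toy_pssm = [[1.0 if aa == r else 0.0 for aa in AMINO_ACIDS] for r in toy_sequence]
toy_samples = proline_windows(toy_sequence, toy_pssm, flank=2)
```

Each window is a small matrix (here 5 rows by 20 columns) that a convolutional layer could treat like a single-channel image. The window width and the padding scheme are design choices the abstract does not specify.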

Results and Conclusion: In our experiments, K-fold cross-validation on benchmark datasets demonstrated the potential of CNNs for identifying protein hydroxyproline as well as proteins with other types of PTM.
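The K-fold cross-validation protocol mentioned above can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' evaluation code: the value of K, the shuffling seed, and the helper name `k_fold_indices` are hypothetical, and the sketch does not stratify by class label.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, k: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [
            idx
            for f, fold in enumerate(folds)
            if f != held_out
            for idx in fold
        ]
        yield train_idx, test_idx


# Example: 10 samples split into 5 folds (values chosen for illustration).
splits = list(k_fold_indices(10, k=5))
```

Every sample is held out exactly once, so averaging a metric over the K test folds uses all of the benchmark data for evaluation without training and testing on the same sample in any one round.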

Keywords: Protein hydroxyproline, deep learning, convolutional neural network, pseudo amino acid composition (PseAAC), position-specific scoring matrix (PSSM).


Article Details

Year: 2017
Page: [233 - 238]
Pages: 6
DOI: 10.2174/1574893612666170221152848