A Useful Tool for the Identification of DNA-binding Proteins Using Graph Convolutional Network

(E-pub Ahead of Print)

Author(s): Dasheng Chen, Leyi Wei*

Journal Name: Current Proteomics


Background: DNA and proteins are both essential components of living organisms. DNA-binding proteins, a class that includes helicases and proteins that bind single-stranded DNA regions, play key roles in the function of various biomolecules. Although several sequence-based methods exist for predicting DNA-binding proteins, graph neural networks have seen limited use for this task.

Objective: In this article, we develop GCN-DBP, a novel predictor based on graph neural networks, for identifying DNA-binding proteins from their sequences.

Method: Each protein sequence is treated as a document, which is segmented into words according to the k-mer concept. Document-word relationships and word co-occurrence across the corpus are used to construct a text graph. The predictor then learns protein sequence information with a two-layer graph convolutional network.
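
The pipeline described in the Method can be sketched roughly as follows. This is a minimal illustration in plain Python, not the GCN-DBP implementation: the overlapping k-mers, the value k = 3, the sliding-window size, the raw-count edge weights, the one-hot node features, and the random untrained weights are all assumptions made here for illustration, and the function names (`kmers`, `build_text_graph`, `normalize`, `gcn_forward`) are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import combinations

def kmers(seq, k=3):
    """Split a sequence into overlapping k-mers ("words"). Overlap is an assumption."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def build_text_graph(seqs, k=3, window=3):
    """Build a symmetric adjacency matrix over document nodes and word nodes.

    Nodes 0..len(seqs)-1 are sequences; the remaining nodes are distinct k-mers.
    Edge weights are raw counts (an assumption; weighted schemes are also common).
    """
    docs = [kmers(s, k) for s in seqs]
    vocab = sorted({w for d in docs for w in d})
    index = {w: len(seqs) + i for i, w in enumerate(vocab)}
    n = len(seqs) + len(vocab)
    adj = [[0.0] * n for _ in range(n)]
    for d, words in enumerate(docs):
        # Document-word edges: how often the k-mer occurs in the sequence.
        for w, c in Counter(words).items():
            adj[d][index[w]] = adj[index[w]][d] = float(c)
        # Word-word edges: co-occurrence inside a sliding window of k-mers.
        for start in range(max(1, len(words) - window + 1)):
            for a, b in combinations(set(words[start:start + window]), 2):
                adj[index[a]][index[b]] += 1.0
                adj[index[b]][index[a]] += 1.0
    return adj, vocab

def normalize(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt[i] * a_hat[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_forward(a_norm, x, w0, w1):
    """Two-layer GCN forward pass: softmax(A ReLU(A X W0) W1), softmax per row."""
    hidden = [[max(0.0, v) for v in row] for row in matmul(matmul(a_norm, x), w0)]
    logits = matmul(matmul(a_norm, hidden), w1)
    out = []
    for row in logits:
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([v / total for v in exps])
    return out

seqs = ["MKVLAA", "MKVGGA"]  # toy sequences for illustration, not real data
adj, vocab = build_text_graph(seqs)
n = len(adj)
rng = random.Random(0)
x = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # one-hot features
w0 = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(n)]  # untrained weights
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(4)]
probs = gcn_forward(normalize(adj), x, w0, w1)
print(probs[:len(seqs)])  # class probabilities for the two document nodes
```

Because the weights are random and untrained, the printed probabilities carry no predictive meaning; the sketch only shows how sequence nodes and k-mer nodes share one graph so that label information can propagate between them during training.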

Results: We conducted experiments comparing the proposed method with four existing methods. Finally, we tested GCN-DBP on the independent data set PDB2272, where its accuracy reached 64.17% and its MCC reached 28.32%.
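
For readers unfamiliar with the two metrics reported above, the following sketch shows how accuracy and the Matthews correlation coefficient (MCC) are computed from a binary confusion matrix. The counts used are hypothetical and are not taken from the paper; the function names are likewise illustrative. Note that MCC ranges from -1 to 1, so a value reported as 28.32% corresponds to 0.2832 on that scale.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical confusion-matrix counts, for illustration only.
tp, tn, fp, fn = 40, 30, 20, 10
print(f"accuracy = {accuracy(tp, tn, fp, fn):.4f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it more informative when the two classes are unevenly represented.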

Conclusion: The results show that the proposed method outperforms the four other methods and can serve as a useful tool for protein classification.

Keywords: DNA-binding proteins; graph convolutional network (GCN); protein sequence; sequence classification; deep learning; k-mer spectrum.

Article Details

DOI: 10.2174/1570164618999201210225354