Background: DNA-binding proteins are vital cellular components, and their identification is
important for understanding biological processes. Traditional methods for predicting protein
function are both time-consuming and expensive. With the development of bioinformatics, a large
amount of protein sequence information has become available to researchers, creating the need for an
efficient predictor that identifies DNA-binding proteins from protein sequence information alone.
Objective: To make better use of protein sequence information and further improve the accuracy of
DNA-binding protein recognition, we designed a new predictor for identifying DNA-binding proteins
based on a voting strategy.
Method: We employed two feature extraction methods for DNA-binding protein identification:
Physicochemical Distance Transformation (PDT) and PDT-profile. Two predictors, iDNA-Prot-PDT and
iDNA-Prot-PDT-Profile, were then built on these two feature extraction methods.
To further improve prediction quality, a voting strategy (iDNA-Prot-Vote) was adopted to combine them.
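As a rough illustration of the two steps in the method, the sketch below computes PDT-style features for a single physicochemical index and combines the scores of two predictors by voting. It rests on several assumptions that this abstract does not state: PDT is taken to be the mean squared difference between the index values of residues separated by a given distance along the sequence, the voting rule is taken to be a simple average of predictor scores with a 0.5 threshold, and the names `pdt_features`, `vote`, and `toy_index` are hypothetical. The index values are toy numbers, not real physicochemical data.

```python
def pdt_features(sequence, index, max_lag):
    """Return PDT-style features for one physicochemical index.

    Assumed form: for each lag d = 1..max_lag, the mean of
    (index[R_i] - index[R_{i+d}])**2 over all valid positions i.
    """
    values = [index[residue] for residue in sequence]
    length = len(values)
    if max_lag >= length:
        raise ValueError("max_lag must be smaller than the sequence length")
    features = []
    for lag in range(1, max_lag + 1):
        squared = [(values[i] - values[i + lag]) ** 2 for i in range(length - lag)]
        features.append(sum(squared) / (length - lag))
    return features


def vote(scores, threshold=0.5):
    """Assumed voting rule: average the predictor scores, then threshold."""
    mean_score = sum(scores) / len(scores)
    return mean_score, mean_score >= threshold


# Toy index values for illustration only (not taken from any real scale).
toy_index = {"A": 1.0, "K": 3.0, "R": 4.0, "G": 0.0}
features = pdt_features("AKRG", toy_index, max_lag=2)

# Hypothetical scores from the two predictors for one protein.
mean_score, is_binding = vote([0.75, 0.5])
print(features, mean_score, is_binding)
```

In practice the feature vector would concatenate such values over many physicochemical indices, and each predictor would be a classifier trained on its own feature set; both details are outside what this abstract specifies.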
Results: Experimental results on a benchmark dataset and an independent dataset showed that our
methods outperformed other state-of-the-art methods.
Conclusion: These results indicate that the proposed methods are useful for DNA-binding protein
identification and could promote the development of protein sequence analysis.