Combinatorial Chemistry & High Throughput Screening

ISSN (Print): 1386-2073
ISSN (Online): 1875-5402

Research Article

DBP-PSSM: Combination of Evolutionary Profiles with the XGBoost Algorithm to Improve the Identification of DNA-binding Proteins

Author(s): Yanping Zhang*, Pengcheng Chen, Ya Gao, Jianwei Ni and Xiaosheng Wang

Volume 25, Issue 1, 2022

Published on: 24 November, 2020

Page: [3 - 12] Pages: 10

DOI: 10.2174/1386207323999201124203531

Abstract

Background and Objective: DNA-binding proteins play important roles in a variety of biological processes, such as gene transcription and regulation, DNA replication and repair, DNA recombination and packaging, and the formation of chromatin and ribosomes. A computational method that improves the recognition of DNA-binding proteins is therefore urgently needed.

Methods: We proposed a novel method, DBP-PSSM, which constructs features from the amino acid composition and evolutionary information of protein sequences. The maximum relevance, minimum redundancy (mRMR) method was employed to select the optimal features for an XGBoost classifier. The resulting model for predicting DNA-binding proteins, DBP-PSSM, was established with 5-fold cross-validation on the training dataset.
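The abstract does not spell out how the evolutionary features are built. As a minimal sketch, assuming the PSSM400 keyword refers to the commonly used construction in which the rows of a position-specific scoring matrix (PSSM) are summed per residue type, the aggregation could look like the following. The function name `pssm400`, the column order in `AMINO_ACIDS`, and the normalization by sequence length are illustrative assumptions, not details taken from the paper.

```python
# Assumed column order of the PSSM (the order PSI-BLAST typically uses).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def pssm400(sequence, pssm):
    """Aggregate an L x 20 PSSM into a 400-dimensional feature vector.

    For each of the 20 residue types, the PSSM rows at positions holding
    that residue are summed, giving a 20 x 20 matrix that is flattened
    row by row. Dividing by the sequence length is an assumption made
    here so that proteins of different lengths are comparable.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")

    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    sums = [[0.0] * 20 for _ in range(20)]

    for residue, row in zip(sequence, pssm):
        if len(row) != 20:
            raise ValueError("each PSSM row must have 20 scores")
        i = index.get(residue)
        if i is None:
            continue  # skip non-standard residues such as 'X'
        for j, score in enumerate(row):
            sums[i][j] += score

    length = len(sequence)
    return [value / length for row in sums for value in row]


# Toy example: a two-residue "protein" with a made-up PSSM.
features = pssm400("AA", [[1.0] * 20, [3.0] * 20])
print(len(features))
```

A vector of this kind would then be passed, together with the composition features, to feature selection and the classifier.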

Results: DBP-PSSM achieved an accuracy of 81.18% and a Matthews correlation coefficient (MCC) of 0.657 on a test dataset, outperforming many existing methods. These results demonstrate that our method can effectively predict DNA-binding proteins.
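For readers unfamiliar with the two reported metrics, the sketch below shows how accuracy and MCC are conventionally computed from binary labels (1 for DNA-binding, 0 for non-binding). The function name `accuracy_and_mcc` and the toy labels are hypothetical; they are not the authors' evaluation code or data.

```python
import math


def accuracy_and_mcc(y_true, y_pred):
    """Return (accuracy, MCC) for binary labels encoded as 0 and 1."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / (tp + tn + fp + fn)

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    # By convention it is set to 0 when the denominator vanishes.
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return accuracy, mcc


# Toy labels for illustration only.
acc, mcc = accuracy_and_mcc([1, 1, 0, 0], [1, 0, 0, 0])
print(f"accuracy={acc:.2f}, MCC={mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced.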

Conclusion: The data and source code are provided at https://github.com/784221489/DNA-binding.

Keywords: DNA-binding proteins, Local_DPP, PSSM400, sliding window and smoothing window, mRMR, XGBoost.

© 2022 Bentham Science Publishers