FermatS: A Novel Numerical Representation for Protein Sequence Comparison and DNA-binding Protein Identification

(E-pub Ahead of Print)

Author(s): Yanping Zhang*, Ya Gao, Jianwei Ni, Pengcheng Chen, Xiaosheng Wang

Journal Name: Combinatorial Chemistry & High Throughput Screening
Accelerated Technologies for Biotechnology, Bioassays, Medicinal Chemistry and Natural Products Research



Abstract:

Aim and Objective: Given the rapidly increasing amount of molecular biology data available, low-complexity computational methods are needed to infer protein structure, function, and evolution.

Method: In this work, we propose a novel method, FermatS, which extracts feature information from protein sequences based on global position information from the Fermat spiral curve and local position representation from normalized moments of inertia. Furthermore, we use the features generated by FermatS to analyze the similarity/dissimilarity of nine ND5 proteins and to build a prediction model for DNA-binding proteins based on logistic regression with 5-fold cross-validation.
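
The abstract does not give the exact construction, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each residue is placed on a Fermat spiral r = a * sqrt(theta) at a fixed angular step, is weighted by an approximate average residue mass, and that the "normalized moment of inertia" is the mass-weighted mean squared distance from the centre of mass. The names `RESIDUE_MASS`, `fermat_spiral_points`, and `normalized_moment_of_inertia`, and the parameters `a` and `step`, are hypothetical.

```python
import math

# Approximate average residue masses in daltons (illustrative values;
# the mass table actually used by FermatS is not stated in the abstract).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}


def fermat_spiral_points(sequence, a=1.0, step=math.pi / 4):
    """Place residue i (1-based) on the spiral r = a * sqrt(theta).

    Returns a list of (x, y, mass) tuples. The angular step is an
    assumption. Residues outside the 20 standard letters raise KeyError.
    """
    points = []
    for i, residue in enumerate(sequence, start=1):
        theta = i * step
        r = a * math.sqrt(theta)
        points.append((r * math.cos(theta), r * math.sin(theta),
                       RESIDUE_MASS[residue]))
    return points


def normalized_moment_of_inertia(points):
    """Moment of inertia about the centre of mass, divided by total mass."""
    total = sum(m for _, _, m in points)
    cx = sum(x * m for x, _, m in points) / total
    cy = sum(y * m for _, y, m in points) / total
    inertia = sum(m * ((x - cx) ** 2 + (y - cy) ** 2) for x, y, m in points)
    return inertia / total
```

Under these assumptions, `normalized_moment_of_inertia(fermat_spiral_points(seq))` yields one scalar per sequence; applying the same function to sliding windows of the point list would be one way to obtain local descriptors.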

Results: In the similarity/dissimilarity analysis of the nine ND5 proteins, the results are consistent with evolutionary theory. Moreover, the method can effectively predict DNA-binding proteins in realistic situations.
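
The evaluation protocol named in the Method paragraph, logistic regression with 5-fold cross-validation, can be illustrated with a small dependency-free sketch. This is an assumed, generic version of the technique: the optimizer (plain batch gradient descent), the learning rate, the epoch count, the shuffling seed, and all function names are hypothetical choices, and the feature vectors `X` would in practice come from the FermatS features rather than from toy data.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return class 1 if the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def k_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds (requires len(X) >= k)."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        w, b = fit_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(w, b, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k
```

Each sample is held out exactly once, so the returned value averages accuracy on data the model did not see during fitting.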

Conclusion: The findings demonstrate that the proposed method is effective for comparing, recognizing, and predicting protein sequences. The main code and datasets can be downloaded from https://github.com/GaoYa1122/FermatS.

Keywords: Fermat spiral; Mass; Moment of inertia; Similarity/dissimilarity of species; Identification of DNA-binding proteins; Logistic regression


Article Details

DOI: 10.2174/1386207323999201117111738
