Predicting Protein Structural Class for Low-Similarity Sequences via Novel Evolutionary Modes of PseAAC and Recursive Feature Elimination

Liang Kong; Lingfu Kong; Changwu Wang; Rong Jing; Lichao Zhang

Abstract

Background and Objective: Protein structural class prediction is a key first step in protein structure prediction and has become an active research area in biochemistry and bioinformatics. An important aspect of this task is finding a good feature representation. Prior work has demonstrated the effectiveness of PSI-BLAST profile-based feature extraction methods, especially for low-similarity protein sequences. However, prediction accuracies remain limited, which highlights the need to further explore the potential of evolutionary information.

Method: In this study, three novel sequence evolutionary modes of pseudo amino acid composition (PseAAC) are proposed and optimized by a two-stage feature selection process based on a recursive feature elimination strategy. The selected top-ranking features are then fed into a linear-kernel support vector machine classifier to predict the protein structural class. To evaluate the performance of the proposed method, jackknife tests are performed on three widely used low-similarity benchmark datasets (25PDB, 1189 and 640).
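The general shape of such a pipeline (rank features with a linear model, recursively eliminate the weakest, then score by a jackknife test) can be illustrated with a minimal sketch. It is not the authors' implementation. To stay dependency-free it assumes a two-class toy problem and swaps the linear-kernel SVM for a simple centroid-difference linear classifier. The data and the function names (fit_centroid_linear, rfe, jackknife_accuracy) are hypothetical.

```python
from statistics import mean

def fit_centroid_linear(X, y):
    """Linear stand-in for an SVM (assumption): w = mean(class 1) - mean(class 0)."""
    n_feat = len(X[0])
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    mu_pos = [mean(x[j] for x in pos) for j in range(n_feat)]
    mu_neg = [mean(x[j] for x in neg) for j in range(n_feat)]
    w = [p - n for p, n in zip(mu_pos, mu_neg)]
    # Place the decision boundary halfway between the two class centroids.
    b = -sum(wj * (p + n) / 2 for wj, p, n in zip(w, mu_pos, mu_neg))
    return w, b

def predict(w, b, x):
    """Return class 1 if the linear score is positive, else class 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

def rfe(X, y, n_keep):
    """Recursive feature elimination: refit, drop the smallest |weight|, repeat."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_active = [[x[j] for j in active] for x in X]
        w, _ = fit_centroid_linear(X_active, y)
        worst = min(range(len(active)), key=lambda i: abs(w[i]))
        del active[worst]
    return active  # indices of the surviving features

def jackknife_accuracy(X, y, n_keep):
    """Jackknife (leave-one-out) test: each sample is held out exactly once."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        # Select features on the training part only, so the held-out
        # sample never influences which features are kept.
        kept = rfe(X_train, y_train, n_keep)
        w, b = fit_centroid_linear([[x[j] for j in kept] for x in X_train], y_train)
        correct += predict(w, b, [X[i][j] for j in kept]) == y[i]
    return correct / len(X)

# Hypothetical toy data: feature 0 separates the classes, features 1 and 2 do not.
X = [
    [0.9, 0.5, 0.1], [0.8, 0.4, 0.2], [1.0, 0.6, 0.1],
    [0.1, 0.5, 0.2], [0.2, 0.6, 0.1], [0.0, 0.4, 0.2],
]
y = [1, 1, 1, 0, 0, 0]

selected = rfe(X, y, n_keep=1)
accuracy = jackknife_accuracy(X, y, n_keep=1)
print("selected feature indices:", selected)
print("jackknife accuracy:", accuracy)
```

With a real linear SVM the weights change after every elimination round, which is what makes the recursion worthwhile. With the centroid stand-in used here the weights of the surviving features stay the same, so the loop only demonstrates the control flow.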

Results: In a comprehensive comparison with current state-of-the-art methods, the proposed method achieves superior performance. The overall accuracies on the 25PDB, 1189 and 640 datasets are 96.2%, 97.9% and 99.5%, respectively, which are 1.9%, 1.5% and 2.3% higher than those of the previous best-performing method.

Conclusion: The satisfactory prediction accuracies achieved by the proposed method are attributed to the specially designed sequence evolutionary modes of PseAAC and the effective feature selection strategy, which together capture more discriminative sequence-order information. It is anticipated that our method will be helpful in other prediction problems in protein research.

Keywords: Feature selection, position specific score matrix, protein structural class, recursive feature elimination, sequence similarity, support vector machine.

DOI: https://dx.doi.org/10.2174/1570178614666170511165837
Print ISSN: 1570-1786
Online ISSN: 1875-6255
Publisher Name: Bentham Science Publisher

Letters in Organic Chemistry
