Analysis and Prediction of Nitrated Tyrosine Sites with the mRMR Method and Support Vector Machine Algorithm

Shao Peng Wang; Qing Zhang; Jing Lu; Yu-Dong Cai

Abstract

Background: Tyrosine nitration is an important post-translational modification, a covalent substitution involved in many biochemical processes that are closely related to several human diseases.

Objective: The correct recognition of nitration sites is therefore useful for disease diagnosis and the design of effective treatments. However, traditional experimental techniques for identifying nitration sites are time-consuming, labor-intensive, and expensive. Effective computational methods offer an alternative way to tackle this problem.

Method: In this study, we proposed a computational workflow to identify and analyze nitrated tyrosine residues in proteins. Specifically, each nitrated tyrosine was represented by features derived from a segment of the protein's amino acid sequence containing the nitrated tyrosine site. A reliable feature selection method, minimum redundancy maximum relevance (mRMR), was adopted to analyze these features. Incremental feature selection (IFS) and a support vector machine trained with sequential minimal optimization (SMO) were then employed to extract core features and build an optimal prediction classifier.
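The incremental feature selection step can be illustrated with a minimal sketch. It assumes the features have already been ranked (for example by mRMR) and that a scoring function is available. The names `incremental_feature_selection`, `evaluate`, and the toy scores are hypothetical and are not taken from the study. In a real workflow `evaluate` would train a classifier on the given feature subset and return a performance measure such as MCC.

```python
from typing import Callable, List, Sequence, Tuple


def incremental_feature_selection(
    ranked_features: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Score every top-k prefix of a ranked feature list and keep the best one."""
    best_subset: List[str] = []
    best_score = float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = list(ranked_features[:k])  # top-k features in ranked order
        score = evaluate(subset)
        if score > best_score:  # ties keep the smaller subset
            best_subset, best_score = subset, score
    return best_subset, best_score


# Toy stand-in for a real evaluation (assumption): each feature adds a fixed
# amount to the score, so later, less useful features lower it.
toy_gain = {"f1": 0.4, "f2": 0.2, "f3": -0.1, "f4": -0.3}


def toy_evaluate(subset: List[str]) -> float:
    return sum(toy_gain[name] for name in subset)


best, score = incremental_feature_selection(["f1", "f2", "f3", "f4"], toy_evaluate)
print(best)
```

With these toy values the best prefix is the first two features. The sketch keeps only the loop structure of the technique; the classifier and validation scheme are left to the caller.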

Results: A total of 223 features were extracted and used to build the optimal prediction classifier, which yielded a Matthews correlation coefficient (MCC) of 0.717 on the training set. The nitration sites in the testing set were extracted from the UniProt database based on the sequence similarity technique and were all denoted as positive samples. The sensitivity of the optimal classifier was 0.950 on the testing set. The results demonstrate the effectiveness and importance of the optimal features and the classifier for the recognition of nitration sites. In addition, three other methods, the nearest neighbor algorithm (NNA), Dagging, and random forest (RF), were applied to the training and testing sets, and their results were compared with those of the SMO.
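For reference, the two measures reported above can be computed from confusion-matrix counts as in the sketch below. The function names and the example counts are illustrative assumptions, not data from the study. The sketch also shows why a testing set made up only of positive samples is summarized by sensitivity: with no negatives, the MCC denominator is zero and the coefficient is undefined (returned as 0.0 here by convention).

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # undefined case, reported as 0.0 by convention
    return (tp * tn - fp * fn) / denom


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true positive samples that are predicted positive."""
    total = tp + fn
    return tp / total if total else 0.0


# Hypothetical counts: 20 positive samples, 19 of them recognized.
print(sensitivity(tp=19, fn=1))
# Same counts with no negative samples: the MCC denominator is zero.
print(mcc(tp=19, tn=0, fp=0, fn=1))
```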

Conclusion: Of the 223 selected features, 61 core features were analyzed, and this analysis revealed the essential residue types and conserved sites proximal to the central tyrosine residue.

Keywords: Post-translational modification, tyrosine nitration prediction, minimum redundancy maximum relevance, support vector machine, incremental feature selection.

DOI: https://dx.doi.org/10.2174/1574893611666160608075753
Print ISSN: 1574-8936
Online ISSN: 2212-392X
Publisher Name: Bentham Science Publisher

Current Bioinformatics
