Prediction of Nitration Sites Based on FCBF Method and Stacking Ensemble Model

Min Liu; Lu Zhang; Xinyi Qin; Tao Huang; Ziwei Xu; Guangzhong Liu

Abstract

Background: Nitration is an important post-translational modification (PTM) that occurs on the tyrosine residues of proteins. Under disease conditions, protein tyrosine nitration is inevitable and represents a shift from the signal-transducing physiological actions of •NO to oxidative and potentially pathogenic pathways. Abnormal protein nitration can lead to serious human diseases, including neurodegenerative diseases, acute respiratory distress, organ transplant rejection, and lung cancer.

Objective: Identifying nitration sites in protein sequences is therefore important. Predicting which tyrosine residues in a protein sequence are nitrated and which are not is of great significance for studying the nitration mechanism and related diseases.

Methods: This study proposes a nitration site prediction model that combines multi-feature fusion, FCBF feature ranking, an over- and under-sampling strategy, and stacking ensemble learning. First, each protein sequence sample was encoded as a 2701-dimensional fusion feature vector (PseAAC, PSSM, AAIndex, CKSAAP, Disorder). Second, a ranked feature set was generated by the FCBF method according to the symmetric uncertainty metric. Third, during model training, over- and under-sampling was used to handle the imbalanced dataset. Finally, the Incremental Feature Selection (IFS) method was used to select an optimal classifier based on 10-fold cross-validation.
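As a rough illustration of the ranking step, symmetric uncertainty between a discrete feature X and the class label Y is commonly defined as SU(X, Y) = 2 · IG(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and IG is the information gain. The sketch below covers only this ranking metric, not the redundancy-removal stage of FCBF, and it assumes features that are already discretized. The function names and the toy feature columns are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so nothing can be shared
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    info_gain = h_x + h_y - h_xy      # IG(X; Y) = H(X) + H(Y) - H(X, Y)
    return 2.0 * info_gain / (h_x + h_y)


def rank_by_su(features, labels):
    """Return feature names sorted by SU with the label, highest first."""
    scores = {name: symmetric_uncertainty(col, labels)
              for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (made up): 1 = nitrated site, 0 = non-nitrated site.
labels = [1, 1, 0, 0, 1, 0, 0, 0]
features = {
    "f_copy":  [1, 1, 0, 0, 1, 0, 0, 0],   # identical to the label
    "f_noisy": [1, 0, 0, 0, 1, 0, 1, 0],   # partially informative
    "f_const": [0, 0, 0, 0, 0, 0, 0, 0],   # carries no information
}
ranking = rank_by_su(features, labels)
print(ranking)
```

Continuous encodings such as PSSM or AAIndex values would need to be binned before a count-based entropy like this one applies. The abstract does not say how the authors discretized their features.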

Results and Conclusion: The model shows significant performance advantages in MCC, Recall, and F1-score, both when compared with other classifiers on the independent test set and when evaluated by cross-validation on the training set with single-type features or fusion features. By integrating the FCBF feature ranking method, the over- and under-sampling technique, and a stacking model composed of multiple base classifiers, an effective prediction model for nitration PTM sites was built, which can achieve a better recall rate when the ratio of positive to negative samples is highly imbalanced.
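To make the reported indicators concrete, the following minimal sketch computes Recall, F1-score, and MCC from confusion-matrix counts using their standard definitions. The counts are invented for illustration and are not results from the paper. They are chosen only to show why accuracy alone can mislead on a highly imbalanced dataset.

```python
import math


def recall(tp, fn):
    """Fraction of true positive sites that were recovered."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in [-1, 1]; 0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical test set: 100 nitrated sites and 1000 non-nitrated sites.
# Classifier A predicts "negative" for every sample.
a = dict(tp=0, tn=1000, fp=0, fn=100)
# Classifier B recovers most positives at the cost of some false alarms.
b = dict(tp=80, tn=900, fp=100, fn=20)

for name, m in (("A", a), ("B", b)):
    print(name,
          "accuracy=%.3f" % accuracy(**m),
          "recall=%.3f" % recall(m["tp"], m["fn"]),
          "F1=%.3f" % f1_score(m["tp"], m["fp"], m["fn"]),
          "MCC=%.3f" % mcc(**m))
```

In this made-up case, classifier A has the higher accuracy even though it finds no nitration sites, while classifier B is clearly better on Recall, F1-score, and MCC. This is the usual reason for preferring these indicators when positive samples are rare.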

Keywords: PTM (post-translational modification), nitration site, FCBF (fast correlation-based filter), stacking model, over- and under-sampling, Incremental Feature Selection (IFS).

DOI: https://dx.doi.org/10.2174/1570164618999210101222637
Print ISSN: 1570-1646
Online ISSN: 1875-6247
Publisher Name: Bentham Science Publisher

Current Proteomics