Prediction of Lysine Malonylation Sites Based on Pseudo Amino Acid Compositions (E-pub Ahead of Print)
Yuewu Liu
Protein malonylation is a newly discovered post-translational modification. Owing to the limitations of experimental techniques, identifying malonylation sites quickly and accurately remains a great challenge. We propose a computational method to address this problem: protein segments are extracted so that a lysine lies at the center of each segment, and each segment is then encoded by its pseudo amino acid composition. A support vector machine classifier was trained on a training dataset to distinguish malonylation sites from non-malonylation sites. The leave-one-out test on the training dataset reached an accuracy of 0.7733, and the independent test on the testing dataset reached an accuracy of 0.8889. Furthermore, the classifier correctly identified 144 of 160 putative malonylation sites. Analysis of the differences between malonylation and non-malonylation segments suggests that lysine malonylation follows a specific pattern; for example, a lysine whose neighbors are glycine and alanine may be more likely to be malonylated. The proposed method is therefore expected to be a promising tool for identifying malonylation sites.
Keywords: Protein post-translational modification, lysine malonylation, support vector machine, pseudo amino acid composition, leave-one-out test.
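The segment-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flank width, and the padding character `X` are assumptions made for illustration, and the encoding shown is only the plain 20-residue composition, not the full pseudo amino acid composition, which also includes sequence-order terms that the abstract does not specify.

```python
from collections import Counter

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def lysine_segments(sequence, flank=3, pad="X"):
    """Yield (1-based position, segment) for every lysine in the sequence.

    Each segment has length 2 * flank + 1 with the lysine at its center.
    The flank width and the pad character are illustrative assumptions.
    """
    padded = pad * flank + sequence + pad * flank
    for i, residue in enumerate(sequence):
        if residue == "K":
            yield i + 1, padded[i:i + 2 * flank + 1]


def composition(segment):
    """Return the 20 amino acid frequencies of a segment.

    Padding and other non-standard characters are ignored.
    """
    counts = Counter(r for r in segment if r in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the paper's datasets.
    for position, segment in lysine_segments("MGKAAKL", flank=2):
        print(position, segment, composition(segment))
```

Feature vectors of this kind could then be passed to any support vector machine implementation for training and leave-one-out evaluation.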