Title: Using Quadratic Discriminant Analysis to Predict Protein Secondary Structure Based on Chemical Shifts
VOLUME: 12 ISSUE: 1
Author(s): Li Z. Yuan, Feng Yong E, Zhao Wei and Kou G. Shan
Affiliation: College of Science, Inner Mongolia Agriculture University, Hohhot 010018, China.
Keywords: Chemical shifts, statistical distribution, 10-fold cross validation, quadratic discriminant analysis, protein secondary structure.
Abstract: Background: Prediction of protein three-dimensional structure is one of the most
important and active topics in bioinformatics. Predicting the secondary
structure of a protein from its amino acid sequence is an important step towards predicting its
three-dimensional structure. Many approaches have been proposed for the prediction of protein
secondary structure, and they have yielded improved results. However, these algorithms were primarily based on
features of the amino acid sequences.
Objective: In this paper, we introduce a new model for predicting the secondary structure of proteins.
Method: We used chemical shifts as a novel feature and combined them with quadratic discriminant
analysis to predict the secondary structure of proteins.
Results: A three-state overall prediction accuracy of 85.7% was obtained in the ten-fold cross-validated
test, and the accuracies for alpha helices, beta strands and coil reached 95.2%, 83.7% and 77.8%,
respectively. Moreover, to determine the importance of the chemical shifts of the six nuclei, we left out
one nucleus at a time and used the chemical shifts of the remaining five nuclei as features. The results
showed that the chemical shift of each nucleus plays a different role in the prediction of protein
secondary structure, and the maximum overall accuracy (Q3) reached 87.3% when using C, Cα, Cβ, Hα and N
as features.
Conclusion: Our model outperformed other state-of-the-art methods in terms of predictive accuracy. Our
results showed that quadratic discriminant analysis using chemical shifts as features is
indeed a good choice for predicting protein secondary structure.
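
The workflow summarized in the abstract (quadratic discriminant analysis on per-residue chemical shifts, ten-fold cross-validation, and leaving out one nucleus at a time) can be illustrated with a minimal sketch. This is not the authors' implementation. The nucleus names in `NUCLEI`, the helper names, and the synthetic Gaussian data are all assumptions made for illustration; real input would be measured chemical shifts with assigned helix/strand/coil labels, and the accuracies printed here have no relation to those reported above.

```python
import numpy as np

# Assumed names and column order for the six nuclei (hypothetical).
NUCLEI = ["C", "CA", "CB", "HA", "HN", "N"]

def fit_qda(X, y):
    """Estimate mean, inverse covariance, log-determinant and log-prior per class."""
    params = {}
    for label in np.unique(y):
        Xc = X[y == label]
        cov = np.atleast_2d(np.cov(Xc, rowvar=False))
        params[label] = (
            Xc.mean(axis=0),
            np.linalg.inv(cov),
            np.linalg.slogdet(cov)[1],
            np.log(len(Xc) / len(X)),
        )
    return params

def predict_qda(params, X):
    """Assign each row to the class with the largest quadratic discriminant score."""
    labels = list(params)
    scores = []
    for label in labels:
        mean, inv_cov, logdet, log_prior = params[label]
        d = X - mean
        maha = np.einsum("ij,jk,ik->i", d, inv_cov, d)  # squared Mahalanobis distance
        scores.append(-0.5 * maha - 0.5 * logdet + log_prior)
    return np.array(labels)[np.argmax(scores, axis=0)]

def cross_val_accuracy(X, y, k=10, seed=0):
    """Overall accuracy from k-fold cross-validation with shuffled folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    correct = 0
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        pred = predict_qda(fit_qda(X[train], y[train]), X[folds[i]])
        correct += int(np.sum(pred == y[folds[i]]))
    return correct / len(y)

def make_synthetic(n_per_class=200, seed=0):
    """Synthetic stand-in data: one Gaussian cloud per state (helix, strand, coil)."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for idx, label in enumerate(["helix", "strand", "coil"]):
        mean = rng.normal(scale=2.0, size=len(NUCLEI))
        spread = 0.5 + 0.5 * idx  # a different spread per class
        X.append(rng.normal(loc=mean, scale=spread, size=(n_per_class, len(NUCLEI))))
        y += [label] * n_per_class
    return np.vstack(X), np.array(y)

X, y = make_synthetic()
full_acc = cross_val_accuracy(X, y)

# Leave one nucleus out and re-evaluate with the remaining five features.
ablation = {
    name: cross_val_accuracy(np.delete(X, i, axis=1), y)
    for i, name in enumerate(NUCLEI)
}

print(f"all six nuclei: {full_acc:.3f}")
for name, acc in ablation.items():
    print(f"without {name}: {acc:.3f}")
```

Unlike linear discriminant analysis, the score used here keeps a separate covariance matrix for each class, which is what makes the decision boundaries quadratic. Comparing each entry of `ablation` with `full_acc` indicates how much the omitted nucleus contributes on the data at hand.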