Protein Secondary Structure Prediction: A Review of Progress and Directions

Tomasz Smolarczyk; Irena Roterman-Konieczna; Katarzyna Stapor
Abstract

Background: Over the last few decades, the search for a theory of protein folding has grown into a full-fledged research field at the intersection of biology, chemistry and informatics. Despite enormous effort, open questions and challenges remain, such as understanding the rules by which an amino acid sequence determines protein secondary structure.
Objective: In this review, we trace the progress of prediction methods over the years and identify the sources of improvement.
Methods: We describe the protein secondary structure prediction problem, discuss its theoretical limitations, describe the commonly used data sets and features, and review three generations of methods with a focus on the most recent advances. Additionally, methods with available online servers are assessed on an independent data set.
Results: State-of-the-art methods currently reach almost 88% accuracy for 3-class prediction and 76.5% for 8-class prediction.
Conclusion: This review summarizes recent advances and outlines further research directions.
Keywords: Protein secondary structure prediction, multiple sequence alignment, PSSM, HHblits, deep neural networks, machine learning, protein early-stage structure.

Journal: Current Bioinformatics
Publisher Name: Bentham Science Publisher
Print ISSN: 1574-8936
Online ISSN: 2212-392X
DOI: https://dx.doi.org/10.2174/1574893614666191017104639
