Protein Secondary Structure Prediction: A Review of Progress and Directions

Author(s): Tomasz Smolarczyk, Irena Roterman-Konieczna, Katarzyna Stapor*.

Journal Name: Current Bioinformatics

Volume 15, Issue 2, 2020

Abstract:

Background: Over the last few decades, the search for a theory of protein folding has grown into a full-fledged research field at the intersection of biology, chemistry and informatics. Despite enormous effort, open questions and challenges remain, such as understanding the rules by which the amino acid sequence determines protein secondary structure.

Objective: In this review, we trace the progress of prediction methods over the years and identify the sources of improvement.

Methods: The protein secondary structure prediction problem is described, followed by a discussion of its theoretical limitations, a description of the commonly used data sets and features, and a review of three generations of methods with a focus on the most recent advances. Additionally, methods with available online servers are assessed on an independent data set.

Results: State-of-the-art methods currently reach almost 88% accuracy for 3-class prediction and 76.5% for 8-class prediction.
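
For readers unfamiliar with how 3-class and 8-class figures of this kind are typically obtained, the following is a minimal sketch of per-residue accuracy, assuming it is the metric behind the numbers above. The function names are hypothetical, and the 8-to-3 mapping is one common reduction of the DSSP 8-state alphabet, assumed here rather than taken from the review; other reductions exist and change the 3-class score.

```python
# Assumption: one common reduction of the DSSP 8-state alphabet to 3 states.
# Other published reductions differ (for example in how G, I and B are treated).
DSSP8_TO_3 = {
    "H": "H", "G": "H", "I": "H",   # helical states -> helix
    "E": "E", "B": "E",             # strand and bridge -> strand
    "T": "C", "S": "C", "-": "C",   # turn, bend, other -> coil
}


def reduce_to_3(ss8: str) -> str:
    """Map an 8-state secondary structure string to 3 states."""
    return "".join(DSSP8_TO_3[state] for state in ss8)


def per_residue_accuracy(observed: str, predicted: str) -> float:
    """Percentage of residues whose predicted state equals the observed one."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * matches / len(observed)


# Toy example (made-up strings, not real protein data).
observed8 = "HHHGTEEB-S"
predicted8 = "HHHHTEEE-T"

q8 = per_residue_accuracy(observed8, predicted8)
q3 = per_residue_accuracy(reduce_to_3(observed8), reduce_to_3(predicted8))
print(f"8-class accuracy: {q8:.1f}%")
print(f"3-class accuracy: {q3:.1f}%")
```

In this toy case the three 8-class mismatches all fall within the same reduced class, so the 3-class score is higher than the 8-class score, which illustrates why 8-class prediction is the harder task.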

Conclusion: This review summarizes recent advances and outlines further research directions.

Keywords: Protein secondary structure prediction, multiple sequence alignment, PSSM, HHblits, deep neural networks, machine learning, protein early-stage structure.

Article Details

VOLUME: 15
ISSUE: 2
Year: 2020
Page: [90 - 107]
Pages: 18
DOI: 10.2174/1574893614666191017104639
