An Effective Cumulative Torsion Angles Model for Prediction of Protein Folding Rates

Author(s): Yanru Li, Ying Zhang*, Jun Lv*.

Journal Name: Protein & Peptide Letters

Volume 27, Issue 4, 2020

Abstract:

Background: Protein folding rate is mainly determined by the size of the conformational space to be searched, which in turn is dictated by factors such as the size, structure and amino-acid sequence of the protein. It is important to integrate these factors effectively to form a more precise description of the conformational space. However, there is no general paradigm for doing so beyond intuition and empirical rules. At the present stage, therefore, predictions of the folding rate can be improved by finding new factors, which also gives some insight into how these factors should be integrated.

Objective: To propose a new parameter that describes the size of the conformational space and thereby improves the prediction accuracy of protein folding rates.

Methods: Based on the optimal set of amino acids in a protein, an effective cumulative backbone torsion angles parameter (CBTAeff) was proposed to describe the size of the conformational space. A linear regression model was used to predict protein folding rates with CBTAeff as the parameter. The degree of correlation between the predicted folding rates and the experimental observations was described by the coefficient of determination and the mean absolute error (MAE).

Results: The model achieved a high correlation (coefficient of determination of 0.70 and MAE of 1.88) between the logarithm of the folding rates and (CBTAeff)^0.5 for experimental data on 112 two- and multi-state folding proteins.
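The regression and the two quality measures named above can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of the logarithm of the folding rate against the square root of CBTAeff. The abstract does not say how CBTAeff is computed or which logarithm base is used, so the input values below are synthetic placeholders, and the function names (`fit_line`, `r_squared`, `mean_absolute_error`) are hypothetical helpers rather than the authors' code.

```python
import numpy as np


def fit_line(x, y):
    """Ordinary least-squares fit y ~ a + b*x; returns (a, b)."""
    # np.polyfit returns coefficients highest degree first: [slope, intercept].
    b, a = np.polyfit(x, y, 1)
    return a, b


def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y, y_pred):
    """Mean absolute difference between observed and predicted values."""
    return np.mean(np.abs(y - y_pred))


# Synthetic placeholder data (NOT the paper's dataset): 112 made-up CBTAeff
# values and made-up log folding rates that decrease with sqrt(CBTAeff).
rng = np.random.default_rng(0)
cbta_eff = rng.uniform(50.0, 5000.0, size=112)
x = np.sqrt(cbta_eff)  # the (CBTAeff)^0.5 predictor
log_kf = 12.0 - 0.2 * x + rng.normal(0.0, 2.0, size=112)

a, b = fit_line(x, log_kf)
pred = a + b * x
r2 = r_squared(log_kf, pred)
mae = mean_absolute_error(log_kf, pred)
print(f"intercept={a:.2f} slope={b:.3f} R^2={r2:.2f} MAE={mae:.2f}")
```

For a one-predictor least-squares fit with an intercept, the coefficient of determination equals the squared Pearson correlation between predictor and response, so either quantity can be reported interchangeably in this setting.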

Conclusion: The remarkable performance of our simple model demonstrates that the CBTA based on the optimal set is the major determinant of the conformational space of natural proteins.

Keywords: Folding rates, conformational space, optimal set of amino acids, cumulative backbone torsion angles, linear regression model, correlation.

Article Details

VOLUME: 27
ISSUE: 4
Year: 2020
Page: [321 - 328]
Pages: 8
DOI: 10.2174/0929866526666191014152207