Background: Protein folding rate is mainly determined by the size of the
conformational space to be searched, which in turn is dictated by factors such as the size, structure and
amino-acid sequence of the protein. It is important to integrate these factors effectively to form a
more precise description of the conformational space. However, there is no general paradigm for doing so,
only intuitions and empirical rules. Therefore, at the present stage, predictions of
the folding rate can be improved through finding new factors, and some insights are given to the
Objective: To propose a new parameter that describes the size of the
conformational space and thereby improves the prediction accuracy of protein folding rates.
Methods: Based on the optimal set of amino acids in a protein, an effective cumulative backbone
torsion angle (CBTAeff) was proposed to describe the size of the conformational space. A linear
regression model was used to predict the protein folding rate with CBTAeff as a parameter. The degree
of correlation was described by the coefficient of determination and the mean absolute error (MAE)
between the predicted folding rates and experimental observations.
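The evaluation described in Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an ordinary least-squares fit of the logarithm of the folding rate against the square root of CBTAeff, and the function names (`fit_line`, `r_squared`, `mean_absolute_error`) and all numeric values are hypothetical placeholders rather than data from the study.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y ~ a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y, y_pred):
    """Mean of |observed - predicted|."""
    return sum(abs(yi - pi) for yi, pi in zip(y, y_pred)) / len(y)


# Hypothetical toy values for illustration only (NOT data from the study).
cbta_eff = [100.0, 400.0, 900.0, 1600.0, 2500.0]
log_kf = [8.1, 5.9, 4.2, 1.8, 0.1]

# Assumed predictor: the square root of CBTAeff.
x = [math.sqrt(c) for c in cbta_eff]
intercept, slope = fit_line(x, log_kf)
predicted = [intercept + slope * xi for xi in x]

print("R^2:", r_squared(log_kf, predicted))
print("MAE:", mean_absolute_error(log_kf, predicted))
```

The sketch uses only the standard library so that each quantity (slope, intercept, coefficient of determination, MAE) is computed explicitly; a statistics package would give the same fit.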
Results: The model achieved a high correlation (coefficient of determination of 0.70 and MAE of
1.88) between the logarithm of the experimental folding rates and (CBTAeff)^0.5 over 112 two- and
multi-state folding proteins.
Conclusion: The remarkable performance of our simple model demonstrates that CBTA based
on the optimal set is the major determinant of the conformational space of natural proteins.