Sequence-Based Methods for Real Value Predictions of Protein Structure
Recent years have seen growing interest in computational methods that predict and characterize protein structure, driven by the widening sequence-structure gap. This includes a surge in the development of sequence-based in-silico methods that predict several newly formulated real-value descriptors of protein structure. These descriptors include B-factor, backbone torsion angles, solvent accessibility, residue depth, contact number, residue-wise contact order, secondary structure content, and folding rates. Although they address different structural aspects, such as exposure to the solvent, spatial position and packing of the residues, residue flexibility, the amount of secondary structure in the protein, and folding time, the methods built to predict them share similarities that could be exploited to improve future designs. To date, no comprehensive overview that summarizes and contrasts the solutions developed for these tasks has been published. To address this gap, we compare different designs of real-value predictors in terms of their input data encoding and the prediction algorithms they use. We also investigate evaluation standards, which include the benchmark datasets, test criteria, and test procedures used in these predictive tasks. Finally, we summarize the application areas and problems that use the above-mentioned predictions. We believe that the breadth and number of these applications justify further development of more accurate and integrated real-value prediction methods.
Keywords: Real-value prediction, protein structure, solvent accessibility, residue depth, contact number, residue-wise contact order, secondary structure content, backbone torsion angles, B-factor, folding rate