Background: Intrinsically disordered proteins lack a well-defined three-dimensional structure
under physiological conditions. They perform multiple functions in biological processes and are closely
related to many human diseases. Identifying the disordered regions of intrinsically disordered
proteins is important for protein function annotation.
Objective: To accurately identify the disordered regions in intrinsically disordered proteins.
Methods: In this study, we constructed a multi-feature fusion model based on a support vector machine to
predict disordered regions of intrinsically disordered proteins from the DisProt database. We extracted
codon usage frequencies, GC content, protein secondary structure components, hydrophilic-hydrophobic
amino acid components, and chemical shifts as features to predict the disordered regions of intrinsically
disordered proteins.
Results: In single-feature prediction, codon frequencies gave the best accuracy, 82.098%. To
improve performance, we fused these features and obtained the best accuracy, 83.173%, by combining
codon frequencies with chemical shifts.
Conclusion: The results show that our model achieves good performance in predicting disordered
regions of intrinsically disordered proteins. Moreover, our model performs better than
existing methods.
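
Illustrative sketch: The feature fusion named in the Methods can be pictured as concatenating per-feature vectors into one input vector for the classifier. The Python sketch below is a minimal illustration under stated assumptions, not the procedure used in this study: it computes only two of the listed features (codon usage frequencies and GC content) from an in-frame coding sequence, and the function names, the fixed 64-codon ordering, and the example sequence are all hypothetical.

```python
from itertools import product

# All 64 codons in a fixed order, so every sequence maps to the same feature layout.
# (Assumed layout for illustration; the study does not specify one.)
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(cds):
    """Relative frequency of each of the 64 codons in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = {c: 0 for c in CODONS}
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if codon in counts:  # skip codons containing ambiguous bases such as N
            counts[codon] += 1
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in CODONS]


def gc_content(cds):
    """Fraction of G and C bases in the sequence."""
    cds = cds.upper()
    return (cds.count("G") + cds.count("C")) / len(cds) if cds else 0.0


def fuse(*feature_groups):
    """Feature fusion by simple concatenation of feature vectors."""
    fused = []
    for group in feature_groups:
        fused.extend(group)
    return fused


# Hypothetical toy sequence: ATG GCC GCC AAA TAA
example_cds = "ATGGCCGCCAAATAA"
features = fuse(codon_frequencies(example_cds), [gc_content(example_cds)])
```

A fused vector of this kind (here 64 codon frequencies plus one GC value) would then be passed to a support vector machine; the kernel and hyperparameters are not stated in this abstract and are left open here.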