Background: Essential proteins play an important role in the processes of life and can be
identified by experimental methods or computational approaches. Experimental approaches to identifying
essential proteins are highly accurate but time- and resource-consuming.
Objective: Herein, we present a computational model (PEPRF) to identify essential proteins based on
Methods: Several types of protein features were extracted. Topological features were extracted from the
Protein-Protein Interaction (PPI) network. From the protein sequence, graph theory-based features,
information-based features, composition and physicochemical features, etc., were extracted. In total, 282
features were constructed. To select the features that contributed most to the identification, a
ReliefF-based feature selection method was adopted to measure the weights of these features.
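
As an illustration of how Relief-style weighting ranks features, the following is a minimal sketch only. It assumes binary labels and a single nearest hit and miss per sample (the original Relief idea), whereas ReliefF averages over k neighbours and handles multiple classes. The function names `relief_weights` and `select_top` and the toy data are hypothetical and are not taken from PEPRF.

```python
def relief_weights(X, y):
    """Simplified Relief: reward features that differ at the nearest miss
    and penalise features that differ at the nearest hit."""
    n, d = len(X), len(X[0])
    # Scale every feature to [0, 1] so differences are comparable.
    lo = [min(row[j] for row in X) for j in range(d)]
    hi = [max(row[j] for row in X) for j in range(d)]
    span = [(hi[j] - lo[j]) or 1.0 for j in range(d)]
    Z = [[(row[j] - lo[j]) / span[j] for j in range(d)] for row in X]

    def dist(a, b):
        # Manhattan distance between two scaled samples.
        return sum(abs(p - q) for p, q in zip(a, b))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        h = min(hits, key=lambda k: dist(Z[i], Z[k]))    # nearest same-class sample
        m = min(misses, key=lambda k: dist(Z[i], Z[k]))  # nearest other-class sample
        for j in range(d):
            w[j] += (abs(Z[i][j] - Z[m][j]) - abs(Z[i][j] - Z[h][j])) / n
    return w


def select_top(weights, k):
    """Return the indices of the k highest-weighted features."""
    order = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return order[:k]


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.8, 0.8], [0.9, 0.2], [1.0, 0.5]]
y = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X, y)
selected = select_top(weights, 1)
```

On this toy data the informative feature receives the larger weight, so a top-k cut retains it and drops the noise feature.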
Results: As a result, 212 features were selected to train random forest classifiers. PEPRF achieved an
AUC of 0.71 and an accuracy of 0.742.
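
For readers unfamiliar with the two reported metrics, the sketch below shows how AUC and accuracy can be computed from classifier scores. It assumes binary labels (1 = essential, 0 = non-essential) and a 0.5 decision threshold. The labels and scores are made-up toy values, not PEPRF outputs, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values).
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.4, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

auc = roc_auc(y_true, scores)    # 7 of 9 positive/negative pairs are ranked correctly
acc = accuracy(y_true, y_pred)   # 4 of 6 labels are predicted correctly
```

AUC depends only on how the scores rank the proteins, while accuracy also depends on the chosen threshold, which is why the two figures can differ.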
Conclusion: Our results show that PEPRF may be applied as an efficient tool to identify essential proteins.