Background: The global outbreak of the 2019 novel Coronavirus Disease (COVID-19), caused by infection with the Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), which appeared in China at the end of
2019, is currently a major public health issue.
Objective: The objective of the present study is to characterize the physicochemical properties of the SARS-CoV-2 proteins at the residue level and to generate, for each sequence, a “bioinformatics fingerprint” in the form of a “PIM® profile”
created with the Polarity Index Method® (PIM®) and suitable for the identification of these proteins.
Methods: Two different bioinformatics approaches were used to analyze the sequence characteristics of these proteins at
the residue level: an in-house bioinformatics system, PIM®, and a set of commonly used algorithms for the prediction of protein intrinsic disorder predisposition, such as PONDR® VLXT, PONDR® VL3, PONDR® VSL2, PONDR®
FIT, IUPred_short and IUPred_long. The PIM® profile was generated for four SARS-CoV-2 structural proteins and
compared with the corresponding profiles of the SARS-CoV-2 non-structural proteins, SARS-CoV-2 putative proteins,
SARS-CoV proteins, MERS-CoV proteins, sets of bacterial, fungal, and viral proteins, cell-penetrating peptides, and a
set of intrinsically disordered proteins. We also searched for the UniProt proteins with PIM® profiles similar to those of
SARS-CoV-2 structural, non-structural, and putative proteins.
Results: We show that SARS-CoV-2 structural, non-structural, and putative proteins are characterized by a unique
PIM® profile. Of the 562,253 “reviewed” proteins in the UniProt database, 1736 were identified
as having a PIM® profile similar to that of the SARS-CoV-2 structural, non-structural, and putative proteins.
Conclusion: The PIM® profile represents an important characteristic that might be useful for the identification of proteins similar to SARS-CoV-2 proteins.