Background: SARS-CoV-2 is a variant within the known SARS coronavirus family. Viruses mutate
very rapidly, and these mutations can play a crucial role in the evolution or devolution of the organism, with a direct impact on “host
jumping” and the pathogenicity of the virus.
Objective: The study aims to understand the frequency of genomic variations that have occurred in the virus affecting the
Indian sub-population. The impact of variations that translate to protein-level changes, and their consequences for protein stability and
interactions, was also studied.
Method: Phylogenetic analysis of 140 genomes from the Indian region was performed, followed by SNP and indel
analysis of both CDS and non-CDS regions. This was followed by prediction of the mutations occurring in eight proteins of
interest and of their impact on protein stability and prospective drug interactions.
Results: The genomes showed variability in origin, and the major branches could be mapped to the 2002 outbreak of SARS. The
mutation frequency in CDS regions showed that 241 C>T, 3037 C>T, 2836 C>T, and 6312 C>A occurred in 81.5% of
genomes and mapped to major genes. The corresponding mutations were mapped to protein sequences. The effects of mutations
occurring in the spike glycoprotein, RNA-dependent RNA polymerase, nsp8, nucleocapsid, and 3C protease were also mapped.
Conclusion: Whilst the mutations in the spike glycoprotein increased protein stability, the residues undergoing
mutation were also part of the drug-binding pockets for hydroxychloroquine. Mutations occurring in the other proteins of interest
led to a decrease in protein stability. These mutated residues were also part of the drug-binding pockets for Favipiravir,
Remdesivir and Dexamethasone. The work motivates the analysis of larger datasets to understand mutation pattern