Background: SARS-CoV-2 is a member of the SARS coronavirus family. Viruses
mutate rapidly, and these mutations can play a crucial role in the evolution or devolution of the
organism. This has a direct impact on “host jumping” and the pathogenicity of the virus.
Objective: The study aims to understand the frequency of genomic variations that have occurred in
the virus affecting the Indian sub-population. The impact of these variations on the encoded proteins,
and their consequences for protein stability and interactions, were studied.
Methods: Phylogenetic analysis of 140 genomes from India was performed, followed
by SNP and indel analysis of both CDS and non-CDS regions. This was followed by
prediction of the mutations occurring in 8 proteins of interest and of their impact on protein stability and
prospective drug interactions.
Results: Genomes showed variability in origin, and major branches could be mapped to the 2002 outbreak
of SARS. Analysis of mutation frequency in CDS regions showed that 241 C>T, 3037 C>T, 2836
C>T, and 6312 C>A occurred in 81.5% of genomes, mapping to major genes. Corresponding mutations
were mapped to protein sequences. The effect of mutations occurring in spike glycoprotein,
RNA-dependent RNA polymerase, nsp8, nucleocapsid and 3C-like protease was also depicted.
Conclusion: While the mutations in spike glycoprotein showed an increase in protein stability,
the residues undergoing mutation were also part of the drug-binding pocket for hydroxychloroquine.
Mutations occurring in other proteins of interest led to a decrease in protein stability. These
mutations were also part of the drug-binding pockets for favipiravir, remdesivir and dexamethasone.
The approach can be applied to larger datasets to understand mutation patterns globally.
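
The SNP frequency step summarized in Methods and Results can be illustrated with a minimal sketch. It assumes the genomes have already been aligned to a reference of equal length. The function name `snp_frequencies` and the toy sequences are hypothetical and are not taken from the study, whose actual tooling is not described here.

```python
from collections import Counter


def snp_frequencies(reference, genomes):
    """Return {(position, ref_base, alt_base): fraction of genomes carrying it}.

    Assumes every genome is already aligned to the reference (same length).
    Positions are 1-based, matching notation such as "3037 C>T".
    """
    total = len(genomes)
    if total == 0:
        return {}
    counts = Counter()
    for seq in genomes:
        if len(seq) != len(reference):
            raise ValueError("sequences must be aligned to the reference length")
        for pos, (ref_base, alt_base) in enumerate(zip(reference, seq), start=1):
            # Count substitutions only; gaps and ambiguous bases are skipped.
            if ref_base in "ACGT" and alt_base in "ACGT" and ref_base != alt_base:
                counts[(pos, ref_base, alt_base)] += 1
    return {snp: n / total for snp, n in counts.items()}


# Toy data (hypothetical, for illustration only).
reference = "ACGTACGT"
genomes = ["ACGTACGT", "ATGTACGT", "ATGTACGA", "ATGTACGT"]

freqs = snp_frequencies(reference, genomes)
for (pos, ref, alt), frac in sorted(freqs.items()):
    print(f"{pos} {ref}>{alt}: {frac:.1%}")
```

On real data, the same tally would typically be driven by a multiple sequence alignment or by variant calls against a reference genome, with indels handled separately.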