Cluster analysis of coronavirus sequences using computational sequence descriptors: With applications to SARS, MERS and SARS-CoV-2 (CoVID-19)

(E-pub Abstract Ahead of Print)

Author(s): Marjan Vračko, Subhash C. Basak*, Tathagata Dey, Ashesh Nandy

Journal Name: Current Computer-Aided Drug Design


Background: This study examines 573 genome sequences belonging to the SARS, MERS and SARS-CoV-2 (CoVID-19) viruses.

Objective: To compare virus sequences originating from different places around the world.

Methods: Alignment-free methods were used to represent the sequences, and chemometric methods were used to analyze the clusters.
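
As an illustration of the general idea only, the following minimal sketch builds one simple alignment-free descriptor (normalized k-mer frequencies) and compares sequences by Euclidean distance, which appears among the keywords. The abstract does not specify which descriptors the authors used, so the descriptor choice, the function names kmer_descriptor and euclidean, the value k=2, and the toy sequences are all assumptions made for this example.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGT"

def kmer_descriptor(seq, k=2):
    """Normalized k-mer frequency vector of length 4**k.

    Assumption: a stand-in alignment-free descriptor, not the
    descriptor used in the paper.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows with ambiguous bases such as N
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]

def euclidean(u, v):
    """Euclidean distance between two equal-length descriptor vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Toy sequences (hypothetical, not real viral genomes).
toy = {
    "a": "ATGCGATACGCTTGA",
    "b": "ATGCGATACGCTAGA",  # differs from "a" at one position
    "c": "GGGGCCCCGGGGCCC",  # very different composition
}
desc = {name: kmer_descriptor(s) for name, s in toy.items()}
d_ab = euclidean(desc["a"], desc["b"])
d_ac = euclidean(desc["a"], desc["c"])
```

Because every sequence maps to a vector of the same fixed length, no alignment is needed before distances are computed, and the resulting distance matrix can be passed to any standard clustering or principal component analysis routine.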

Results: The majority of genome sequences cluster according to virus type, but some are outliers.

Conclusion: We identify 71 sequences that tend to belong to more than one cluster.

Keywords: SARS-CoV-2 (CoVID-19), SARS, MERS, mathematical representation of sequences, clustering, Euclidean distance, Mahalanobis distance, principal component analysis, alignment-free sequence descriptors.

Article Details

DOI: 10.2174/1573409917666210202092646