Background: Single-cell RNA sequencing (scRNA-seq) has attracted considerable attention
because it can interrogate gene expression in individual cells, whereas traditional bulk
sequencing measures only the mean gene expression of a population of cells. scRNA-seq has
been successfully applied to finding new cell subtypes, but the analysis of scRNA-seq data
poses new computational challenges.
Objective: We provide an overview of the features of different similarity calculation and clustering
methods to help users select methods that are suitable for their scRNA-seq data. We
also aim to show that feature selection methods are important for improving clustering.
Results: We first described similarity measurement methods and then reviewed several new
clustering methods and their algorithmic details. This analysis raised several new
questions, including how to automatically estimate the number of clusters, how to
discover novel subpopulations, and how to search for new marker genes using feature selection.
Conclusion: When the number of cell types is not known in advance, clustering or semisupervised
learning methods are important tools for exploratory analysis of scRNA-seq data.