Background: According to the histone code hypothesis, specific combinations of Histone Modifications
(HMs) lead to various biological functions. Various studies have used HM combinations to divide the
genome into distinct regions, which are classified as chromatin states. Hidden Markov Model (HMM)
based techniques have mostly been used for this purpose, with data from Next Generation Sequencing
(NGS) platforms serving as input. Chromatin states based on histone modification combinatorics are
annotated by mapping them to functional regions of the genome. The number of states predicted by
HMM-based tools has so far been justified biologically.
Objective: The present study aimed at providing a computational scheme to identify the underlying
hidden states in the data under consideration.
Methods: We proposed HCVS, a computational scheme based on hierarchical clustering and a
visualization strategy, to achieve this objective.
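As a rough illustration of the clustering step only, the sketch below groups genomic bins by their
histone-mark pattern and cuts the resulting tree at a requested number of states. It is not the HCVS
implementation: the binarized input matrix, the Hamming metric, average linkage, the toy data, and the
function name cluster_bins are all assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_bins(signal, n_states, method="average", metric="hamming"):
    """Group genomic bins by their histone-mark pattern (hypothetical helper).

    signal   : (n_bins, n_marks) binary array, 1 = mark present in that bin.
    n_states : maximum number of clusters (candidate chromatin states).
    Returns the linkage tree and one cluster label per bin.
    """
    # Agglomerative clustering on pairwise distances between bins.
    tree = linkage(signal, method=method, metric=metric)
    # Cut the tree so that at most n_states clusters remain.
    labels = fcluster(tree, t=n_states, criterion="maxclust")
    return tree, labels


# Toy input (illustrative only): three mark patterns, 20 bins each.
patterns = np.array([[1, 1, 0, 0],
                     [0, 0, 1, 1],
                     [1, 0, 1, 0]])
signal = np.repeat(patterns, 20, axis=0)

tree, labels = cluster_bins(signal, n_states=3)

# Heights of the final merges; a large jump between consecutive heights
# is one common heuristic for choosing where to cut the tree.
last_merge_heights = tree[-3:, 2]
```

In this toy case each of the three patterns ends up in its own cluster. On real data the tree would
typically be inspected visually, for example as a dendrogram, before a state number is chosen.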
Results: We tested the proposed scheme on a real data set of nine cell types comprising nine
chromatin marks. The approach successfully identified the state numbers for various possibilities. The
results have also been compared with one of the existing models, which showed quite good
Conclusion: The HCVS model not only helps in deciding the optimal number of states for a particular
data set but also justifies the results biologically, thereby correlating the computational and
biological aspects.