Current Bioinformatics

Yi-Ping Phoebe Chen
Department of Computer Science and Information Technology
La Trobe University


Distinctive Phenotype Identification for Breast Cancer Genotypes Among Hereditary Breast Cancer Mutated Genes

Author(s): Md. Rafiul Hassan, Imran ul Haq, Emad Ramadan, Joarder Kamruzzaman, Adel F. Ahmed.


It is well known that mutations in the BRCA1 or BRCA2 gene can cause hereditary breast cancer. However, identifying the mutant genes that affect breast cancer is tedious and expensive because the number of genes is large and the number of samples is very small. Furthermore, the expressive energy of a subset of genes, compared with that of one individual gene at a time, is considered to have a profound influence in breast cancer. In this paper, 7 tumors with a BRCA1 mutation and 8 tumors with a BRCA2 mutation are used to identify a subset of discriminative genes. A combination of a non-parametric supervised statistical method and an unsupervised statistical method is introduced to analyse the gene expressions and to identify the distinctive genes among the highly expressed genes. The most important genes are filtered using the area under the curve (AUC) measure. These filtered genes are then used to build a hidden Markov model (HMM) to analyse their inter-relationships and to identify the best subset among them. In addition, a protein-protein interaction network is generated to analyse the pathways of the identified genes and their links with BRCA1 or BRCA2. Transcription factors are identified and a Gene Set Enrichment Analysis (GSEA) is performed for the identified gene subset, and the results are compared with those reported elsewhere in the cancer literature. In the experiments, the proposed hybrid method identified only 8 genes out of 3226. Of these 8 genes, 5 have been linked with breast cancer by other studies, and 7 have been associated with numerous diseases that may result in breast cancer. Furthermore, 8 transcription factors were identified that cover the identified genes together with BRCA1 and BRCA2. Lastly, a GSEA enrichment score of 0.52 was calculated for the identified genes, which compares favourably given the small size of the identified subset.
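The AUC-based filtering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact procedure, so everything here is an assumption made for illustration: the function names `gene_auc` and `filter_genes_by_auc`, the 0.9 threshold, and the made-up expression values (7 BRCA1-like and 8 BRCA2-like samples, mirroring only the sample counts stated above). The AUC of a single gene is computed as the Mann-Whitney pair statistic, i.e. the fraction of (BRCA1, BRCA2) sample pairs in which the BRCA1 sample has the higher expression.

```python
from typing import Dict, List, Sequence


def gene_auc(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """AUC of one gene: share of (a, b) pairs with a > b, ties counted as 0.5."""
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(group_a) * len(group_b))


def filter_genes_by_auc(
    expr_a: Dict[str, Sequence[float]],
    expr_b: Dict[str, Sequence[float]],
    threshold: float = 0.9,  # hypothetical cut-off, not taken from the paper
) -> List[str]:
    """Keep genes that separate the two groups well in either direction."""
    selected = []
    for gene in expr_a:
        auc = gene_auc(expr_a[gene], expr_b[gene])
        # An AUC near 0 is as discriminative as one near 1 (opposite direction).
        if max(auc, 1.0 - auc) >= threshold:
            selected.append(gene)
    return selected


# Made-up toy expression values: 7 "BRCA1" samples and 8 "BRCA2" samples.
brca1 = {
    "gene_x": [5.1, 4.8, 5.5, 6.0, 5.2, 4.9, 5.7],
    "gene_y": [1.0, 2.0, 3.0, 1.5, 2.5, 3.5, 2.2],
}
brca2 = {
    "gene_x": [2.0, 2.5, 1.8, 3.0, 2.2, 2.7, 1.9, 2.4],
    "gene_y": [1.2, 2.1, 3.1, 1.4, 2.6, 3.4, 2.0, 2.3],
}

selected = filter_genes_by_auc(brca1, brca2)
print(selected)
```

In this toy data `gene_x` separates the two groups completely, while `gene_y` overlaps heavily and is dropped. In the pipeline described above, the genes that survive such a filter would then be passed to the HMM stage.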

Keywords: Breast cancer, HMM, BRCA1, BRCA2, gene selection, hereditary.

Article Details

Year: 2015
Page: [5 - 15]
Pages: 11
DOI: 10.2174/157489361001150309121435