Title:Distinctive Phenotype Identification for Breast Cancer Genotypes Among Hereditary Breast Cancer Mutated Genes
VOLUME: 10 ISSUE: 1
Author(s):Md. Rafiul Hassan, Imran ul Haq, Emad Ramadan, Joarder Kamruzzaman and Adel F. Ahmed
Affiliation:Department of Information and Computer Science, King Fahd University of Petroleum and Minerals, P.O. Box: 31261, Dhahran, Saudi Arabia.
Keywords:Breast cancer, HMM, BRCA1, BRCA2, gene selection, hereditary.
Abstract:It is well known that mutations in the BRCA1 or BRCA2 gene can cause hereditary breast
cancer. However, identifying the mutant genes that affect breast cancer is a tedious and expensive task
because of the large number of genes and the very small number of samples. Furthermore, the expressive energy
of a subset of genes, rather than that of one individual gene at a time, is considered to have a
profound influence in breast cancer. In this paper, 7 tumors with BRCA1 mutations and 8 tumors
with BRCA2 mutations are used to identify the subset of discriminative genes. A combination of a
non-parametric supervised method and an unsupervised statistical method is introduced to analyze the gene
expression data, and the distinctive genes among the highly expressed genes are identified. The most important
genes are filtered using the area under the curve (AUC) measure. These filtered genes are then used to
build a hidden Markov model (HMM) to analyze their inter-relationships and identify the best subset
among them. In addition, a protein-protein interaction network is generated to analyze the pathways of the identified genes and
their links with BRCA1 or BRCA2. Transcription factors are identified and Gene Set Enrichment Analysis (GSEA) is performed
for the identified gene subset, and the results are compared with those reported in other cancer literature. Experimental
results show that the proposed hybrid method identifies only 8 of the 3226 genes. Of the 8 identified
genes, 5 have been linked with breast cancer by other studies. Moreover, 7 genes have been associated with numerous diseases
that may result in breast cancer. Furthermore, 8 transcription factors are identified that cover the identified genes as well as BRCA1
and BRCA2. Lastly, a GSEA enrichment score of 0.52 is obtained for the identified genes, which compares favorably
given the small size of the identified subset.
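
The AUC-based filtering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`auc`, `filter_genes`), the 0.9 cutoff, and the toy expression values are all assumptions made for illustration. The AUC is computed here as the Mann-Whitney pair-counting estimate, and a gene is kept if it separates the two classes well in either direction.

```python
from typing import Dict, List, Sequence


def auc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Mann-Whitney estimate of the AUC: the fraction of (pos, neg) pairs
    in which the pos value is larger, counting ties as one half."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def filter_genes(
    expr: Dict[str, List[float]],
    labels: List[int],
    threshold: float = 0.9,  # hypothetical cutoff, not taken from the paper
) -> List[str]:
    """Keep genes whose expression separates the two classes in either direction."""
    kept = []
    for gene, values in expr.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        a = auc(pos, neg)
        # An AUC near 0 is as discriminative as one near 1 (reversed direction).
        if max(a, 1.0 - a) >= threshold:
            kept.append(gene)
    return kept


# Toy data (invented): 7 samples labelled 1 and 8 labelled 0,
# mirroring the 7 BRCA1 / 8 BRCA2 tumor split described in the abstract.
labels = [1] * 7 + [0] * 8
toy_expr = {
    "gene_a": [2.1, 2.4, 2.0, 2.6, 2.3, 2.2, 2.5,
               0.9, 1.1, 1.0, 0.8, 1.2, 0.7, 1.3, 1.0],
    "gene_b": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0,
               1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
}
selected = filter_genes(toy_expr, labels)
```

In this toy example only `gene_a` passes the filter, because every class-1 value exceeds every class-0 value, while `gene_b` has an AUC close to 0.5. With so few samples per class, the AUC takes only a small number of distinct values, so any cutoff should be treated as a coarse screen rather than a precise ranking.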