Prediction and Analysis of Hepatocellular Carcinoma Related Genes Using Gene Ontology and KEGG

Author(s): Min Jiang, Bi-Qing Li, Tao Huang, Yao Chen Xu, Lei Gu, Xiang Yin Kong.

Journal Name: Current Bioinformatics

Volume 10, Issue 1, 2015

Abstract:

Hepatocellular carcinoma (HCC) is the most common type of liver cancer worldwide and occurs mostly in areas where viral hepatitis is endemic, such as China. Knowledge of HCC-related genes may enable earlier detection of HCC and the development of molecularly targeted therapeutics, reducing mortality and significantly improving patient prognosis. It is therefore valuable to identify the common characteristics of HCC-related genes. In this study, we propose a computational method that predicts HCC-related genes from Gene Ontology (GO) terms and KEGG pathways using Random Forest (RF), with features optimized by maximum relevance minimum redundancy (mRMR) and incremental feature selection (IFS). We compiled 224 HCC gene candidates from several databases and randomly selected 11,200 non-HCC gene candidates from the Ensembl database. Ten candidate datasets were constructed by dividing the non-HCC gene candidates into 10 groups. Each gene in the datasets was encoded by 13,126 features: 12,887 Gene Ontology enrichment scores and 239 KEGG enrichment scores. The final optimal feature set comprised 615 GO terms and 11 KEGG pathways. Our analysis found that these features are closely related to HCC, which suggests that the method is effective for discovering HCC-related genes and may also be applicable to predicting and analyzing genes for other types of cancer.
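
The dataset construction and the IFS step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the non-HCC genes were assigned to groups (a seeded random split is assumed here), the gene identifiers are placeholders, and `score_subset` is a hypothetical callback standing in for whatever RF performance measure was used to compare feature subsets. The feature ranking passed to IFS is assumed to come from a separate mRMR step that is not shown.

```python
import random
from typing import Callable, List, Sequence, Tuple


def build_datasets(positives: Sequence[str], negatives: Sequence[str],
                   n_groups: int = 10, seed: int = 0) -> List[Tuple[List[str], List[int]]]:
    """Split the negatives into n_groups disjoint groups and pair each group
    with the full positive set (assumed split; the abstract gives no details)."""
    shuffled = list(negatives)
    random.Random(seed).shuffle(shuffled)
    datasets = []
    for g in range(n_groups):
        group = shuffled[g::n_groups]  # every n_groups-th gene, so groups never overlap
        genes = list(positives) + group
        labels = [1] * len(positives) + [0] * len(group)
        datasets.append((genes, labels))
    return datasets


def incremental_feature_selection(
        ranked_features: Sequence[str],
        score_subset: Callable[[List[str]], float]) -> Tuple[List[str], float]:
    """Score the top-1, top-2, ... prefixes of a ranked feature list and
    return the best-scoring prefix together with its score."""
    best_subset: List[str] = []
    best_score = float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = list(ranked_features[:k])
        score = score_subset(subset)  # hypothetical: e.g. an RF cross-validation score
        if score > best_score:        # strict '>' keeps the smallest subset on ties
            best_subset, best_score = subset, score
    return best_subset, best_score


# Placeholder identifiers; the real gene lists are not given in the abstract.
hcc_genes = [f"HCC_{i}" for i in range(224)]
non_hcc_genes = [f"NEG_{i}" for i in range(11200)]
datasets = build_datasets(hcc_genes, non_hcc_genes)
```

With the counts from the abstract, each of the 10 datasets would hold 224 HCC genes and 11,200 / 10 = 1,120 non-HCC genes, and each gene would carry 12,887 + 239 = 13,126 features before selection.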

Keywords: Gene ontology, hepatocellular carcinoma (HCC), incremental feature selection (IFS), KEGG, maximum relevance minimum redundancy (mRMR), random forest (RF).

Article Details

VOLUME: 10
ISSUE: 1
Year: 2015
Page: [31 - 38]
Pages: 8
DOI: 10.2174/157489361001150309131453
