Feature Classification and Analysis of Lung Cancer Related Genes through Gene Ontology and KEGG Pathways

Author(s): You Zhou, Biqing Li, Yuchao Zhang, Lei Chen, Xiangyin Kong.

Journal Name: Current Bioinformatics

Volume 11, Issue 1, 2016


Abstract:

Characterizing cancer-related genes is important and challenging in both biomedicine and computational biology. Lung cancer, one of the leading causes of cancer mortality worldwide, accounts for over one million deaths each year. It is generally classified as either small-cell lung cancer (SCLC) or non-small-cell lung cancer (NSCLC). Although great advances have been made in lung cancer detection and treatment, the 5-year survival rate of patients is still less than 15%. Hence, it is very important to identify all potential lung cancer-related genes as well as their interaction networks. In this research, we present a novel computational framework based on the support vector machine (SVM) to predict lung cancer-related genes. We retrieved 59 NSCLC-related genes and 89 SCLC-related genes from KEGG pathways, and randomly selected 2950 non-NSCLC and 4450 non-SCLC genes from the Ensembl database. Ten datasets were constructed by dividing the genes into 10 groups. Each gene was encoded as a 13,126-dimensional vector comprising 12,887 Gene Ontology (GO) enrichment scores and 239 KEGG enrichment scores. A feature extraction strategy was applied to obtain optimal feature sets: 400 GO terms and 47 KEGG pathways for NSCLC, and 458 GO terms and 27 KEGG pathways for SCLC. Further feature analysis showed that these optimal features are actively involved in lung tumorigenesis. The analysis also confirms that our method is an effective tool for predicting cancer-related genes and has the potential to be applied extensively to predicting genes related to other types of cancer.
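
The dataset construction and gene encoding described in the abstract can be sketched roughly as follows. The abstract does not state how the 10 groups were formed, so this sketch assumes one plausible reading: the randomly selected negative genes are split into 10 equal groups, and each group is paired with the full positive set. The function name `build_datasets` and the gene identifiers are hypothetical and used only for illustration.

```python
import random

GO_TERMS = 12_887     # GO enrichment scores per gene (from the abstract)
KEGG_PATHWAYS = 239   # KEGG enrichment scores per gene (from the abstract)


def build_datasets(positives, negatives, n_groups=10, seed=0):
    """Pair all positives with each of n_groups equal slices of the negatives.

    Assumption: this grouping scheme is one reading of the abstract,
    not a confirmed description of the authors' procedure.
    """
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    size = len(shuffled) // n_groups
    return [
        (list(positives), shuffled[i * size:(i + 1) * size])
        for i in range(n_groups)
    ]


# Hypothetical placeholder identifiers standing in for real gene IDs.
nsclc_pos = [f"pos_{i}" for i in range(59)]
nsclc_neg = [f"neg_{i}" for i in range(2950)]

datasets = build_datasets(nsclc_pos, nsclc_neg)
vector_dim = GO_TERMS + KEGG_PATHWAYS  # length of each gene's feature vector
```

Under this assumption, each NSCLC dataset holds 59 positive and 295 negative genes, a 1:5 ratio, and every gene is represented by a vector of 12,887 + 239 = 13,126 scores.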

Keywords: Non-small-cell lung cancer (NSCLC), small-cell lung cancer (SCLC), Gene Ontology (GO), KEGG pathways, support vector machine (SVM), maximum relevance minimum redundancy (mRMR), incremental feature selection (IFS).
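
As an illustration of the incremental feature selection (IFS) keyword, here is a minimal sketch of the general idea as it is commonly described: features are taken in a ranked order (for example, one produced by mRMR), a model is evaluated on each growing prefix of that ranking, and the best-scoring prefix is kept. The `evaluate` callback, the toy scorer, and the feature names are all hypothetical. In a setting like the one in the abstract, the scorer would presumably be a cross-validated SVM, which is not implemented here.

```python
def incremental_feature_selection(ranked_features, evaluate):
    """Return the best-scoring prefix of ranked_features and its score.

    evaluate: callable taking a list of features and returning a number,
    where higher is better. On ties, the smaller subset is kept.
    """
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


# Toy scorer (an assumption, not the paper's metric): reward informative
# features and apply a small penalty for each feature used.
INFORMATIVE = {"f1", "f3"}


def toy_evaluate(subset):
    return sum(f in INFORMATIVE for f in subset) - 0.1 * len(subset)


ranked = ["f3", "f1", "f7", "f2"]  # hypothetical ranking, e.g. from mRMR
best_features, best_score = incremental_feature_selection(ranked, toy_evaluate)
```

With this toy scorer, the two informative features at the top of the ranking are selected, and adding further features only lowers the score.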


Article Details

VOLUME: 11
ISSUE: 1
Year: 2016
Page: [40 - 50]
Pages: 11
DOI: 10.2174/1574893611666151119220803