Current Bioinformatics

Yi-Ping Phoebe Chen
Department of Computer Science and Information Technology
La Trobe University


Prediction of Colorectal Cancer Related Genes Based on Gene Ontology

Author(s): Bi-Qing Li, Guo-Hua Huang, Tao Huang, Kai-Yan Feng, Lei Liu, Yu-Dong Cai.


Prediction and identification of cancer-related genes are among the most challenging and important problems in bioinformatics and biomedicine. Colorectal cancer (CRC), the second most commonly diagnosed cancer worldwide, is a major cause of cancer-related death. Knowledge of CRC-related genes may enable early detection of CRC and the development of gene-targeted treatment schemes, which could significantly improve patient prognosis and reduce mortality. The first and most basic step is the screening and identification of CRC-related genes. Here, we present a computational method to predict CRC-related genes based on JRip, a rule-abstracting algorithm, with its input features optimized by the maximum relevance minimum redundancy (mRMR) method and incremental feature selection (IFS). Seventy-seven genes were compiled from the KEGG CRC pathway and through text mining as CRC-related gene candidates, while 385 other genes were randomly selected as non-CRC gene candidates. All 462 genes were encoded according to their Gene Ontology (GO) annotations, each producing a 2669-dimensional vector that was reduced to 52 dimensions after feature selection. Our method produced a rule set of 7 criteria, yielding an overall prediction accuracy of 0.9242 and a Matthews correlation coefficient (MCC) of 0.7259. Analysis of the rule set and the optimal features may shed light on how CRC genes can be separated from non-CRC genes based on GO terms.
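The evaluation measures and the IFS step mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`mcc`, `accuracy`, `incremental_feature_selection`) and the `evaluate` callback are hypothetical. The ranked feature list is assumed to come from an mRMR ranking, and `evaluate` is assumed to wrap a cross-validated classifier such as JRip; neither is implemented here. Maximizing the returned score (for example MCC) is also an assumption, since the abstract does not state the selection criterion.

```python
import math
from typing import Callable, List, Sequence, Tuple


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correctly classified samples in a binary confusion matrix."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def incremental_feature_selection(
    ranked_features: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Score the top-1, top-2, ..., top-N ranked features and keep the best prefix.

    `ranked_features` is assumed to be ordered best-first (e.g. by mRMR).
    `evaluate` is a hypothetical callback returning a score to maximize.
    Ties are resolved in favor of the smaller feature subset.
    """
    best_subset: List[str] = []
    best_score = float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = list(ranked_features[:k])
        score = evaluate(subset)
        if score > best_score:  # strict '>' keeps the smallest best subset
            best_subset, best_score = subset, score
    return best_subset, best_score
```

In practice, `evaluate` would train and cross-validate the classifier on the columns named in `subset` and return, say, `mcc(tp, tn, fp, fn)` computed from the resulting confusion matrix.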

Keywords: Colorectal cancer, Gene Ontology, incremental feature selection, JRip, minimum redundancy maximum relevance.


Article Details

Year: 2015
Page: [22 - 30]
Pages: 9
DOI: 10.2174/157489361001150309131058