Title: Prediction of Colorectal Cancer Related Genes Based on Gene Ontology
VOLUME: 10 ISSUE: 1
Author(s): Bi-Qing Li, Guo-Hua Huang, Tao Huang, Kai-Yan Feng, Lei Liu and Yu-Dong Cai
Affiliation: Institute of Systems Biology, Shanghai University, Shanghai, P.R. China.
Keywords: Colorectal cancer, Gene Ontology, incremental feature selection, JRip, minimum redundancy maximum relevance.
Abstract: Prediction and identification of cancer-related genes are among the most challenging
and important problems in bioinformatics and biomedicine. Colorectal cancer (CRC), the second
most commonly diagnosed cancer worldwide, is a major cause of cancer-related death. Knowledge
of CRC-related genes may enable early detection of CRC and the development of gene-targeted
treatment schemes that significantly improve a patient's prognosis and reduce mortality. The
first and most basic step is the screening and identification of CRC-related
genes. Here, we present a computational method to predict CRC-related genes based on JRip, a
rule-learning algorithm, with its data inputs optimized by the minimum redundancy maximum
relevance (mRMR) method and incremental feature selection (IFS). Seventy-seven genes were compiled
from the KEGG CRC pathway and through text mining as CRC-related gene candidates, while 385
other genes were randomly selected as non-CRC gene candidates. All 462 genes were
encoded according to their Gene Ontology (GO) annotation, each producing a 2669-dimensional vector, which was
reduced to 52 dimensions after feature selection. Our method produced a rule set of seven criteria, yielding an
overall prediction accuracy of 0.9242 and a Matthews correlation coefficient (MCC) of 0.7259. Analysis of the rule set
and the optimal features may shed some light on how CRC genes can be separated from non-CRC genes based on GO terms.
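
The encoding, feature-selection, and scoring steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the gene names and GO labels are toy placeholders, the helper names (`encode_genes`, `incremental_feature_selection`, `accuracy`, `mcc`) are hypothetical, the feature ranking is assumed to have been produced beforehand (for example by mRMR), and the `evaluate` callback stands in for training and scoring a classifier such as JRip, which is not implemented here.

```python
from math import sqrt


def encode_genes(annotations):
    """Encode each gene as a 0/1 vector over the sorted union of all GO terms.

    `annotations` maps a gene name to the set of GO terms annotated to it.
    """
    terms = sorted({t for gos in annotations.values() for t in gos})
    vectors = {
        gene: [1 if t in gos else 0 for t in terms]
        for gene, gos in annotations.items()
    }
    return terms, vectors


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified genes."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def incremental_feature_selection(ranked_features, evaluate):
    """Score the top-1, top-2, ... feature subsets and keep the best one.

    `ranked_features` is assumed to be pre-ranked (e.g. by mRMR).
    `evaluate` is a caller-supplied function returning a score for a subset,
    such as the cross-validated MCC of a classifier trained on those features.
    """
    best_subset, best_score = [], float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        score = evaluate(subset)
        if score > best_score:  # ties keep the smaller subset
            best_subset, best_score = subset, score
    return best_subset, best_score


if __name__ == "__main__":
    # Toy annotations; real inputs would use actual GO identifiers.
    toy = {"geneA": {"GO:x", "GO:y"}, "geneB": {"GO:z"}}
    terms, vectors = encode_genes(toy)
    print(terms, vectors)
    print(mcc(tp=1, tn=1, fp=0, fn=0))
```

MCC is reported alongside accuracy because the data set is imbalanced (77 positive versus 385 negative genes): a classifier that labels every gene as non-CRC would score high on accuracy while its MCC would be 0.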