Protein & Peptide Letters

Prof. Ben M. Dunn  
Department of Biochemistry and Molecular Biology
University of Florida
College of Medicine
P.O. Box 100245
Gainesville, FL


Biomedical Hypothesis Generation by Text Mining and Gene Prioritization

Author(s): Ingrid Petric, Balazs Ligeti, Balazs Gyorffy and Sandor Pongor

Affiliation: Centre for Systems and Information Technologies, University of Nova Gorica, Vipavska 13, SI-5000 Nova Gorica, Slovenia.

Keywords: Biomedical hypothesis generation, disease gene prediction, gene prioritization, ovarian cancer, text mining.


Text mining methods can facilitate the generation of biomedical hypotheses by suggesting novel associations between diseases and genes. Previously, we developed a rare-term model called RaJoLink (Petric et al., J. Biomed. Inform. 42(2): 219-227, 2009), in which hypotheses are formulated on the basis of terms rarely associated with a target domain. Because many current medical hypotheses are formulated in terms of molecular entities and molecular mechanisms, we here extend the methodology to proteins and genes, using a standardized vocabulary as well as a gene/protein network model. The enhanced RaJoLink rare-term model combines text mining with gene prioritization. Its utility is illustrated by finding known as well as potential gene-disease associations in ovarian cancer using MEDLINE abstracts and the STRING database.
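
The rare-term idea mentioned above can be illustrated with a minimal sketch. This is not the RaJoLink implementation: the function name `rare_terms`, the whitespace tokenization, the document-frequency threshold `max_doc_freq`, and the toy abstracts and gene symbols are all assumptions made for illustration. The sketch only shows one plausible way to pick out vocabulary terms that appear in very few abstracts of a target-domain corpus.

```python
from collections import Counter


def rare_terms(abstracts, vocabulary, max_doc_freq=1):
    """Return vocabulary terms found in at least one abstract
    but in no more than max_doc_freq abstracts (hypothetical helper)."""
    vocab = {term.lower() for term in vocabulary}
    doc_freq = Counter()
    for text in abstracts:
        # Naive whitespace tokenization; a real pipeline would use a
        # proper tokenizer and a standardized gene/protein vocabulary.
        tokens = set(text.lower().split())
        # Count each term at most once per abstract (document frequency).
        doc_freq.update(tokens & vocab)
    return sorted(t for t, n in doc_freq.items() if n <= max_doc_freq)


# Made-up abstracts and placeholder symbols, not real data.
toy_abstracts = [
    "gene_a mutation in the target disease",
    "gene_a and gene_b in the target disease",
    "gene_b gene_a gene_c expression",
]
toy_vocabulary = ["GENE_A", "GENE_B", "GENE_C"]

rare = rare_terms(toy_abstracts, toy_vocabulary, max_doc_freq=1)
print(rare)
```

In this toy corpus only the placeholder symbol that occurs in a single abstract is returned. How such rare terms are then combined with a gene/protein network for prioritization is described in the article itself and is not reproduced here.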


Article Details

Page: [847 - 857]
Pages: 11
DOI: 10.2174/09298665113209990063