ORFpred: A Machine Learning Program to Identify Translatable Small Open Reading Frames in Intergenic Regions of the Plasmodium falciparum Genome

Author(s): Vivek Srinivas, Mayank Kumar, Santosh Noronha, Swati Patankar

Journal Name: Current Bioinformatics

Volume 11, Issue 2, 2016

Abstract:

Motivation: Small Open Reading Frames (smORFs) are involved in a variety of cellular processes, ranging from metabolism to gene regulation, and eukaryotic genomes are predicted to contain a large number of them. Only 174 smORFs have been annotated in the genome of the human malaria parasite Plasmodium falciparum. Although millions of smORFs can be extracted from the parasite genome, identifying the translatable ones is challenging because existing smORF predictors have low accuracy when applied to an AT-biased genome.
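
To make the extraction step concrete, here is a minimal sketch of how candidate smORFs might be enumerated from a nucleotide sequence. It is not ORFpred's code and does not score translatability. The function name `find_small_orfs`, the ATG-to-stop definition of an ORF, the forward-strand-only scan, and the 100-codon cutoff are all assumptions made for illustration; none of them is stated in the abstract.

```python
# Illustrative sketch only; not taken from ORFpred.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons
MAX_CODONS = 100  # assumed smORF length cutoff, not given in the abstract


def find_small_orfs(seq, max_codons=MAX_CODONS):
    """Return (start, end, frame) tuples for ATG...stop ORFs on the forward strand.

    Coordinates are 0-based and end-exclusive; the span includes the stop codon.
    In each frame, only the first ATG before a given stop codon is reported.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # codons before the stop, ATG included
                if n_codons <= max_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs
```

A scan of this kind only lists candidates. In an AT-rich genome, stop codons (TAA, TAG, TGA) and ATG triplets are frequent, so short ATG-to-stop spans are abundant, which is consistent with the need for a separate step that decides which candidates are likely to be translated.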

Result: We developed ORFpred, a machine learning algorithm that calculates the probability of translation initiation and elongation of ORFs in the P. falciparum genome. ORFpred identified 2204 translatable smORFs and showed higher accuracy than available predictors. We believe that ORFpred will help identify probable protein-coding smORFs in other eukaryotic genomes.

Availability and Implementation: The database used for training and testing the algorithm and the source code are freely available at http://www.bio.iitb.ac.in/~patankar/software/ORFpred.

Keywords: Small open reading frames, upstream open reading frames, translatability, low molecular weight proteins, post-transcriptional gene regulation, Plasmodium falciparum, AT-rich genome.

Article Details

Pages: 259-268 (10 pages)
DOI: 10.2174/1574893611666160122221757