Title: GENIRF: An Algorithm for Gene Regulatory Network Inference Using Rotation Forest
VOLUME: 13 ISSUE: 4
Author(s): Jamshid Pirgazi, Ali Reza Khanteymoori* and Maryam Jalilkhani
Affiliation: Department of Computer Engineering, Engineering Faculty, University of Zanjan, Zanjan
Keywords: Gene regulatory network, gene expression, rotation forest, singular value decomposition.
Abstract: Background: A central problem of systems biology is the reconstruction of the topology of
gene regulatory networks (GRNs) from high-throughput genomic data such as microarray gene expression
data. The main challenge with gene expression data is that the number of genes is high, the number of
samples is low, and the data are often noisy.
Objective: In this paper, we present a method for Gene Regulatory Network Inference using Rotation
Forest (GENIRF).
Methods: Rotation forest combines the embedded variable ranking mechanism of tree-based
ensemble methods with dimension reduction. This combination addresses the main challenge in gene
expression data. GENIRF decomposes the prediction of a gene regulatory network between p genes into p
different regression problems. Each regression problem is constructed with a transformed expression
pattern and rotation forest: the expression pattern of the target gene is predicted from the expression
patterns of all the remaining genes.
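The decomposition into p regression problems can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and the importance measure used here (squared Pearson correlation) is only a simple stand-in for the variable ranking that a rotation forest would supply. Any scorer with the same signature could be passed in instead.

```python
from typing import Callable, List, Tuple

# A scorer takes the regulator matrix X (samples x regulators) and the
# target vector y, and returns one importance score per regulator column.
Scorer = Callable[[List[List[float]], List[float]], List[float]]


def squared_correlation_scores(X: List[List[float]], y: List[float]) -> List[float]:
    """Stand-in importance: squared Pearson correlation of each column with y.

    Assumption for illustration only; GENIRF uses rotation forest instead.
    """
    n = len(y)
    y_mean = sum(y) / n
    y_dev = [v - y_mean for v in y]
    y_ss = sum(d * d for d in y_dev)
    scores = []
    for k in range(len(X[0])):
        col = [row[k] for row in X]
        c_mean = sum(col) / n
        c_dev = [v - c_mean for v in col]
        c_ss = sum(d * d for d in c_dev)
        if c_ss == 0.0 or y_ss == 0.0:
            scores.append(0.0)  # a constant column carries no signal
            continue
        cov = sum(a * b for a, b in zip(c_dev, y_dev))
        scores.append(cov * cov / (c_ss * y_ss))
    return scores


def rank_edges(
    expr: List[List[float]],
    scorer: Scorer = squared_correlation_scores,
) -> List[Tuple[int, int, float]]:
    """Solve one regression-style problem per target gene and rank all edges.

    expr is a samples x genes matrix. Returns (regulator, target, score)
    triples sorted by descending score.
    """
    p = len(expr[0])
    edges = []
    for target in range(p):
        # Target gene expression is the response ...
        y = [row[target] for row in expr]
        # ... and all remaining genes are the candidate regulators.
        regulators = [g for g in range(p) if g != target]
        X = [[row[g] for g in regulators] for row in expr]
        for g, score in zip(regulators, scorer(X, y)):
            edges.append((g, target, score))
    edges.sort(key=lambda e: e[2], reverse=True)
    return edges


# Toy data: gene 1 is exactly twice gene 0, gene 2 is unrelated to both.
toy_expr = [
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 2.0],
]
ranked = rank_edges(toy_expr)
```

The scorer is passed in as a parameter so that the per-gene loop stays independent of the learner. Because the p problems do not depend on each other, they can be run in parallel.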
Results: GENIRF makes no assumptions about the nature of gene regulation, so it can identify
combinatorial and non-linear interactions in GRNs. Experimental results on the DREAM4 in silico
multifactorial challenge simulated data indicate that GENIRF achieves better accuracy and compares
favorably with existing well-known algorithms. Furthermore, it is a fast and scalable method.
Conclusion: GENIRF shows high performance on this benchmark across different performance metrics,
and its overall score is slightly better than that of the other methods. We have also shown that
dimension reduction of the gene expression data can further improve the performance of GENIRF and
other methods. In addition, GENIRF is competitive in terms of computational efficiency, especially
compared with ensemble methods, and for big data it can be easily parallelized.