Background: The rapid accumulation of genomic and transcriptomic data has motivated the development of
computational methods to identify regulatory interactions between transcription factors (TFs) and genes. However, available
methods display high false-positive rates and unstable performance across different networks because they favor
interactions with certain features. Model integration can reduce the biases of these methods and improve
specificity, especially for the pairwise methods whose correlations are very low.
Objective: Different integration methods were compared in this analysis, and the best integration method will
Method: We integrated 14 different models, categorized into five major groups (regression, mutual
information, correlation, Bayesian and others), to predict simulated regulatory networks extracted from
Escherichia coli at two different scales.
Results: Support vector regression (SVR) achieved the highest precision. Another method,
Cubist, was less precise than SVR but much more efficient, especially in time cost. This
conclusion was also confirmed with simulated expression data from an in silico Saccharomyces cerevisiae network
at three different scales and with real expression data from a sub-network of the SOS DNA repair system. We applied
SVR to construct the network orchestrating cell envelope stress in B. licheniformis and found that the
predicted network was consistent with the results of previous studies.
Conclusion: This study applied and compared different integration methods and found that SVR best
meets the demand for higher precision, followed by Cubist. Integration can provide clearer insights into
the transcriptional architecture.
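
Illustration: To make the idea of model integration concrete, the following is a minimal sketch of average-rank integration. This is a simple baseline chosen for brevity; it is not the SVR or Cubist integrator compared in this study. The method names, TF and gene labels, scores, and reference edges are hypothetical toy values, and the helper functions are illustrative rather than taken from any published implementation.

```python
from statistics import mean

def rank_scores(scores):
    """Map each candidate edge to its rank under one method (1 = highest score)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {edge: rank for rank, edge in enumerate(ordered, start=1)}

def integrate_by_average_rank(method_scores):
    """Average each edge's rank across methods; a lower value means more support."""
    per_method_ranks = [rank_scores(s) for s in method_scores.values()]
    edges = per_method_ranks[0].keys()
    return {edge: mean(r[edge] for r in per_method_ranks) for edge in edges}

def precision_at_k(avg_rank, true_edges, k):
    """Fraction of the k best-ranked edges that appear in the reference network."""
    top_k = sorted(avg_rank, key=avg_rank.get)[:k]
    return sum(edge in true_edges for edge in top_k) / k

# Hypothetical toy scores: each method scores every (TF, gene) pair.
method_scores = {
    "regression": {
        ("tfA", "g1"): 0.9, ("tfA", "g2"): 0.2,
        ("tfB", "g1"): 0.6, ("tfB", "g2"): 0.1,
    },
    "mutual_information": {
        ("tfA", "g1"): 0.7, ("tfA", "g2"): 0.5,
        ("tfB", "g1"): 0.4, ("tfB", "g2"): 0.3,
    },
    "correlation": {
        ("tfA", "g1"): 0.3, ("tfA", "g2"): 0.1,
        ("tfB", "g1"): 0.8, ("tfB", "g2"): 0.2,
    },
}
# Hypothetical reference network used only to score the toy example.
true_edges = {("tfA", "g1"), ("tfB", "g1")}

integrated = integrate_by_average_rank(method_scores)
precision = precision_at_k(integrated, true_edges, k=2)
```

In this toy case no single method ranks both reference edges in its top two positions together with the others, yet the averaged ranking does, which illustrates how combining weakly correlated methods can reduce method-specific bias. A learned integrator such as SVR would replace the rank average with a model trained on the per-method scores.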