Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

Bayesian Functional Mixed-effects Models with Grouped Smoothness for Analyzing Time-course Gene Expression Data

Author(s): Shangyuan Ye, Ye Liang and Bo Zhang*

Volume 16, Issue 1, 2021

Published on: 20 May, 2020

Page: [2 - 12] Pages: 11

DOI: 10.2174/1574893615999200520082636

Abstract

Objective: Microarray technologies make it possible to measure the expression levels of thousands of genes involved in a given biological process simultaneously, and studying how these levels change over time is important for understanding the underlying mechanisms. Because the dependence among a gene's expression levels over time is often too complicated to model parametrically, sparse functional data analysis has received increasing attention for analyzing such data.

Methods: We propose a new functional mixed-effects model for analyzing time-course gene expression data. Specifically, the model assigns the individual functions to groups that differ in smoothness. The proposed method uses the mixed-effects model representation of penalized splines for both the mean function and the individual functions. Given noninformative or weakly informative priors, we develop Bayesian inference for the proposed model and implement the computation using Markov chain Monte Carlo methods.
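
The mixed-effects representation of a penalized spline mentioned above can be illustrated with a minimal sketch. It is not the authors' model: it fits a single curve, has no mean/individual decomposition and no latent groups, and it assumes a truncated-line basis, a flat prior on the fixed effects, and inverse-gamma priors with hypothetical hyperparameters `a = b = 0.01`. The function names `truncated_line_basis` and `gibbs_penalized_spline` are illustrative only.

```python
import numpy as np

def truncated_line_basis(t, knots):
    """Fixed-effects design X = [1, t] and random-effects design Z = (t - knot)_+."""
    X = np.column_stack([np.ones_like(t), t])
    Z = np.maximum(t[:, None] - knots[None, :], 0.0)
    return X, Z

def gibbs_penalized_spline(t, y, knots, n_iter=1500, burn_in=500,
                           a=0.01, b=0.01, seed=0):
    """Gibbs sampler for y = X beta + Z u + eps with u ~ N(0, sig2_u I) and
    eps ~ N(0, sig2_e I). Returns the posterior mean of the fitted curve."""
    rng = np.random.default_rng(seed)
    X, Z = truncated_line_basis(t, knots)
    C = np.hstack([X, Z])
    n, p, K = len(y), X.shape[1], Z.shape[1]
    CtC, Cty = C.T @ C, C.T @ y
    sig2_e, sig2_u = 1.0, 1.0          # starting values (assumed)
    fit_sum = np.zeros(n)
    for it in range(n_iter):
        # (beta, u) | variances is Gaussian; only the spline coefficients u are penalized.
        penalty = np.diag(np.r_[np.zeros(p), np.full(K, 1.0 / sig2_u)])
        prec = CtC / sig2_e + penalty
        mean = np.linalg.solve(prec, Cty / sig2_e)
        L = np.linalg.cholesky(prec)   # prec = L L^T
        theta = mean + np.linalg.solve(L.T, rng.standard_normal(p + K))
        u = theta[p:]
        resid = y - C @ theta
        # Inverse-gamma updates: draw the precision from a gamma, then invert.
        sig2_u = 1.0 / rng.gamma(a + K / 2.0, 1.0 / (b + 0.5 * (u @ u)))
        sig2_e = 1.0 / rng.gamma(a + n / 2.0, 1.0 / (b + 0.5 * (resid @ resid)))
        if it >= burn_in:
            fit_sum += C @ theta
    return fit_sum / (n_iter - burn_in)
```

In this sketch the ratio `sig2_e / sig2_u` plays the role of the smoothing parameter, so sampling the two variances amounts to estimating the amount of smoothing from the data. A grouped-smoothness model would, roughly speaking, allow that variance to differ between groups of curves.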

Results: The performance of the new model was evaluated in two simulation studies and illustrated using a yeast cell cycle gene expression dataset. Simulation results suggest that the proposed methods can outperform previously used methods in terms of mean integrated squared error. The yeast gene expression application suggests that the proposed model with two latent groups should be used for this dataset.
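
For readers unfamiliar with the criterion, mean integrated squared error is commonly defined as the integral of the squared difference between an estimated function and the true function, averaged over simulated datasets. The sketch below is one plausible way to approximate it on a time grid with the trapezoidal rule. The abstract does not say how the authors computed it, for example whether they also averaged over genes, and the function names are hypothetical.

```python
import numpy as np

def integrated_squared_error(t, f_hat, f_true):
    """Trapezoidal approximation of the integral of (f_hat - f_true)^2 over the grid t."""
    t = np.asarray(t, dtype=float)
    sq = (np.asarray(f_hat, dtype=float) - np.asarray(f_true, dtype=float)) ** 2
    return float(np.sum(0.5 * (sq[1:] + sq[:-1]) * np.diff(t)))

def mean_integrated_squared_error(t, fits, f_true):
    """Average the integrated squared error over a collection of fitted curves."""
    return float(np.mean([integrated_squared_error(t, f, f_true) for f in fits]))
```

Here `fits` would hold one estimated curve per simulation replicate, all evaluated on the same grid `t` as the true function.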

Conclusion: The new Bayesian functional mixed-effects model, which assumes multiple groups of functions with different smoothing parameters, provides an enhanced approach to analyzing time-course gene expression data.

Keywords: Time-course gene expression data, Bayesian, functional data analysis, mixed-effects models, grouped smoothness, microarray.

