Abstract
Background: Causal mediation analysis is used in biomedical research to investigate causal mechanisms that comprise both direct pathways from the treatment to the outcome and indirect pathways through mediators. This type of analysis has recently been applied in bioinformatics, where it faces the obstacle of high-dimensional, semi-continuous mediators with clumping at zero.
Methods: We develop a method for high-dimensional causal mediation analysis whose modeling framework combines (i) a nonlinear model for the outcome variable, (ii) two-part models for semi-continuous mediators with clumping at zero, and (iii) variable-selection techniques based on machine learning. We evaluated the proposed method in simulations, which showed that it provides reliable statistical information on causal effects with high-dimensional mediators. We then applied the method to assess the contribution of the intestinal microbiome to the risk of bacterial pathogen colonization in older adults in US nursing homes.
Conclusion: The proposed high-dimensional causal mediation analysis with nonlinear models is an innovative and reliable approach to conducting causal inference with high-dimensional mediators.
Keywords: Causal inference, mediators, linear structural equation modeling, nonlinear models, microbiome, multidrug resistance, pathogen colonization, nursing home.
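
Illustrative sketch: the two-part models for semi-continuous mediators named in the Methods can be illustrated with a minimal example. This is not the authors' implementation; it is a sketch under strong simplifying assumptions that are not taken from the article: a single mediator, a binary treatment, no covariates, and a log-normal distribution for the positive part. The function names (fit_two_part, expected_mediator) and all simulation settings are hypothetical. Part 1 models whether the mediator is nonzero; Part 2 models its magnitude given that it is nonzero.

```python
import numpy as np


def fit_two_part(treatment, mediator):
    """Fit a two-part model of one semi-continuous mediator on a binary treatment.

    Part 1: P(M > 0 | T = t), estimated by the nonzero fraction in each arm
            (the MLE of a logistic regression with one binary covariate).
    Part 2: log(M) | M > 0, T = t, assumed normal (an assumption of this sketch),
            estimated by the arm-wise mean and variance of the log positives.
    """
    fit = {}
    for t in (0, 1):
        m = mediator[treatment == t]
        log_pos = np.log(m[m > 0])
        fit[t] = {
            "p_nonzero": log_pos.size / m.size,
            "mu": log_pos.mean(),
            "sigma2": log_pos.var(ddof=1),
        }
    return fit


def expected_mediator(part):
    # E[M] = P(M > 0) * E[M | M > 0]; the log-normal mean is exp(mu + sigma^2 / 2).
    return part["p_nonzero"] * np.exp(part["mu"] + part["sigma2"] / 2.0)


# Simulated data with clumping at zero (all settings are hypothetical).
rng = np.random.default_rng(0)
n = 20000
treatment = rng.integers(0, 2, size=n)
p_nonzero = np.where(treatment == 1, 0.7, 0.4)   # treatment raises P(M > 0)
log_mean = np.where(treatment == 1, 1.0, 0.5)    # and shifts the positive part
is_nonzero = rng.random(n) < p_nonzero
mediator = np.where(is_nonzero, np.exp(rng.normal(log_mean, 0.5)), 0.0)

fit = fit_two_part(treatment, mediator)
# Treatment effect on the mediator mean, combining both parts of the model.
effect_on_mediator = expected_mediator(fit[1]) - expected_mediator(fit[0])
```

Separating the two parts lets the treatment act on the presence of the mediator and on its magnitude through different parameters, which a single continuous model cannot represent when many observations are exactly zero. In a high-dimensional setting, a model of this kind would be fitted per mediator and combined with an outcome model and variable selection, which this sketch does not attempt.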