Background: Causal mediation analysis is used in biomedical research to investigate causal
mechanisms that comprise both direct pathways between the treatment and outcome variables
and indirect pathways that operate through mediators. Recently,
this type of analysis has been applied in bioinformatics, where it faces the challenge of
high-dimensional, semi-continuous mediators with clumping at zero.
Methods: In this article, we develop a method for high-dimensional causal mediation analysis
based on a modeling framework that combines (i) a nonlinear model for the outcome variable,
(ii) two-part models for semi-continuous mediators with clumping at zero, and (iii) machine
learning techniques for variable selection. We evaluated the performance of the proposed
method in simulations, which show that it can provide reliable statistical information on
the causal effects with high-dimensional mediators. We apply the method to assess the
contribution of the intestinal microbiome to the risk of bacterial pathogen colonization in
older adults from US nursing homes.
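As an illustration of component (ii), the following is a minimal Python sketch of a two-part
model for a single semi-continuous mediator. It assumes a binary treatment and a log-normal
distribution for the positive values; these assumptions, the function names (fit_two_part,
expected_mediator), and the simulated data are hypothetical and are not taken from the
proposed method, which handles many mediators jointly with variable selection.

```python
import numpy as np


def fit_two_part(treatment, mediator):
    """Fit a two-part model for one semi-continuous mediator (illustrative only).

    Part 1: P(M > 0 | T = t), estimated by the proportion of nonzero values in
            each treatment group. With a single binary covariate this equals
            the fitted probability of a logistic regression.
    Part 2: log(M) | M > 0, T = t, assumed normal, estimated by the
            within-group mean and variance of the log positive values.
    """
    treatment = np.asarray(treatment)
    mediator = np.asarray(mediator, dtype=float)
    fit = {}
    for t in (0, 1):
        m = mediator[treatment == t]
        log_positive = np.log(m[m > 0])
        fit[t] = {
            "p_nonzero": log_positive.size / m.size,
            "log_mean": log_positive.mean(),
            "log_var": log_positive.var(ddof=1),
        }
    return fit


def expected_mediator(part):
    """E[M | T = t] = P(M > 0) * E[M | M > 0] under the log-normal assumption."""
    return part["p_nonzero"] * np.exp(part["log_mean"] + part["log_var"] / 2)


# Simulated data (hypothetical parameter values): the treatment raises both the
# chance that the mediator is nonzero and its level when it is nonzero.
rng = np.random.default_rng(0)
n = 20000
T = rng.integers(0, 2, size=n)
is_nonzero = rng.random(n) < np.where(T == 1, 0.7, 0.4)
positive_part = rng.lognormal(mean=np.where(T == 1, 1.0, 0.5), sigma=0.5, size=n)
M = np.where(is_nonzero, positive_part, 0.0)

fit = fit_two_part(T, M)
for t in (0, 1):
    print(t, fit[t], expected_mediator(fit[t]))
```

Modeling the zero and positive parts separately lets the treatment affect the presence of a
mediator and its abundance in different ways, which a single continuous model cannot capture
when many observations are exactly zero.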
Conclusion: The proposed high-dimensional causal mediation analysis with nonlinear models is
an innovative and reliable approach to conducting causal inference with high-dimensional mediators.