There are several effects of interest in a mediation analysis, depending on which pathway (direct/indirect) and node (treatment/mediator) we consider intervening on, and on whether we imagine keeping some aspect of the treatment fixed at a baseline/control level. Some notation: there are two treatment levels $A \in \{0, 1\}$ with an outcome $Y$ and potential outcome $Y(a)$, the outcome observed if we set $A = a$. By consistency, in our observed data $Y = Y(1)$ if $A = 1$, and similarly for $A = 0$. The mediator $M$ also has potential outcomes $M(1)$ and $M(0)$. Within mediation analysis there is a second potential outcome, $Y(a, m)$, that arises if we consider setting both $A$ and $M$ to particular values. This potential outcome also allows us to consider questions such as: what value would the outcome take if an individual is treated but the treatment-mediator pathway is "broken", denoted $Y(1, M(0))$? Other variables will be denoted $X$, $U$, ... as required. I generally assume some familiarity with causal inference; a good introduction is Hernán and Robins' book.
Some other examples of the type of questions for which causal mediation analysis is useful:
There are several quantities (statistical/causal estimands) of interest in a mediation analysis. Naming conventions differ across the literature; here I stick with Pearl (2014).
The total effect is the change in the outcome if we flip the treatment switch and aren't concerned with the mechanism of action. It is the usual average treatment effect figure we might expect to see in the headline results of an RCT. We could collapse the graph below into $A \rightarrow Y$.
The natural direct effect is the effect of flipping the treatment switch if we imagine that the indirect pathway is no longer operational.
The natural indirect effect (or average mediated effect) is the effect of flipping the treatment switch if we imagine that the direct pathway is no longer operational.
Each of these effects may be useful for different purposes. For example, the total effect may guide immediate decision making and policy - if a treatment works and is immediately needed, the mechanism of action is less important. The size of the indirect effect is useful information when considering alternative (e.g. cheaper) treatments that target the mediator. The ratio of the indirect effect to the total effect is considered the percentage of the total effect explained by the mediator.
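As a small illustration of that last quantity, here is a sketch that takes the "percentage explained" to be the indirect effect divided by the total effect, and assumes the total is the sum of the direct and indirect effects (true in the linear, no-interaction setting used later, not in general). The helper name `prop_mediated` and the input values are hypothetical.

```r
# Hypothetical helper: share of the total effect that flows through the mediator.
# Assumes total = direct + indirect (linear, no-interaction setting).
prop_mediated <- function(direct, indirect) {
  total <- direct + indirect
  if (isTRUE(all.equal(total, 0))) stop("total effect is zero; ratio undefined")
  indirect / total
}

# Illustrative values only: direct effect 1.5, indirect effect 0.5.
prop_mediated(direct = 1.5, indirect = 0.5)
```

The ratio is only easy to interpret when the direct and indirect effects share a sign; otherwise it can fall outside the range 0 to 1.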
One of the important considerations in any causal analysis is exchangeability/ignorability. Also referred to as unconfoundedness, exchangeability means that individuals in either treatment arm are a priori exchangeable or "swappable"; conditional exchangeability means that individuals are swappable within strata of covariates $X$. We want our analysis to be comparable to an RCT, in which you could have ended up in either treatment arm. What exchangeability achieves is a lack of dependence between the treatment assignment and the potential outcome under that treatment, $Y(a) \perp A$. Without it, a naive comparison of treatment arms does not estimate the causal effect.
For example, assume that in truth exercise reduces the risk of hospitalisation due to asthma, $E[Y(1)] < E[Y(0)]$, and that in practice we investigate the link using a survey of all asthmatics who have attended a clinic. However, only mild asthmatics do any exercise training ($A = 1$), and they already have a fairly low risk of hospitalisation. A naive analysis then attributes to exercise some of the difference in risk that is really due to severity, and so overstates its benefit. We have set up a scenario where our naive treatment estimator cannot equal the true treatment effect: $E[Y \mid A = 1] - E[Y \mid A = 0] \neq E[Y(1)] - E[Y(0)]$. Clearly severity of asthma would be an important adjustment, and we might be happy to consider treatment assignment random within levels of an asthma severity measure $X$, leading to conditional exchangeability, $Y(a) \perp A \mid X$.
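The asthma example can be simulated to show how far the naive contrast departs from the true effect. This is an illustrative sketch only: the variable names (`severe`, `exercise`, `hosp`), the probabilities and the effect size are all made-up assumptions, and a few severe asthmatics are allowed to exercise so that both strata contain treated and untreated individuals.

```r
set.seed(1)
N <- 100000

# Hypothetical setup: half of the clinic population has severe asthma.
severe <- rbinom(N, 1, 0.5)

# Mostly mild asthmatics exercise (assumed probabilities).
exercise <- rbinom(N, 1, ifelse(severe == 1, 0.05, 0.6))

# Assumed truth: exercise lowers hospitalisation risk by 0.05,
# severity raises it by 0.30.
true_effect <- -0.05
hosp <- rbinom(N, 1, 0.10 + true_effect * exercise + 0.30 * severe)

# Naive contrast ignores severity.
naive <- mean(hosp[exercise == 1]) - mean(hosp[exercise == 0])

# Conditional exchangeability: compare within severity strata,
# then average over the distribution of severity.
strata <- sapply(0:1, function(s) {
  mean(hosp[exercise == 1 & severe == s]) -
    mean(hosp[exercise == 0 & severe == s])
})
adjusted <- sum(strata * c(mean(severe == 0), mean(severe == 1)))

round(c(truth = true_effect, naive = naive, adjusted = adjusted), 3)
```

Under these assumptions the naive contrast is several times larger in magnitude than the true effect, while the stratified estimate should sit close to it.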
Now that we've recapped exchangeability in general, let's consider it for mediation. In particular I'm going to talk about RCTs, and so will assume the initial treatment is fully randomised and unconfounded. The issue is simply that we've randomised only the treatment and not the mediator, so any mediation analysis can still be confounded. For example, we could randomise exercise training to assess whether it reduces asthma hospitalisations, with the potential mechanism of interest being a reduction in inflammation. However, maybe our study is in a district with poor industrial pollution controls. Some individuals in our study happen to live near a factory that is, unbeknownst to them, leaking a pollutant that raises lung inflammation and increases their risk of asthma hospitalisation. As a result we have a partially confounded analysis: there is an unrecorded factor, proximity to the factory, that we won't account for in the analysis. The estimates of the indirect and direct effects will then be biased away from their true values.
Let's investigate this issue around confounding and mediation analysis using some simple linear forms for our data generation process and models. Feel free to skim the maths, all that matters is that the effects of interest turn out to be coefficients we can easily extract from a linear model.
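First let's clarify what we are attempting to estimate. Following Pearl (2014), and writing $Y(a, M(a'))$ for the outcome under treatment $a$ with the mediator at the value it would take under treatment $a'$, the total, natural direct and natural indirect effects are
$$\text{TE} = E[Y(1) - Y(0)], \qquad \text{NDE} = E[Y(1, M(0)) - Y(0, M(0))], \qquad \text{NIE} = E[Y(0, M(1)) - Y(0, M(0))].$$
Our generative model is linear, with $M = \alpha A + \varepsilon_M$ and $Y = \beta A + \gamma M + \varepsilon_Y$. Assuming no confounding in our generative model, we can estimate the total, direct and indirect effects using estimators of the following quantities. For the total effect we have
$$E[Y \mid A = 1] - E[Y \mid A = 0] = \beta + \alpha\gamma.$$
For the direct effect we have
$$E[Y \mid A = 1, M = m] - E[Y \mid A = 0, M = m] = \beta.$$
And for the indirect effect we have
$$\left(E[M \mid A = 1] - E[M \mid A = 0]\right)\gamma = \alpha\gamma.$$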
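As a short worked step, the total effect in the linear setting follows from substituting the mediator equation into the outcome equation. This assumes the linear, no-interaction model used in the simulations that follow, a randomised $A$, and independent zero-mean noise terms; the symbols $\varepsilon_M$ and $\varepsilon_Y$ are labels introduced here for that noise.

```latex
\begin{aligned}
M &= \alpha A + \varepsilon_M, \qquad Y = \beta A + \gamma M + \varepsilon_Y \\
Y &= \beta A + \gamma(\alpha A + \varepsilon_M) + \varepsilon_Y
   = (\beta + \alpha\gamma) A + \gamma\varepsilon_M + \varepsilon_Y \\
\mathbb{E}[Y \mid A = 1] - \mathbb{E}[Y \mid A = 0] &= \beta + \alpha\gamma
\end{aligned}
```

The total effect therefore splits into a direct part $\beta$ and an indirect part $\alpha\gamma$, which is why all three effects can be read off regression coefficients in this setting.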
We'll estimate these coefficients using R's lm function. Obviously in reality we'd need to worry about whether a linear model is the appropriate functional form for our analyses; see Pearl (2014) for details of more general versions of these formulas. From the graph below we see that in the unconfounded case we have unbiased estimates of our parameters - as expected!
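mediation_scen0 <- function(N, alpha, beta, gamma) {
  # Randomised binary treatment
  A <- rbinom(N, 1, 0.5)
  # Mediator depends on treatment only; outcome on treatment and mediator
  M <- alpha * A + rnorm(N)
  Y <- beta * A + gamma * M + rnorm(N)
  # Effect of A on M: difference in means
  alpha_ <- mean(M[A == 1]) - mean(M[A == 0])
  # Outcome regression gives the direct effect (A) and the M -> Y slope (M)
  mod <- lm(Y ~ A + M)
  beta_ <- as.numeric(coef(mod)["A"])
  gamma_ <- as.numeric(coef(mod)["M"])
  # Difference-in-means estimate of the total effect (kept for reference, not returned)
  tau_ <- mean(Y[A == 1]) - mean(Y[A == 0])
  c("total" = beta_ + alpha_ * gamma_,
    "direct" = beta_,
    "indirect" = alpha_ * gamma_)
}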
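A single call returns one noisy estimate, so to judge bias it helps to repeat the simulation many times and compare the average estimate with the truth. The sketch below does this for the unconfounded scenario. It uses a compact copy of the function (named `sim_once` here so the block runs on its own), and the parameter values, sample size and number of replications are arbitrary choices of mine.

```r
# Compact copy of the unconfounded scenario so the block runs on its own.
sim_once <- function(N, alpha, beta, gamma) {
  A <- rbinom(N, 1, 0.5)
  M <- alpha * A + rnorm(N)
  Y <- beta * A + gamma * M + rnorm(N)
  alpha_hat <- mean(M[A == 1]) - mean(M[A == 0])
  fit <- coef(lm(Y ~ A + M))
  c(total = unname(fit["A"] + alpha_hat * fit["M"]),
    direct = unname(fit["A"]),
    indirect = unname(alpha_hat * fit["M"]))
}

set.seed(42)
# Illustrative parameter values (my choice): alpha = 1, beta = 0.5, gamma = 2,
# so the true total, direct and indirect effects are 2.5, 0.5 and 2.
draws <- replicate(500, sim_once(N = 500, alpha = 1, beta = 0.5, gamma = 2))

# Rows of `draws` are effects, columns are replications.
summary_tab <- rbind(truth = c(total = 2.5, direct = 0.5, indirect = 2),
                     mean  = rowMeans(draws),
                     sd    = apply(draws, 1, sd))
round(summary_tab, 3)
```

The `mean` row should sit close to the `truth` row, and the `sd` row gives a sense of how much a single study of this size would vary.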
We now give $M$ and $Y$ a shared cause, another variable $U$. This results in biased estimates of the direct and indirect effects, as seen in the graph below.
In this case our estimate of the indirect effect is biased and too high, i.e. generally $\hat{\alpha}\hat{\gamma} > \alpha\gamma$ (for positive $\alpha$), while the direct effect is too low, $\hat{\beta} < \beta$. If $U$ is $>0$ ($<0$) then both $M$ and $Y$ are more likely to take a higher (lower) value, which gets absorbed into the estimate $\hat{\gamma}$.
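mediation_scen1 <- function(N, alpha, beta, gamma) {
  # Randomised binary treatment
  A <- rbinom(N, 1, 0.5)
  # U is a common cause of the mediator and the outcome
  U <- rnorm(N)
  M <- alpha * A + U + rnorm(N)
  Y <- beta * A + gamma * M + U + rnorm(N)
  # Effect of A on M: difference in means
  alpha_ <- mean(M[A == 1]) - mean(M[A == 0])
  # Outcome regression omits U, so the A and M coefficients are confounded
  mod <- lm(Y ~ A + M)
  beta_ <- as.numeric(coef(mod)["A"])
  gamma_ <- as.numeric(coef(mod)["M"])
  # Difference-in-means estimate of the total effect (kept for reference, not returned)
  tau_ <- mean(Y[A == 1]) - mean(Y[A == 0])
  c("total" = beta_ + alpha_ * gamma_,
    "direct" = beta_,
    "indirect" = alpha_ * gamma_)
}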
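The size of the bias can be worked out for the data-generating process in mediation_scen1, where $U$ and both noise terms are independent standard normals. This is my own derivation (residualising $M$ on $A$ and taking the large-sample limit of the regression coefficients), so treat it as a sketch rather than a general result.

```latex
\begin{aligned}
\tilde{M} &= M - \alpha A = U + \varepsilon_M, \qquad \operatorname{Var}(\tilde{M}) = 2 \\
\hat{\gamma} &\to \frac{\operatorname{Cov}(Y, \tilde{M})}{\operatorname{Var}(\tilde{M})}
  = \frac{2\gamma + 1}{2} = \gamma + \tfrac{1}{2} \\
\hat{\alpha}\hat{\gamma} &\to \alpha\gamma + \tfrac{\alpha}{2}, \qquad
\hat{\beta} \to (\beta + \alpha\gamma) - \alpha\left(\gamma + \tfrac{1}{2}\right) = \beta - \tfrac{\alpha}{2}
\end{aligned}
```

So with unit variances the indirect effect is overstated by $\alpha/2$ and the direct effect understated by the same amount; the direction flips if $\alpha$ is negative. The total effect is unaffected, because $A$ is randomised and independent of $U$.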
If we knew there was confounding of $M$ and $Y$ by $U$, and we measured $U$, we could estimate the direct and indirect effects while controlling for $U$. This would fix our biases - see the graph!
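mediation_scen2 <- function(N, alpha, beta, gamma) {
  # Randomised binary treatment
  A <- rbinom(N, 1, 0.5)
  # U is a common cause of the mediator and the outcome
  U <- rnorm(N)
  M <- alpha * A + U + rnorm(N)
  Y <- beta * A + gamma * M + U + rnorm(N)
  # Effect of A on M: difference in means
  alpha_ <- mean(M[A == 1]) - mean(M[A == 0])
  # Outcome regression now adjusts for the measured confounder U
  mod <- lm(Y ~ A + M + U)
  beta_ <- as.numeric(coef(mod)["A"])
  gamma_ <- as.numeric(coef(mod)["M"])
  # Difference-in-means estimate of the total effect (kept for reference, not returned)
  tau_ <- mean(Y[A == 1]) - mean(Y[A == 0])
  c("total" = beta_ + alpha_ * gamma_,
    "direct" = beta_,
    "indirect" = alpha_ * gamma_)
}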
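To see the two confounded analyses side by side, here is a sketch that folds them into one function with a switch for whether $U$ is adjusted for. The function name `sim_confounded`, the `adjust` argument and the parameter values are my own choices for illustration; a large sample is used so that a single run is close to the limiting values.

```r
# Hypothetical helper combining the two confounded scenarios: `adjust` toggles
# whether U enters the outcome regression.
sim_confounded <- function(N, alpha, beta, gamma, adjust) {
  A <- rbinom(N, 1, 0.5)
  U <- rnorm(N)
  M <- alpha * A + U + rnorm(N)
  Y <- beta * A + gamma * M + U + rnorm(N)
  alpha_hat <- mean(M[A == 1]) - mean(M[A == 0])
  fit <- if (adjust) coef(lm(Y ~ A + M + U)) else coef(lm(Y ~ A + M))
  c(direct = unname(fit["A"]), indirect = unname(alpha_hat * fit["M"]))
}

set.seed(2024)
# Illustrative values (my choice): true direct = 1, true indirect = 1 * 1 = 1.
unadjusted <- sim_confounded(1e5, alpha = 1, beta = 1, gamma = 1, adjust = FALSE)
adjusted   <- sim_confounded(1e5, alpha = 1, beta = 1, gamma = 1, adjust = TRUE)
round(rbind(truth = c(direct = 1, indirect = 1), unadjusted, adjusted), 2)
```

With these values the unadjusted direct and indirect estimates should land near 0.5 and 1.5 respectively, while the adjusted estimates should both be close to 1.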
This post was a quick introduction to mediation analysis and one of the potential issues that can crop up - confounding of the mediator and outcome. Measure your confounders, achieve anything. Thanks for reading.
Pearl, J. (2014). Interpretation and identification of causal mediation. Psychological Methods, 19(4), 459. https://ftp.cs.ucla.edu/pub/stat_ser/r389.pdf