Outcome regression

In this post, we consider the task of estimating treatment effects. This is the basic problem in causal inference, and it arises in many areas of science and engineering. As a running example, we consider the task of estimating the efficacy of a vaccine booster. We begin by mathematically defining treatment effects using the potential outcomes framework.

To keep things simple, we focus on estimating the effect of a binary treatment (e.g. booster vs no booster). We define two potential outcomes $Y_i(1)$ and $Y_i(0)$ for each subject in the study. In the running example, $Y_i(1)$ is the viral load in the $i$-th subject if the subject got the booster, and $Y_i(0)$ is the viral load if the subject did not get the booster. The effect of the treatment on the $i$-th subject is

\[\Delta_i \triangleq Y_i(1) - Y_i(0).\]

The fundamental challenge in causal inference is that only one treatment can be assigned to a subject, so only one of $Y_i(1)$ and $Y_i(0)$ can be observed. Thus $\Delta_i$ is never observed. Nevertheless, it is possible (as we shall see) to estimate the average treatment effect (ATE)

\[\tau \triangleq \Ex\big[\Delta_i\big] = \Ex\big[Y_i(1)\big] - \Ex\big[Y_i(0)\big]\]

by performing randomized experiments.

In a randomized experiment, we randomly assign treatments to the subjects and record the outcomes. Let $W_i\in\{0,1\}$ and $Y_i$ be the treatment assignment and observed outcome of the $i$-th subject. In the running example, $W_i$ indicates whether the $i$-th subject got the booster and $Y_i$ is the (observed) viral load in the $i$-th subject. Mathematically, in a randomized experiment, we have

\[\begin{aligned} Y_i = Y_i(W_i) && \text{(SUTVA),}\\ (Y_i(1),Y_i(0)) \ind W_i && \text{(random treatment assignment).} \end{aligned}\]

The first condition (SUTVA) relates the observed outcomes to the potential outcomes: the observed outcome $Y_i$ of the $i$-th subject is $Y_i(1)$ (resp. $Y_i(0)$) if $W_i = 1$ (resp. $W_i = 0$). The second condition says that treatments are assigned in a way that does not depend on the potential outcomes. It implies that the distributions of the potential outcomes in the treated and untreated groups are identical:

\[(Y_i(1),Y_i(0)) \mid\{W_i = 1\} \overset{d}{=} (Y_i(1),Y_i(0)) \mid\{W_i = 0\}.\]

In practice, treatments are often assigned randomly (e.g. by flipping a coin) to satisfy this condition.
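To make the two conditions concrete, here is a minimal simulation sketch of a randomized experiment. The outcome model (normally distributed log viral loads and a booster that lowers them by about one unit) is made up purely for illustration, and the variable names `y0`, `y1`, `w`, `y` are our own stand-ins for $Y_i(0)$, $Y_i(1)$, $W_i$, $Y_i$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical potential outcomes (say, log viral load); the numbers are invented.
y0 = rng.normal(loc=5.0, scale=1.0, size=n)      # Y_i(0): no booster
y1 = y0 - 1.0 + rng.normal(scale=0.5, size=n)    # Y_i(1): booster

# Random treatment assignment: a fair coin flip, drawn independently
# of the potential outcomes.
w = rng.binomial(1, 0.5, size=n)

# SUTVA: the observed outcome is the potential outcome under the
# assigned treatment, Y_i = Y_i(W_i). The other one stays hidden.
y = np.where(w == 1, y1, y0)

# Only available because we simulated both potential outcomes.
true_ate = np.mean(y1 - y0)

# Under randomization, each potential outcome has (approximately) the
# same mean in the treated and untreated groups.
gap_y0 = y0[w == 1].mean() - y0[w == 0].mean()
gap_y1 = y1[w == 1].mean() - y1[w == 0].mean()

print(f"ATE in this simulated sample: {true_ate:.3f}")
print(f"between-group gap in Y(0): {gap_y0:.3f}, in Y(1): {gap_y1:.3f}")
```

In a real study only `w` and `y` are recorded; `y0`, `y1` and `true_ate` are visible here only because the data are simulated.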

Difference-in-means

A simple estimate of the ATE in a randomized experiment is the difference between the (sample) mean outcomes in treated and untreated subjects:

\[\def\DM{\text{DM}} \def\htau{\widehat{\tau}} \htau_{\DM} = \frac{1}{n_1}\sum_{i=1}^nY_i\ones\{W_i = 1\} - \frac{1}{n_0}\sum_{i=1}^nY_i\ones\{W_i = 0\},\]

where $n_w \triangleq \sum_{i=1}^n\ones\{W_i = w\}$ is the number of subjects assigned treatment $w\in\{0,1\}$. This is called the difference-in-means estimator, and it is motivated by the observation that the (sample) mean outcome in a treatment group is an unbiased estimate of the expected potential outcome in a randomized experiment:

\[\begin{aligned} &\Ex\left[\frac{1}{n_w}\sum_{i=1}^nY_i\ones\{W_i = w\}\right] \\ &\quad= \Ex\big[Y_i\mid W_i = w\big] \\ &\quad= \Ex\big[Y_i(w)\mid W_i = w\big] & & \text{(SUTVA)} \\ &\quad= \Ex\big[Y_i(w)\big] & & \text{(random treatment assignment)}. \end{aligned}\]

In light of this observation, it is not hard to see that the difference-in-means estimator is unbiased:

\[\begin{aligned} \Ex\big[\htau_{\DM}\big] &= \Ex\left[\frac{1}{n_1}\sum_{i=1}^nY_i\ones\{W_i = 1\}\right] - \Ex\left[\frac{1}{n_0}\sum_{i=1}^nY_i\ones\{W_i = 0\}\right] \\ &= \Ex\big[Y_i(1)\big] - \Ex\big[Y_i(0)\big] \\ &= \tau. \end{aligned}\]
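The estimator and its unbiasedness can be checked numerically. The sketch below implements $\htau_{\DM}$ and averages it over many simulated experiments; the function names `difference_in_means` and `simulate_once`, and the data-generating process (a constant effect of $-1$ with standard normal noise), are assumptions made for illustration.

```python
import numpy as np

def difference_in_means(y, w):
    """Mean observed outcome among treated minus mean among untreated."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w)
    treated = w == 1
    control = w == 0
    if not treated.any() or not control.any():
        raise ValueError("both treatment groups must be non-empty")
    return y[treated].mean() - y[control].mean()

def simulate_once(rng, n=200, tau=-1.0):
    """One hypothetical randomized experiment with true ATE equal to tau."""
    y0 = rng.normal(5.0, 1.0, size=n)   # Y_i(0)
    y1 = y0 + tau                       # Y_i(1), constant effect for simplicity
    w = rng.binomial(1, 0.5, size=n)    # coin-flip assignment
    y = np.where(w == 1, y1, y0)        # observed outcome Y_i(W_i)
    return difference_in_means(y, w)

rng = np.random.default_rng(1)
estimates = np.array([simulate_once(rng) for _ in range(2000)])

# Individual estimates fluctuate, but their average should sit close to tau = -1.
print(f"mean of estimates: {estimates.mean():.3f}")
print(f"std of estimates:  {estimates.std(ddof=1):.3f}")
```

Any single estimate can miss the true ATE by a noticeable margin; unbiasedness is a statement about the average over repeated experiments.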

We leave it as an exercise to show that $\htau_{\DM}$ is asymptotically normal:

\[\sqrt{n_1 + n_0}(\htau_{\DM} - \tau) \dto N\left(0,\frac{\sigma_1^2}{\pi_1} + \frac{\sigma_0^2}{\pi_0}\right),\]

where $\sigma_w^2 \triangleq \var\big[Y_i(w)\big]$ and $\pi_w \triangleq \Pr\{W_i = w\}$ for $w\in\{0,1\}$. This result allows us to form confidence intervals and test hypotheses regarding the ATE.
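As a sketch of how the limit is used, one can plug the sample variances in for $\sigma_w^2$ and $n_w/n$ in for $\pi_w$, which gives the standard error $\sqrt{\widehat{\sigma}_1^2/n_1 + \widehat{\sigma}_0^2/n_0}$ and a normal-approximation interval. The helper name `dm_confidence_interval`, the default critical value 1.96 (roughly a 95% interval), and the simulated data are assumptions for illustration.

```python
import math
import numpy as np

def dm_confidence_interval(y, w, z=1.96):
    """Difference-in-means estimate with a normal-approximation interval.

    Plug-in standard error: sqrt(s1^2 / n1 + s0^2 / n0), where s_w^2 is the
    sample variance of the observed outcomes in group w.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w)
    y1, y0 = y[w == 1], y[w == 0]
    n1, n0 = len(y1), len(y0)
    if n1 < 2 or n0 < 2:
        raise ValueError("need at least two subjects in each group")
    tau_hat = y1.mean() - y0.mean()
    se = math.sqrt(y1.var(ddof=1) / n1 + y0.var(ddof=1) / n0)
    return tau_hat, tau_hat - z * se, tau_hat + z * se

# Hypothetical experiment: the true ATE is -1 and the noise is standard normal.
rng = np.random.default_rng(2)
n = 1000
w = rng.binomial(1, 0.5, size=n)
y = 5.0 - 1.0 * w + rng.normal(size=n)

tau_hat, lo, hi = dm_confidence_interval(y, w)
print(f"estimate: {tau_hat:.3f}, interval: [{lo:.3f}, {hi:.3f}]")
```

The interval is justified only asymptotically, so it can be unreliable when either group is small.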

Posted on December 11, 2021 from Ann Arbor, MI