## R Implementation of the 2-Stage Regression Estimator of the Structural Nested Mean Model for Examining Time-varying Causal Effect Moderation

#### Accompanying Articles

This R code accompanies the following journal articles:

• Almirall D, Ten Have T, Murphy SA. (2009) Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Biometrics. Volume 66, Issue 1, p.131-139, March.
• Almirall D, McCaffrey DF, Ramchand R, Murphy SA. (submitted) Subgroups Analysis when Treatment and Moderators are Time-varying.

#### Overview

This R code implements a simple version of the two-stage regression with residuals estimator (Almirall, et al. (2009)) of Robins' Structural Nested Mean Model (SNMM). It returns an estimate of the asymptotic variance-covariance matrix (and therefore standard errors) for the estimated SNMM parameters (stage-2), taking into account sampling error in the estimation of the stage-1 parameters. The SNMM, as described in Almirall, et al.(2009), is useful when scientists are interested in understanding causal effect moderation in settings in which the treatment (or primary exposure) of interest is time-varying and so are covariates hypothesized to moderate its effects.

#### What is R?

R is a language and environment for statistical computing and graphics. You will need to install R in order to use the twostage.snmm() function. R can be downloaded for free by visiting the R-Project Home Page.

This file contains the R function twostage.snmm() which implements the 2-stage regression estimator. The header at the beginning of this file explains how to specify each of the function arguments. These include specifying the stage 1 models for regressions of the moderator (one for each moderator at each time point) given the past (modgivpast argument), specifying additional terms (coefnuisfunc argument) which are multiplied with residuals from the models of the moderator given the past to form models for the (nuisance) error terms corresponding to each moderator, and specifying models for the intermediate causal effect functions (intermeff argument).

#### A Worked Example

In the following, we describe how we fit the SNMM discussed in the article by Almirall D, McCaffrey DF, Ramchand R, Murphy SA. (submitted), cited above, using the 2-stage regression with residuals estimator as implemented in twostage.snmm(). We cannot provide you with the data for you to carry out this exact example, but after following the code and description below alongside the paper, you should be prepared to use twostage.snmm() in your own research. In the article, we described how to use a SNMM to examine two scientific questions: (1) "What is the impact of receiving treatment during months 1-3 versus not receiving treatment during months 1-3 (nor during months 7-9) on end-of-study environmental risk outcomes as a function of baseline severity?" and (2) "What is the impact of receiving treatment during months 7-9 versus not receiving treatment during months 7-9 on end-of-study environmental risk outcomes as a function of baseline severity, treatment received between 1-3 months, and severity during months 4-6?" A description of the variables (s0,a1,s1,a2,y) in the data set dat1 used in the illustrative example follows:

s0
binary variable measuring baseline (month 0) severity: 0=low severity, 1=high severity
a1
binary variable measuring receipt of substance use treatment at some point during months 1-3: 0=no, 1=yes
s1
binary variable measuring severity during the 4-6 month interval : 0=low severity, 1=high severity
a2
binary variable measuring receipt of substance use treatment at some point during months 7-9: 0=no, 1=yes
y
continuous outcome variable measuring environmental risk for substance use (larger values of y indicates poor outcomes)
The following code snippet was used to estimate the saturated SNMM presented in the article using the 2-stage estimator (text to the right of the # sign are comments designed to help the reader further connect the R code with Equation 4 in the paper):


snmm1 <- twostage.snmm(
modgivpast   = list(
list(s0 ~ 1, 0),                # used to estimate \delta_{S_0}
list(s1 ~ s0 + a1 + a1:s0, 0)   # used to estimate \delta_{S_1}
),
coefnuisfunc = list(
~1,                             # \eta_1
~s0 + a1 + a1:s0                # \eta_2 + \eta_3 s0 + \eta_4 s0 a1
),
intermeff    = list(
list(~a1, ~s0 ),                # model for \mu_1: \beta_{10} + \beta_{11} s0
list(~a2, ~s0 + s1 + s0:s1 +    # analogous model for \mu_2
a1 + s0:a1 + a1:s1 +
s0:a1:s1)
),
response     = ~y,                       # the outcome
dat          = dat1,                     # the data set
verbose      = T,
full.return  = T,
nmods.vec    = c(1,1)                    # 1 moderator at time 1; 1 moderator at time 2
)


The code above produced the following output, which includes the results reported in Table 1 of the article:


*********************************************************
2-Stage Estimator of the Structural Nested Mean Model
*********************************************************

Outcome...................................... y
Sample size.................................. 2870
Number of Time-points........................ 2
Number of Stage-2 Parameters................. 16
Number of Stage-2 Nuisance Parameters........ 5
Number of Causal Parameters, plus Intercept.. 11

---------------------------------------------------------
Marginal mean response under no treatment
-----------------------------------------
Estimate    ASE z value Pr(>|z|)
(SNMM Intercept)       39.715 1.3359 29.7287        0

---------------------------------------------------------
Intermediate causal effect function at time 1
----------------------------------------------
Treatment variable: a1
Estimate    ASE z value Pr(>|z|)
(Causal 1 Intercept)  -2.6850 1.5831 -1.6960   0.0899
s0                    -4.9193 3.0396 -1.6184   0.1056

---------------------------------------------------------
Intermediate causal effect function at time 2
----------------------------------------------
Treatment variable: a2
Estimate     ASE z value Pr(>|z|)
(Causal 2 Intercept)  -6.6851  4.2766 -1.5632   0.1180
s0                     2.7402  5.7166  0.4793   0.6317
s1                    22.3874 17.2420  1.2984   0.1941
s0:s1                -28.1620 19.1436 -1.4711   0.1413
a1                     1.2861  4.4446  0.2894   0.7723
s0:a1                 -0.2535  6.0471 -0.0419   0.9666
s1:a1                -27.0626 17.6041 -1.5373   0.1242
s0:s1:a1              35.7499 19.8079  1.8048   0.0711

---------------------------------------------------------
Non-causal nuisance terms at time 1
------------------------------------
De-meaned moderator: e.s0
Estimate    ASE z value Pr(>|z|)
(Nuis (1,1) Intcpt)   7.3954 2.8876  2.5611   0.0104

---------------------------------------------------------
Non-causal nuisance terms at time 2
------------------------------------
De-meaned moderator: e.s1
Estimate    ASE z value Pr(>|z|)
(Nuis (2,1) Intcpt)  15.6626 4.1926  3.7357   0.0002
s0                    1.5247 6.3846  0.2388   0.8113
a1                   -0.4813 4.5343 -0.1061   0.9155
s0:a1                -6.0164 6.7970 -0.8852   0.3761


From the output above, we can see that the estimated effect of a1 on y among the subgroup of individuals who had low severity at baseline (s0=0) is -2.69 (ASE=1.58; p-value=0.09). Similarly, the estimated effect of a2 on y among the subgroup of individuals who had low severity at baseline (s0=0), did not receive treatment between months 1-3 (a1=0), and had low severity between months 4-6 (s1=0) is -6.69 (ASE=4.28; p-value=0.12). The effect of a1 on y for s0=1 versus s0=0 differs by -4.92 (p-value=0.11), suggesting that s0 is a moderator of the effect of a1 on y. (Since the data in this example arise from an observational study, we remind the reader (as we do in the article) that in order to interpret the estimated intermediate effects above as causal, we must satisfy the No Direct Confounders Assumption (see Appendix A of the article). This assumption is likely not met in this illustrative example since there may be additional variables other than s(t-1) that may be directly related to both a(t) and y. Additional baseline and time-varying, continuous and discrete, covariates could be included in the model fitted above to achieve this assumption, but we omit doing this here for expositional simplicity.)

The fitted object snmm1 contains additional information, including the estimates from the stage 1 regression (in snmm1$stage1), the estimated asymptotic variance-covariance matrix of the stage 2 parameters (in snmm1$vcov.theta), and a more succinct summary of the output above (in snmm1$ans). Inference Concerning Linear Combinations of the SNMM Parameters: You can use snmm1$vcov.theta to make inference about linear combinations of the parameters of interest (e.g., What is the effect of a1 on y among the subgroup of individuals who had high severity at baseline (s0=1)?). You will need to know in what order the stage 2 parameters are stored in the variance-covariance matrix (the ordering may be different from the order in which they are presented in the output above) to know how which parameters to pick out for the linear combinations of interest (see L below); you can get the order by looking at snmm1$ans. The following code snippet returns estimates for all possible contrasts (linear combinations) of interest, p-values for the test that the contrasts are zero, 95% confidence intervals, and effect sizes for each of the effects:  L <- rbind( c(0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0), c(0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0), c(0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0), c(0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0), c(0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0), c(0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0), c(0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0), c(0,0,0,1,0,1,0,1,0,1,0,0,0,0,0,0), c(0,0,0,1,1,1,1,0,0,0,0,0,0,0,0,0), c(0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0) ) Lnames <- c("effect of a1 (no future txt), among s0=0", "effect of a1 (no future txt), among s0=1", "effect of a2, among s0=0, a1=0, s1=0", "effect of a2, among s0=0, a1=1, s1=0", "effect of a2, among s0=1, a1=0, s1=0", "effect of a2, among s0=1, a1=1, s1=0", "effect of a2, among s0=0, a1=0, s1=1", "effect of a2, among s0=0, a1=1, s1=1", "effect of a2, among s0=1, a1=0, s1=1", "effect of a2, among s0=1, a1=1, s1=1" ) rownames(L) <- Lnames; V <- as.matrix(snmm2.1$vcov.theta)
V.L <-  (L%*%V)%*%t(L); se.L <- sqrt( diag(V.L) )   # variance-cov matrix and standard errors
est.L <- L %*% snmm1$ans[,1] lowCI <- est.L - 1.96*se.L # lower 95% confidence bound uppCI <- est.L + 1.96*se.L # upper 95% confidence bound waldval <- est.L/se.L # wald statistic pval <- 2*pnorm(abs(waldval),lower.tail=F) # p-value for test that linear contrast = 0 sdY <- sd(dat1$y)
effsz <- est.L/sdY                                  # get effect sizes by scaling effects by SD(y)
es.lowCI <- lowCI/sdY
es.uppCI <- uppCI/sdY
res2 <- data.frame(EST=est.L, SE=se.L, CI.LOWER=lowCI, CI.UPPER=uppCI, PVALUE=pval,
EFFSIZE=effsz, ES.CI.LOWER=es.lowCI,ES.CI.UPPER=es.uppCI)
cat("\nLinear Contrasts of Interest\n")
print( round(res2[,1:5],3) ); cat("\n"); print( round(res2[,6:8],3) )


The output is:


Linear Contrasts of Interest
EST     SE CI.LOWER CI.UPPER PVALUE
effect of a1 (no future txt), among s0=0  -2.685  1.583   -5.788    0.418  0.090
effect of a1 (no future txt), among s0=1  -7.604  2.595  -12.690   -2.519  0.003
effect of a2, among s0=0, a1=0, s1=0      -6.685  4.277  -15.067    1.697  0.118
effect of a2, among s0=0, a1=1, s1=0      -5.399  1.210   -7.771   -3.026  0.000
effect of a2, among s0=1, a1=0, s1=0      -3.945  3.793  -11.380    3.490  0.298
effect of a2, among s0=1, a1=1, s1=0      -2.912  1.556   -5.963    0.138  0.061
effect of a2, among s0=0, a1=0, s1=1      15.702 16.703  -17.036   48.441  0.347
effect of a2, among s0=0, a1=1, s1=1     -10.074  3.340  -16.620   -3.528  0.003
effect of a2, among s0=1, a1=0, s1=1      -9.719  7.403  -24.229    4.790  0.189
effect of a2, among s0=1, a1=1, s1=1       0.000  3.292   -6.451    6.452  1.000

EFFSIZE ES.CI.LOWER ES.CI.UPPER
effect of a1 (no future txt), among s0=0  -0.128      -0.276       0.020
effect of a1 (no future txt), among s0=1  -0.362      -0.604      -0.120
effect of a2, among s0=0, a1=0, s1=0      -0.318      -0.717       0.081
effect of a2, among s0=0, a1=1, s1=0      -0.257      -0.370      -0.144
effect of a2, among s0=1, a1=0, s1=0      -0.188      -0.542       0.166
effect of a2, among s0=1, a1=1, s1=0      -0.139      -0.284       0.007
effect of a2, among s0=0, a1=0, s1=1       0.748      -0.811       2.306
effect of a2, among s0=0, a1=1, s1=1      -0.480      -0.791      -0.168
effect of a2, among s0=1, a1=0, s1=1      -0.463      -1.154       0.228
effect of a2, among s0=1, a1=1, s1=1       0.000      -0.307       0.307


The results suggest that treatment reduces environmental risk for substance use y except in the following two subgroups in which the impact of treatment during months 7-9 was not statistically significant: in the (s0,a1,s1)=(0,0,1) subgroup, the estimated effect of treatment was harmful and large (p-value=0.35); and in the (s0,a1,s1)=(1,1,1) subgroup, the estimated effect in this group was zero.

Considering a More Parsimonious Model: The saturated SNMM that was fit above (snmm1) (and considered in the article) was chosen for expositional simplicity, i.e., to illustrate the methodology. However, investigators may want to consider estimating a more parsimonious SNMM with fewer parameters. In the model fitted above (snmm1), a variety of parameters were not significant, even at conservative levels (say, using a Type-I error rates of 15%). While formal methods for model selection have not yet been developed and evaluated specifically for the SNMM using the 2-stage regression estimator, we can employ the popular, ad-hoc (but useful) approach of removing terms with small to moderate Wald statistics (say, with |z value| < 1.0). We employ this approach in the following reduced-form model fit:


snmm2 <- twostage.snmm(
modgivpast   = list(
list(s0 ~ 1, 0),
list(s1 ~ s0 + a1 + a1:s0, 0)
),
coefnuisfunc = list(
~1,
~1                                # note reduced-form
),
intermeff    = list(
list(~a1, ~s0 ),
list(~a2, ~s1 + s0:s1 + a1:s1 +   # note reduced-form
s0:a1:s1)
),
response     = ~y,
dat          = dat1,
verbose      = T,
full.return  = T,
nmods.vec    = c(1,1)
)


Note that this SNMM is no longer saturated, as we are now making modeling assumptions on the data--that is, there are now fewer parameters than there are subgroup combinations of (s0,a1,s1,a2). The output from this model fit was:


*********************************************************
2-Stage Estimator of the Structural Nested Mean Model
*********************************************************

Outcome...................................... y
Sample size.................................. 2870
Number of Time-points........................ 2
Number of Stage-2 Parameters................. 10
Number of Stage-2 Nuisance Parameters........ 2
Number of Causal Parameters, plus Intercept.. 8

---------------------------------------------------------
Marginal mean response under no treatment
-----------------------------------------
Estimate    ASE z value Pr(>|z|)
(SNMM Intercept)      39.4425 1.2017 32.8219        0

---------------------------------------------------------
Intermediate causal effect function at time 1
----------------------------------------------
Treatment variable: a1
Estimate    ASE z value Pr(>|z|)
(Causal 1 Intercept)  -2.5743 1.4744 -1.7460   0.0808
s0                    -4.4462 2.6738 -1.6629   0.0963

---------------------------------------------------------
Intermediate causal effect function at time 2
----------------------------------------------
Treatment variable: a2
Estimate     ASE z value Pr(>|z|)
(Causal 2 Intercept)  -4.2998  0.9303 -4.6221   0.0000
s1                    22.2760 16.3573  1.3618   0.1732
s1:s0                -25.2739 17.6340 -1.4333   0.1518
s1:a1                -26.3297 16.5613 -1.5898   0.1119
s1:s0:a1              31.4240 18.1437  1.7320   0.0833

---------------------------------------------------------
Non-causal nuisance terms at time 1
------------------------------------
De-meaned moderator: e.s0
Estimate    ASE z value Pr(>|z|)
(Nuis (1,1) Intcpt)   7.6182 2.5355  3.0046   0.0027

---------------------------------------------------------
Non-causal nuisance terms at time 2
------------------------------------
De-meaned moderator: e.s1
Estimate    ASE z value Pr(>|z|)
(Nuis (2,1) Intcpt)  13.3313 1.0962 12.1618        0


And the subgroup specific effects (linear combinations of interest) are now:


Linear Contrasts of Interest
EST     SE CI.LOWER CI.UPPER PVALUE
effect of a1 (no future txt), among s0=0 -2.574  1.474   -5.464    0.316  0.081
effect of a1 (no future txt), among s0=1 -7.021  2.233  -11.397   -2.644  0.002
effect of a2, among s0=0, a1=0, s1=0     -4.300  0.930   -6.123   -2.476  0.000
effect of a2, among s0=0, a1=1, s1=0     -4.300  0.930   -6.123   -2.476  0.000
effect of a2, among s0=1, a1=0, s1=0     -4.300  0.930   -6.123   -2.476  0.000
effect of a2, among s0=1, a1=1, s1=0     -4.300  0.930   -6.123   -2.476  0.000
effect of a2, among s0=0, a1=0, s1=1     17.976 16.322  -14.016   49.968  0.271
effect of a2, among s0=0, a1=1, s1=1     -8.353  3.118  -14.466   -2.241  0.007
effect of a2, among s0=1, a1=0, s1=1     -7.298  6.777  -20.580    5.985  0.282
effect of a2, among s0=1, a1=1, s1=1     -2.203  3.193   -8.462    4.055  0.490

EFFSIZE ES.CI.LOWER ES.CI.UPPER
effect of a1 (no future txt), among s0=0  -0.123      -0.260       0.015
effect of a1 (no future txt), among s0=1  -0.334      -0.543      -0.126
effect of a2, among s0=0, a1=0, s1=0      -0.205      -0.292      -0.118
effect of a2, among s0=0, a1=1, s1=0      -0.205      -0.292      -0.118
effect of a2, among s0=1, a1=0, s1=0      -0.205      -0.292      -0.118
effect of a2, among s0=1, a1=1, s1=0      -0.205      -0.292      -0.118
effect of a2, among s0=0, a1=0, s1=1       0.856      -0.667       2.379
effect of a2, among s0=0, a1=1, s1=1      -0.398      -0.689      -0.107
effect of a2, among s0=1, a1=0, s1=1      -0.347      -0.980       0.285
effect of a2, among s0=1, a1=1, s1=1      -0.105      -0.403       0.193


The subgroup effects of a1 do not change much in the reduced model: treatment during months 1-3 reduces environmental risk for substance use, and the effect is strongest among subjects exhibiting higher severity at baseline. Concerning the effects of a2: Since we have removed terms from the model for the intermediate causal effect at time t=2, we now see that the impact of a2 on y is the same among certain subgroups of subjects. In particular, the results suggest that s1 is the primary moderator of the impact of a2 on y. There is a beneficial effect of treatment during months 7-9, and it is stronger in general among subjects with high intermediate severity (s1=1) (it almost doubles). As with the saturated SNMM, there are two exceptions: subjects who had high severity at baseline, were treated during months 1-3, and continued to have high severity during months 4-6 (for this subgroup, the effect is beneficial, but small and non-significant); and subjects who at intake had low severity, did not receive treatment during months 1-3, but whose severity was high during months 4-6 (for this group, we see a large, harmful effect of treatment, but it is not significant).

#### Disclaimer

This R code is provided with no guarantees. It is an alpha version, written originally for methodological research purposes only. Therefore, it may not be user-friendly and may be difficult to use. If you find bugs/errors in this code, please email me (dalmiral [arroba] umich [punto] edu): I will fix the code and then acknowledge you on this website.

#### How To Cite This Work

If you use the R code or any of this work in your own research, we would greatly appreciate it if you cite any one (or both) of the articles listed at the top of this web page. Thank you very much!