Defining counterfactual dynamic treatment nodes with multiple dummy exposure

Question

Soudi00 opened this issue 7 years ago

Multivaraite_TRT__4_dummy_exposure_.docx

Install stremr (package version 0.8.99) and load the libraries


# ----------------------------------------------------------------------
# Install stremr version 0.8.99
# ----------------------------------------------------------------------
#knitr::opts_chunk$set(echo = TRUE)

library(devtools)
#install_github("osofr/stremr", ref = "experimental_master")

# ----------------------------------------------------------------------
# Load libraries
# ----------------------------------------------------------------------
library(stremr)
library(data.table)
library(magrittr)
library(h2o)
options(stremr.verbose=TRUE)
sessionInfo()

Get the source data from another GitHub repository

library(repmis)

source_data("https://github.com/Soudi00/Multi-Treatment-Causal-Modeling/blob/master/sampleAD.RData?raw=True")

# Key columns must be given as quoted names
AD <- as.data.table(AD, key = c("ID", "SEQ"))

# I have 4 different treatments. Should I use only 3 of the dummies in importData, or should I include all of them?


# ----------------------------------------------------------------------
# Define intervention (always on TRT1):
# ----------------------------------------------------------------------
AD[, ("zero.set1") := 0L]
AD[, ("zero.set2") := 0L]
AD[, ("zero.set3") := 0L]
AD[, ("TRT1.set") := 1L]

# ----------------------------------------------------------------------
# Import data into a stremr object
# ----------------------------------------------------------------------
OData.1  <-  importData(AD, ID = "ID", t_name = "SEQ", 
                        covars = c("CAT_VAR1","CAT_VAR2","CONT_VAR1"),           
                        CENS = c("CNS","ADM_CNS"), 
                        TRT = c("TRT1","TRT2","TRT3","TRT4"),
                        MONITOR = NULL, OUTCOME = "STATUS",
                        weights = NULL, remove_extra_rows = TRUE,
                        verbose = getOption("stremr.verbose"))

# ----------------------------------------------------------------------
# Look at the input data object
# ----------------------------------------------------------------------
print(OData.1)

# ----------------------------------------------------------------------
# Access the input data
# ----------------------------------------------------------------------
get_data(OData.1)

# ----------------------------------------------------------------------
# Model the right censoring, administrative censoring, and exposure
# ----------------------------------------------------------------------
gform_CENS <- "CNS + ADM_CNS ~ CAT_VAR1 + CONT_VAR1"
gform_TRT <- "TRT1+TRT2+TRT3+TRT4 ~ CAT_VAR1 + CAT_VAR2 + CONT_VAR1"

# ----------------------------------------------------------------------
# Fit Propensity Scores
# ----------------------------------------------------------------------

OData.1 <- fitPropensity(OData.1, gform_CENS = gform_CENS, gform_TRT = gform_TRT)

What should the dimension of intervened_TRT be when the treatment is defined by multiple dummy variables?

I have defined my own dynamic treatment patterns of interest (5 dummy variables, one for each of the 5 patterns). They are:

Always TRT1 (PATH1)
Always TRT2 (PATH2)
Always TRT3 (PATH3)
Start TRT1, switch at any time to TRT3 (PATH4)
Start TRT2, switch at any time to TRT3 (PATH5)
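As a rough sketch of how a "follows the pattern so far" indicator for a switching pattern such as PATH4 might be built, the toy example below uses made-up data and a made-up column name (PATH4.follow). It assumes one possible reading of PATH4: the subject starts on TRT1, is on exactly one of TRT1 or TRT3 at every time point, and never returns to TRT1 after switching (subjects who have not yet switched still count as followers). The actual PATH columns in the data may be defined differently.

```r
library(data.table)

# Hypothetical toy data restricted to the two treatments involved in PATH4
toy <- data.table(
  ID   = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  SEQ  = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  TRT1 = c(1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L),
  TRT3 = c(0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L)
)
setkey(toy, ID, SEQ)

# Assumed reading of PATH4 (not taken from the original post):
#   - first observed treatment is TRT1
#   - exactly one of TRT1 / TRT3 is active at each time point
#   - once TRT3 has been observed, the subject stays on TRT3
toy[, ok.t := as.integer(TRT1[1] == 1L &
                         (TRT1 + TRT3) == 1L &
                         cummax(TRT3) == TRT3), by = ID]

# 1 while the observed history is still consistent with the pattern, 0 afterwards
toy[, PATH4.follow := cumprod(ok.t), by = ID]
print(toy)
```

In this toy data, subject 1 switches once and keeps following, subject 2 switches back to TRT1 and stops following at the third time point, and subject 3 never starts on TRT1.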

# ----------------------------------------------------------------------
#  Error: length(intervened_NODE) not equal to length(NodeNames)
# ----------------------------------------------------------------------

wts.DT.1 <- getIPWeights(OData = OData.1, intervened_TRT="PATH1")

# ----------------------------------------------------------------------
# Error in modelfit.g$getPsAsW.models()[[i]] : subscript out of bounds
# ----------------------------------------------------------------------

wts.DT.1 <- getIPWeights(OData = OData.1, intervened_TRT=c("TRT1.set","zero.set1","zero.set2","zero.set3"))

# ----------------------------------------------------------------------
# Using a different intervened_TRT didn't make a difference in the result
# ----------------------------------------------------------------------

wts.DT.1 <- getIPWeights(OData = OData.1, useonly_t_TRT="PATH1==1",rule_name ="Only TRT1")
wts.DT.1

wts.DT.2 <- getIPWeights(OData = OData.1, useonly_t_TRT="PATH2==1", rule_name = "Only TRT2")
wts.DT.2

Oleg Sofrygin · Answer 1 · Sun Dec 10 2017 07:25:31 GMT+0800 (China Standard Time)

Hi @Soudi00, the following code fixes almost all of the problems in your code. The exception is GBMs with h2o, which still do not work; I think that is a bug in sl3, and I'll try to look into it later on. Cheers, I am closing the issue now. If this code doesn't run for you, feel free to re-open the issue and provide specific instructions on how to replicate the error.

IPW_for_Categorical_Exposure_with_4_Levels.docx

Oleg Sofrygin · Answer 2 · Sun Dec 10 2017 07:38:08 GMT+0800 (China Standard Time)

Just saw the rest of your post.

Your alternative strategy of defining the treatment node with a set of binary dummies is also perfectly legitimate. In fact, it seems you were able to fit the propensity scores for the treatment in that case. When you use a categorical treatment, this is exactly what happens under the hood: the categorical variable is automatically factorized into binary dummies and a separate logistic regression model is fit to each dummy.
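To make the "factorized into binary dummies" idea concrete, here is a minimal sketch on made-up data. The table toy and the column TRT.cat are hypothetical, and this is only an illustration of one-hot encoding a 4-level treatment, not stremr's internal code.

```r
library(data.table)

# Hypothetical toy data: one categorical treatment with levels 1-4
toy <- data.table(ID = 1:6, TRT.cat = c(1L, 2L, 3L, 4L, 2L, 1L))

# One binary dummy per treatment level (TRT1, ..., TRT4)
for (k in 1:4) {
  set(toy, j = paste0("TRT", k), value = as.integer(toy$TRT.cat == k))
}

# Sanity check: each row should have exactly one dummy equal to 1
toy[, n.active := TRT1 + TRT2 + TRT3 + TRT4]
print(toy)
```

Each dummy could then serve as the outcome of its own binary regression, which is the general idea described above.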

The intervention treatment nodes (columns) always have to be of the same dimensionality as the treatment itself. However, I am not sure why your second example is failing; it appears to be a real bug. I am looking into it now.

Oleg Sofrygin · Answer 3 · Sun Dec 10 2017 08:02:53 GMT+0800 (China Standard Time)

OK,

The other issue identified by you was indeed a bug in stremr. Of the 4 interventions provided above only one makes sense and only one is allowed by stremr. That is,

wts.DT.1 <- getIPWeights(OData = OData.1, intervened_TRT=c("TRT1.set","zero.set1","zero.set2","zero.set3"))

This means that if a treatment node is defined by four variables (four columns), then every intervention on that treatment MUST be defined by a 4-dimensional intervention node (i.e., by four columns of counterfactual values).
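The dimensionality requirement can be checked before calling getIPWeights. The helper below, check_intervention_dim, is a hypothetical function written for this illustration (it is not part of stremr); it only verifies that there is one counterfactual column per treatment column and that all of the named columns exist in the data.

```r
# Hypothetical helper (not part of stremr): verify that an intervention has
# one counterfactual column per treatment column, and that all columns exist.
check_intervention_dim <- function(data, TRT, intervened_TRT) {
  if (length(intervened_TRT) != length(TRT)) {
    stop("intervened_TRT has ", length(intervened_TRT),
         " column(s) but TRT has ", length(TRT))
  }
  missing_cols <- setdiff(c(TRT, intervened_TRT), names(data))
  if (length(missing_cols) > 0) {
    stop("columns not found in data: ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Made-up one-row data set containing the column names used in this thread
toy <- data.frame(TRT1 = 1L, TRT2 = 0L, TRT3 = 0L, TRT4 = 0L,
                  TRT1.set = 1L, zero.set1 = 0L, zero.set2 = 0L, zero.set3 = 0L,
                  PATH1 = 1L)
TRT <- c("TRT1", "TRT2", "TRT3", "TRT4")

# Four counterfactual columns for four treatment columns: passes
ok <- check_intervention_dim(
  toy, TRT, c("TRT1.set", "zero.set1", "zero.set2", "zero.set3"))

# A single column for a four-column treatment: fails with an error message
bad <- tryCatch(check_intervention_dim(toy, TRT, "PATH1"),
                error = function(e) conditionMessage(e))
print(bad)
```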

The code above was supposed to work, but it failed because of a bug. This is fixed now; please re-install the updated stremr from the master branch. The above code worked for me.

Alternatively, you could define your treatment with a single categorical variable (as in your previously attempted approach described in the .doc file). In that case your intervention node (counterfactual) would also have to be a single column that assigns the counterfactual treatment value (1-4).
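A minimal sketch of the single-categorical alternative, on made-up data: the four dummies are collapsed into one column coded 1-4, and the counterfactual is one constant column. The names toy, TRT.cat and TRT.cat.set1 are hypothetical, and the sketch assumes the dummies are mutually exclusive.

```r
library(data.table)

# Hypothetical toy data with four mutually exclusive treatment dummies
toy <- data.table(
  ID   = c(1, 1, 2, 2),
  SEQ  = c(1, 2, 1, 2),
  TRT1 = c(1L, 1L, 0L, 0L),
  TRT2 = c(0L, 0L, 1L, 0L),
  TRT3 = c(0L, 0L, 0L, 1L),
  TRT4 = c(0L, 0L, 0L, 0L)
)

# Collapse the dummies into one categorical treatment coded 1-4
# (valid only if exactly one dummy equals 1 in every row)
toy[, TRT.cat := 1L * TRT1 + 2L * TRT2 + 3L * TRT3 + 4L * TRT4]

# Single counterfactual column for "always treatment 1"
toy[, TRT.cat.set1 := 1L]
print(toy)
```

Following the description above, the treatment would then presumably be declared as a single column (TRT = "TRT.cat") and the intervention as a single column (intervened_TRT = "TRT.cat.set1"); those calls have not been run here.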

Oleg Sofrygin · Answer 4 · Sun Dec 10 2017 08:08:57 GMT+0800 (China Standard Time)

Finally, if you would like to look at your weights based on the propensity scores, please look at the dataset returned by getIPWeights:

> wts.DT.1
      ID SEQ STATUS      g0.A      g0.C g0.N     g0.CAN gstar.C gstar.A gstar.N gstar.CAN wt.by.t cum.IPAW
  1:   1   1      0 0.4054166 0.9674897    1 0.39223636       1       0       1         0       0        0
  2:   1   2      0 0.3585701 0.9590887    1 0.34390048       1       0       1         0       0        0
  3:   1   3      0 0.4708409 0.9764564    1 0.45975562       1       0       1         0       0        0
  4:   1   4      1 0.3425051 0.9557057    1 0.32733405       1       0       1         0       0        0
  5:   2   1      0 0.6816217 0.7165851    1 0.48843991       1       0       1         0       0        0
 ---                                                                                                      
302: 100   1      0 0.2799980 0.8564319    1 0.23979923       1       0       1         0       0        0
303: 100   2      0 0.2955584 0.8757783    1 0.25884365       1       0       1         0       0        0
304: 100   3      0 0.5260467 0.8543336    1 0.44941938       1       0       1         0       0        0
305: 100   4      0 0.4804644 0.8784357    1 0.42205709       1       0       1         0       0        0
306: 100   5      0 0.5419484 0.1156405    1 0.06267119       0       0       1         0       0        0
     N.follow.rule cum.stab.P                           rule.name
  1:             8       0.08 TRT1.setzero.set1zero.set2zero.set3
  2:             9       0.09 TRT1.setzero.set1zero.set2zero.set3
  3:             4       0.04 TRT1.setzero.set1zero.set2zero.set3
  4:             0       0.00 TRT1.setzero.set1zero.set2zero.set3
  5:             8       0.08 TRT1.setzero.set1zero.set2zero.set3
 ---                                                             
302:             8       0.08 TRT1.setzero.set1zero.set2zero.set3
303:             9       0.09 TRT1.setzero.set1zero.set2zero.set3
304:             4       0.04 TRT1.setzero.set1zero.set2zero.set3
305:             0       0.00 TRT1.setzero.set1zero.set2zero.set3
306:             0       0.00 TRT1.setzero.set1zero.set2zero.set3

The variable cum.IPAW is the cumulative propensity-score weight. It is 0 for almost all observations, which tells me that almost no one is following your defined rule of interest. That is, there are almost no observations / subjects in your data whose observed 4 treatment node values equal your counterfactual 4 node values (the values in the intervention columns given by "TRT1.set","zero.set1","zero.set2","zero.set3"). This suggests that the interventions you are considering are most likely ill-defined or incorrectly defined.
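One way to see how many subjects follow a rule, independently of the weights, is to compare the observed treatment columns with the counterfactual columns directly. The sketch below does this on made-up data; the column names match.t and follow are hypothetical, and the data are not from the original post.

```r
library(data.table)

# Hypothetical toy data: observed treatment dummies for three subjects
toy <- data.table(
  ID   = c(1, 1, 2, 2, 3, 3),
  SEQ  = c(1, 2, 1, 2, 1, 2),
  TRT1 = c(1L, 1L, 1L, 0L, 0L, 0L),
  TRT2 = c(0L, 0L, 0L, 1L, 0L, 0L),
  TRT3 = c(0L, 0L, 0L, 0L, 1L, 1L),
  TRT4 = c(0L, 0L, 0L, 0L, 0L, 0L)
)

# Counterfactual columns for "always on TRT1", as in the question
toy[, c("TRT1.set", "zero.set1", "zero.set2", "zero.set3") := list(1L, 0L, 0L, 0L)]

# A row matches when all four observed values equal the counterfactual values
toy[, match.t := as.integer(TRT1 == TRT1.set & TRT2 == zero.set1 &
                            TRT3 == zero.set2 & TRT4 == zero.set3)]

# A subject follows the rule up to time t only if every row so far matches
toy[, follow := cumprod(match.t), by = ID]

# Number of subjects still following the rule at each time point
n.followers <- toy[, sum(follow), by = SEQ]
print(n.followers)
```

On the real weights, the share of non-zero entries can be inspected with plain R, for example mean(wts.DT.1[["cum.IPAW"]] > 0).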

wts.DT.1[["cum.IPAW"]]
  [1]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
 [10]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
 [19]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
 [28]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
 [37]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
 [46]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   3.721010   4.888417   0.000000
 [55]   5.504749  27.974948   2.038891   4.114123   8.337936   5.430169  27.265265   4.255394   0.000000
 [64]   3.629561  13.035811   3.154115   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
 [73]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
 [82]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
 [91]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[100]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[109]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[118]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[127]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[136]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[145]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[154]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[163]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[172]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[181]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[190]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[199]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[208]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[217]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[226]   0.000000   0.000000   0.000000   0.000000   0.000000   3.238716  10.429098   0.000000   3.271420
[235]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[244]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   3.222978  10.387589   0.000000
[253]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000  11.958944 136.518823   0.000000
[262]   4.132540   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[271]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[280]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[289]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000
[298]   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000