Warning when using block() in rd_design(): damod_mm longer than w

Question

kirkvanacore opened this issue a year ago · comments

@adamSales and I received a warning when requesting the summary of an lmitt object produced with a blocked RD design. The circumstances are detailed below:

When running this code:

des <- rd_design(Z ~ forcing(R) + unitid(id) + block(problem_id), data = ad[ad$R > -1 & ad$R < 11, ])
m1_bw2 <- glm(Y ~ R + Z, data = ad[ad$R > -1 & ad$R < 11, ], family = binomial)
res_BW2_1 <- lmitt(Y ~ 1, design = des, offset = cov_adj(m1_bw2), weights = "ate", data = ad[ad$R > -1 & ad$R < 11, ])
summary(res_BW2_1)

...we receive this warning along with the summary:

Warning message:
In damod_mm[msk, , drop = FALSE] * w :
longer object length is not a multiple of shorter object length

The warning does not occur without block(problem_id).
We suspect that this warning refers to a misalignment between the number of rows of damod_mm and the length of the weights.
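
For readers unfamiliar with this message, here is a minimal base-R sketch of how it can arise. The matrix `mm`, the weight vectors, and the helper `catch_warning()` are hypothetical stand-ins, not the package's internals. R recycles the shorter operand and warns only when the lengths are not multiples of each other.

```r
# Hypothetical stand-ins: a 5 x 2 "model matrix" and two weight vectors.
mm <- matrix(1, nrow = 5, ncol = 2)
w_full <- c(1, 2, 3, 4, 5)  # one weight per row
w_short <- w_full[-5]       # one row's weight has gone missing

# Return the warning message raised by an expression, or NA if there is none.
catch_warning <- function(expr) {
  tryCatch({
    expr
    NA_character_
  }, warning = function(cond) conditionMessage(cond))
}

msg_full <- catch_warning(mm * w_full)    # lengths align: no warning
msg_short <- catch_warning(mm * w_short)  # 10 elements vs. 4 weights: warning
print(msg_full)
print(msg_short)
```

Note that recycling is silent whenever the longer length happens to be a multiple of the shorter one, so the absence of this warning does not by itself prove the rows and weights are aligned.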

Josh Errickson · Answer 1 · Wed Jun 28 2023 04:09:59 GMT+0800 (China Standard Time)

I suspect @jwasserman2 may be the better person to help debug this, but do you have missing data? This has been something that's come up occasionally, especially with Adam's real data, that we didn't account for properly.

Kirk Vanacore · Answer 2 · Wed Jun 28 2023 04:16:48 GMT+0800 (China Standard Time)

There are no missing data for the variables used in this example.

Adam Sales · Answer 3 · Wed Jun 28 2023 05:03:06 GMT+0800 (China Standard Time)

There may be some blocks with 0 weights, though.

Ben · Answer 4 · Wed Jun 28 2023 05:52:38 GMT+0800 (China Standard Time)

It would be great if you could share the data. Can you, @kirkvanacore? Or, better yet, a stripped-down, privacy-preserving, data-usage-agreement-compliant version of it that still manifests the warning?

Kirk Vanacore · Answer 5 · Wed Jun 28 2023 06:19:54 GMT+0800 (China Standard Time)

Here is a synthetic data set that produces the warning:

synth_dat_issue131.csv

jwasserman2 · Answer 6 · Wed Jun 28 2023 07:30:51 GMT+0800 (China Standard Time)

Thanks for posting the data, @kirkvanacore. ate() is returning NAs, causing rows to be dropped. In .get_a21(), this results in w, which is x$weights, being shorter than damod_mm, which was created by passing na.pass to model.frame():

> nrow(ad[ad$R > -1 & ad$R < 11, ])
[1] 12283
> nrow(model.frame(res_BW2_1))
[1] 11930
> sum(is.na(ate(des, data = ad[ad$R > -1 & ad$R < 11, ])))
[1] 353
> 12283 - 11930
[1] 353
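
A rough base-R analogue of this mechanism, using `lm()` rather than `lmitt()` and a made-up five-row data frame `d`, may help show why the two objects end up with different row counts. It is only a sketch under those assumptions, not the package's code path.

```r
# Hypothetical data: two of the five observations have NA weights.
d <- data.frame(
  y = c(1, 2, 3, 4, 5),
  x = c(0, 1, 1, 0, 1),
  w = c(1, NA, 1, NA, 1)
)

# With the default na.action (na.omit), rows with NA weights are dropped
# from the fitted model's frame and from its stored weights.
fit <- lm(y ~ x, data = d, weights = w)
n_fit <- nrow(model.frame(fit))
n_weights <- length(fit$weights)

# A model matrix built with na.pass keeps every row of the data.
mf_all <- model.frame(y ~ x, data = d, na.action = na.pass)
n_all <- nrow(model.matrix(y ~ x, mf_all))

# The gap equals the number of NA weights, mirroring 12283 - 11930 = 353.
print(c(n_all = n_all, n_fit = n_fit, n_na = sum(is.na(d$w))))
```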

Ben · Answer 7 · Wed Jun 28 2023 11:24:03 GMT+0800 (China Standard Time)

Thanks, all. I wonder what characterizes the blocks with NA weights?

Josh Errickson · Answer 8 · Wed Jun 28 2023 21:28:02 GMT+0800 (China Standard Time)

Block IDs are numeric and run up to 252, but only 226 blocks exist; for example, there is no block 41 or 42. When we expand e_z (the block-level ratio of #treated/total num) to the observation level, we use e_z[blocks(design)[, 1]] (R/weights.R#L132). However, if for example we're looking at block 245, e_z[245] returns the 245th element of a 226-length vector, which is NA. What we want is e_z["245"], to return the named entry in the vector.

Solution could be as easy as e_z[as.character(blocks(design)[,1])], but I don't have time right now to test it. I'll try and get to it this afternoon if no one else does.
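
To illustrate the difference between positional and name-based lookup, here is a small sketch with made-up values. The vectors `e_z` and `block_ids` below are toy stand-ins for the package's objects, and the block IDs are assumptions chosen to leave gaps.

```r
# Toy block-level treated proportions, named by block ID (note the gaps).
e_z <- c("1" = 0.5, "3" = 0.4, "245" = 0.6)

# Observation-level block IDs stored as numbers.
block_ids <- c(1, 3, 3, 245)

# Numeric indexing is positional: 3 picks the third element (block "245"),
# and 245 is past the end of the vector, so it yields NA.
by_position <- e_z[block_ids]

# Character indexing looks entries up by name, which is what is wanted here.
by_name <- e_z[as.character(block_ids)]

print(unname(by_position))
print(unname(by_name))
```

In this toy case positional indexing gives NA for block 245 and, for block 3, the value that belongs to block 245, so gaps in the IDs can produce wrong values as well as missing ones.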

Josh Errickson · Answer 9 · Thu Jun 29 2023 02:13:02 GMT+0800 (China Standard Time)

Fix was as easy as expected. @kirkvanacore I no longer get the warning with the synthetic data; please test with your real data and let me know.

Ben · Answer 10 · Thu Jun 29 2023 02:35:40 GMT+0800 (China Standard Time)

When you do get to testing this w/ the real data, @kirkvanacore, please also check whether the lmitt(<...>, absorb=T) problem has been fixed as well. Josh E suspects that 5ffed0d4 will have taken care of it.

Ben · Answer 11 · Thu Jul 27 2023 17:41:23 GMT+0800 (China Standard Time)

Hi @kirkvanacore, could you check against the real data and verify that you no longer get a warning (or other sign of trouble)? If you don't, this issue can be closed.

Kirk Vanacore · Answer 12 · Thu Aug 03 2023 03:46:02 GMT+0800 (China Standard Time)

@benthestatistician @josherrickson My apologies for the delay. I no longer receive the error when running lmitt(<...>, absorb=T) against the real data.

Ben · Answer 13 · Thu Aug 03 2023 19:08:46 GMT+0800 (China Standard Time)

Thanks Kirk!