Getting leaf assignments for new data
bquistorff opened this issue
To do split-sample estimation, we fit the tree structure on, say, data1 and then estimate treatment effects by leaf on data2. Is there an easy way to get the leaf assignments for data2? The leaf assignments for the original tree are in tree$where, but those don't seem to be updated by estimate.causalTree (and the $where field from honest.causalTree also appears to be for the tree-fit data). rpart doesn't expose this easily either, but a workaround was noted here:
where2 = rpart:::pred.rpart(tree1, rpart:::rpart.matrix(data2))
Is there some better way to do this?
The easiest thing to do is to use the predict command on the tree you get from estimate.causalTree, and then create a factor variable, like this:
dataTest$leaff <- as.factor(round(predict(tree_honest_prune, newdata = dataTest, type = "vector"), 4))
This is not a perfect workaround: if two leaves have exactly the same estimate, they will be merged into a single factor level. That is unlikely unless the outcome is binary.
We can work on updating the leaf assignments in tree$where.
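In the meantime, one way to avoid the tied-estimate problem is to have predict return node numbers instead of estimates. The following is a minimal sketch, not a tested causalTree recipe: it uses a plain rpart regression tree on the built-in mtcars data as a stand-in for data1 and data2, and leaf_ids is a hypothetical helper name. It assumes that predict on a causalTree object reads its values from tree$frame$yval the same way predict.rpart does for type = "vector", which should be checked before relying on it.

```r
library(rpart)

# Toy split sample from a built-in dataset (stand-in for data1 / data2).
set.seed(1)
idx   <- sample(nrow(mtcars), 20)
data1 <- mtcars[idx, ]
data2 <- mtcars[-idx, ]

# Fit the tree structure on data1 only.
tree1 <- rpart(mpg ~ wt + hp + disp, data = data1,
               control = rpart.control(minsplit = 5, cp = 0.01))

# Hypothetical helper: return the node number of the leaf each row falls into.
# Overwriting yval with the node numbers makes predict() report the node
# instead of the fitted value; only the local copy of the tree is modified.
leaf_ids <- function(tree, newdata) {
  tree$frame$yval <- as.numeric(rownames(tree$frame))
  predict(tree, newdata = newdata, type = "vector")
}

# Leaf assignments for the held-out half, as a factor.
data2$leaf <- factor(leaf_ids(tree1, data2))

# Sanity check on the fitting data: the result must agree with tree1$where,
# which stores row indices into tree1$frame rather than node numbers.
train_nodes <- as.numeric(rownames(tree1$frame))[tree1$where]
stopifnot(all(leaf_ids(tree1, data1) == train_nodes))
```

Because the factor levels are node numbers, two leaves with identical estimates stay distinct, and the levels can be matched back to the rows of tree1$frame by name.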