Question: How can I display the standard error from regression trees?
leeh356 opened this issue
Hi, I have a decision tree created with the causalTree package and need help displaying the standard error in each of the tree's nodes. I tried print(decision_tree) and summary(decision_tree). The print function shows node), split, n, deviance and yvalue, but not the standard error. The summary shows the cross-validation error and xstd. Is there a function that displays the standard deviation in the decision tree's nodes? I'd appreciate your advice, thanks!
In the test directory, the test_causalTree file has sample code at the end that does this. Here is a snippet:
# easy trick to get coefficients and standard errors
dataTrain$leaves <- predict(tree_dishonest_CT_prune, newdata=dataTrain, type = 'vector')
dataEst$leaves <- predict(tree_dishonest_CT_prune, newdata=dataEst, type = 'vector')
dataTest$leaves <- predict(tree_dishonest_CT_prune, newdata=dataTest, type = 'vector')
dataTrain$leavesf <- factor(round(dataTrain$leaves,4))
dataEst$leavesf <- factor(round(dataEst$leaves,4))
dataTest$leavesf <- factor(round(dataTest$leaves,4))
# run regressions with indicators for the leaves interacted with the treatment indicator
if (length(levels(dataTrain$leavesf)) == 1){
  # only one leaf: a single overall treatment effect
  modelTrain <- lm(y~w, data=dataTrain)
  modelEst <- lm(y~w, data=dataEst)
  modelTest <- lm(y~w, data=dataTest)
  print(summary(modelTrain))
  print(summary(modelEst))
  print(summary(modelTest))
} else{
  # one intercept and one treatment coefficient per leaf
  modelTrain <- lm(y~-1+leavesf+leavesf*w-w, data=dataTrain)
  modelEst <- lm(y~-1+leavesf+leavesf*w-w, data=dataEst)
  modelTest <- lm(y~-1+leavesf+leavesf*w-w, data=dataTest)
  print("Leaf names match estimated treatment effects on training set")
  print(summary(modelTrain))
  print("Estimated treatment effects on estimation set typically more moderate than training set")
  print(summary(modelEst))
  print("Estimated treatment effects on test set typically more moderate than training set")
  print(summary(modelTest))
}
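If you only want the per-leaf treatment effects and their standard errors rather than the full summary printout, one option is to pull them out of the coefficient table. The following is a minimal sketch, not part of the package's test file: it uses simulated data (the data frame dat and the leaf values 0.5 and 2.0 are made up for illustration) in place of the predict() output, and fits the same regression specification as the snippet above.

```r
# Simulated stand-in data (assumption): in the thread, `leaves` comes from
# predict(tree, newdata = ..., type = 'vector') on the fitted causal tree.
set.seed(1)
n <- 400
dat <- data.frame(
  w      = rbinom(n, 1, 0.5),                      # treatment indicator
  leaves = sample(c(0.5, 2.0), n, replace = TRUE)  # hypothetical leaf-level predictions
)
dat$leavesf <- factor(round(dat$leaves, 4))

# Outcome whose true treatment effect equals the leaf value, plus noise
dat$y <- 1 + dat$leaves * dat$w + rnorm(n)

# Same specification as above: one intercept and one treatment slope per leaf
model <- lm(y ~ -1 + leavesf + leavesf * w - w, data = dat)

# coef(summary(model)) is a matrix with columns
# "Estimate", "Std. Error", "t value", "Pr(>|t|)"
coefs <- coef(summary(model))

# Keep only the leaf-by-treatment rows (names end in ":w")
effects <- coefs[grepl(":w$", rownames(coefs)),
                 c("Estimate", "Std. Error"), drop = FALSE]
print(effects)
```

Each row of effects corresponds to one leaf: the estimate is the treatment effect in that leaf and the second column is its standard error. With real data you would replace dat with dataEst or dataTest after adding the leavesf column as shown earlier.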