Question: Would like to check how to display standard error from regression trees?

leeh356 opened this issue 7 years ago

Hi, I have a decision tree created with the causalTree package and need help displaying the standard error in each of its nodes. I tried print(decision_tree) and summary(decision_tree). The print function shows node), split, n, deviance and yvalue, but not the standard error. The summary shows the cross-validation error and xstd. Is there a function that displays the standard error in each node of the tree? I'd appreciate your advice, thanks!

susanathey · Answer 1 · Thu Jan 19 2017 15:36:44 GMT+0800 (China Standard Time)

In the test directory, the test_causalTree file has sample code at the end that does this. Here is a snippet:

# easy trick to get coefficients and standard errors
# predict() returns the leaf-level estimate for each row, which doubles as a leaf label
dataTrain$leaves <- predict(tree_dishonest_CT_prune, newdata=dataTrain, type = 'vector')
dataEst$leaves <- predict(tree_dishonest_CT_prune, newdata=dataEst, type = 'vector')
dataTest$leaves <- predict(tree_dishonest_CT_prune, newdata=dataTest, type = 'vector')

# round so that rows in the same leaf share one factor level
dataTrain$leavesf <- factor(round(dataTrain$leaves,4))
dataEst$leavesf <- factor(round(dataEst$leaves,4))
dataTest$leavesf <- factor(round(dataTest$leaves,4))

# run regressions with indicators for the leaves interacted with the treatment indicator
if (length(levels(dataTrain$leavesf)) == 1){
  # only one leaf: a plain regression of y on the treatment indicator w
  modelTrain <- lm(y~w, data=dataTrain)
  modelEst <- lm(y~w, data=dataEst)
  modelTest <- lm(y~w, data=dataTest)
  # print() is needed here: inside a block only the last value is auto-printed
  print(summary(modelTrain))
  print(summary(modelEst))
  print(summary(modelTest))
} else{
  # one intercept and one treatment coefficient per leaf
  modelTrain <- lm(y~-1+leavesf+leavesf*w-w, data=dataTrain)
  modelEst <- lm(y~-1+leavesf+leavesf*w-w, data=dataEst)
  modelTest <- lm(y~-1+leavesf+leavesf*w-w, data=dataTest)
  print("Leaf names match estimated treatment effects on training set")
  print(summary(modelTrain))
  print("Estimated treatment effects on estimation set typically more moderate than training set")
  print(summary(modelEst))
  print("Estimated treatment effects on test set typically more moderate than training set")
  print(summary(modelTest))
}
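
If you want the per-leaf standard errors as a table rather than reading them off the printed summary, something along these lines may help. It is only a sketch: the data below is simulated, the leaf labels are made up instead of coming from a causalTree fit, and the names true_effect, effect_rows and leaf_effects are hypothetical. Only base R calls are used (lm, summary, coef).

```r
# Simulated stand-in for dataEst (assumption: three leaves with made-up labels).
set.seed(1)
n <- 600
dataEst <- data.frame(
  leavesf = factor(sample(c("0.5", "1.2", "2"), n, replace = TRUE)),
  w = rbinom(n, 1, 0.5)  # treatment indicator
)
# Hypothetical true treatment effect in each leaf, used only to simulate y.
true_effect <- c("0.5" = 0.5, "1.2" = 1.2, "2" = 2.0)
dataEst$y <- 1 + unname(true_effect[as.character(dataEst$leavesf)]) * dataEst$w + rnorm(n)

# Same regression as in the snippet: leaf indicators interacted with treatment.
modelEst <- lm(y ~ -1 + leavesf + leavesf * w - w, data = dataEst)

# coef(summary(...)) is a matrix with Estimate, Std. Error, t value, Pr(>|t|).
coefs <- coef(summary(modelEst))

# Rows named "leavesf<label>:w" hold the per-leaf treatment effects.
effect_rows <- grepl(":w$", rownames(coefs))
leaf_effects <- data.frame(
  leaf = sub("^leavesf", "", sub(":w$", "", rownames(coefs)[effect_rows])),
  estimate = coefs[effect_rows, "Estimate"],
  std_error = coefs[effect_rows, "Std. Error"],
  row.names = NULL
)
print(leaf_effects)
```

With a real tree you would skip the simulation and pass modelEst (or modelTest) from the snippet above straight to coef(summary(...)). The standard errors here are the ordinary lm ones, so they carry the usual homoskedasticity assumption.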