haifengl / smile

Statistical Machine Intelligence & Learning Engine

Home Page: https://haifengl.github.io

Tree Representation for Regression Models in Google Earth Engine

thomaslauber opened this issue · comments

I am a researcher who works extensively with Google Earth Engine (GEE), and Smile has made a huge contribution to my work and to the work of many others. Thanks so much for that!

Currently, I am working on a Python package that downloads a tree-based model trained in GEE and fully replicates it in sklearn, which will allow researchers to compute SHAP values locally and better understand the model's behaviour.
I am in contact with Noel Gorelick, the chief software engineer at GEE, and we thought this could be a very valuable contribution.

So far, the package works fine for Random Forests and Decision Trees in classification mode, thanks to the (I think) compressed order representation. Is there a way to include this representation in the Regression Tree as well?
Ideally, the regression models would behave in exactly the same way as the classification models.
This would need to be implemented in v1 of Smile, which seems to be the version GEE currently uses.
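
To illustrate why a text tree representation is enough to replicate a model locally, here is a minimal sketch in Python. It assumes a hypothetical rpart-style layout (`id) split n deviance yval`, with `*` marking a leaf, children of node `k` numbered `2k` and `2k+1`, and the left child always carrying the `<=` condition). The sample text, the feature names `x0`/`x1`, and the helpers `parse_tree` and `predict` are all made up for illustration; the exact format emitted by GEE or Smile may differ.

```python
import re

# Hypothetical rpart-style dump (an assumption, not actual GEE/Smile output):
# "id) split n deviance yval", with a trailing "*" on terminal nodes.
TREE_TEXT = """\
1) root 10 8.0 2.5
 2) x0<=1.5 6 1.5 1.0 *
 3) x0>1.5 4 2.0 4.75
  6) x1<=0.5 2 0.5 4.0 *
  7) x1>0.5 2 0.5 5.5 *
"""

LINE = re.compile(
    r"^\s*(?P<id>\d+)\)\s+(?P<split>\S+)\s+(?P<n>\d+)\s+(?P<dev>\S+)"
    r"\s+(?P<yval>\S+)\s*(?P<leaf>\*)?\s*$"
)
SPLIT = re.compile(r"^(?P<feature>\w+)(?P<op><=|>)(?P<threshold>.+)$")


def parse_tree(text):
    """Parse the text dump into a dict mapping node id -> node info."""
    nodes = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m is None:
            continue  # skip headers or blank lines
        node = {"value": float(m["yval"]), "leaf": m["leaf"] is not None}
        s = SPLIT.match(m["split"])
        if s is not None:  # the root line has no split condition
            node["feature"] = s["feature"]
            node["threshold"] = float(s["threshold"])
        nodes[int(m["id"])] = node
    return nodes


def predict(nodes, sample):
    """Walk from the root (id 1) to a leaf and return its predicted value."""
    node_id = 1
    while not nodes[node_id]["leaf"]:
        left = nodes[2 * node_id]  # assumed to hold the "<=" condition
        if sample[left["feature"]] <= left["threshold"]:
            node_id = 2 * node_id
        else:
            node_id = 2 * node_id + 1
    return nodes[node_id]["value"]


nodes = parse_tree(TREE_TEXT)
print(predict(nodes, {"x0": 2.0, "x1": 1.0}))
```

Once the nodes are recovered like this, mapping them onto another library's tree structure is a separate step; without the text dump for regression trees, there is nothing to parse in the first place.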

I have attached some screenshots that hopefully explain the problem visually:

Output of a Classification Tree in GEE:
image
(one can see the tree representation at the bottom)

Output of a Random Forest in Classification mode:
image
(one can see the tree representation at the bottom for both trees inside the forest)

Output of a Regression Tree in GEE:
image
(one can see there is no tree representation)

Output of a Random Forest in Regression mode:
image
(also here, there is no tree representation)

Thanks. Smile v2+ already supports tree representations in compact text and Graphviz formats for both classification and regression trees.