tadhgpearson / shap4j-data-converter

Converts tree ensemble model dumps to shap4j data files

shap4j-data-converter

A data converter for shap4j. It converts tree ensemble models trained with XGBoost, LightGBM, CatBoost, scikit-learn, and PySpark into .shap4j data files.

Getting started

Installation

 $ python setup.py install

Dump a tree model with pickle

XGBoost

This example is borrowed from the shap documentation:

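import pickle

import shap
import xgboost

# Train an XGBoost model on the Boston housing data.
# Note: recent shap releases no longer ship shap.datasets.boston();
# if it is unavailable, substitute another regression dataset.
X, y = shap.datasets.boston()
model = xgboost.train({"learning_rate": 0.01}, xgboost.DMatrix(X, label=y), 100)

# Serialize the trained booster so that shap4jconv can read it
with open("boston.pkl", "wb") as f:
    pickle.dump(model, f)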
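
Before converting, it can be worth confirming that the pickle file loads back cleanly. The sketch below uses only the standard library. `dump_model` and `load_model` are hypothetical helper names for illustration, not part of shap4j-data-converter.

```python
import pickle
from pathlib import Path


def dump_model(model, path):
    """Pickle `model` to `path` and return the path as a Path object."""
    path = Path(path)
    with path.open("wb") as f:
        pickle.dump(model, f)
    return path


def load_model(path):
    """Load a pickled object back, e.g. to confirm the dump is readable."""
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

For example, `load_model(dump_model(model, "boston.pkl"))` should return an object of the same type as `model`. Only unpickle files you trust, since loading a pickle can execute arbitrary code.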

Convert an existing .pkl file with the command-line tool

We can use the shap4jconv program to convert the boston.pkl pickle file generated by the previous example to a .shap4j data file:

 $ shap4jconv boston.pkl

The command above writes the output to boston.shap4j.

Alternatively, you can specify the output file via the --output parameter:

 $ shap4jconv --output model.shap4j --overwrite boston.pkl

where the --overwrite parameter allows shap4jconv to overwrite an existing file.
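
If the conversion has to be triggered from a script, the same command line can be assembled and run with the standard library. This is a minimal sketch: `build_shap4jconv_command` and `run_shap4jconv` are hypothetical helper names, only the two options shown above are used, and it is an assumption that shap4jconv is on the PATH and reports failures through a non-zero exit code.

```python
import subprocess


def build_shap4jconv_command(input_file, output_file=None, overwrite=False):
    """Build the argument list for a shap4jconv call.

    Only the options shown above (--output, --overwrite) are used.
    """
    command = ["shap4jconv"]
    if output_file is not None:
        command += ["--output", str(output_file)]
    if overwrite:
        command.append("--overwrite")
    command.append(str(input_file))
    return command


def run_shap4jconv(input_file, output_file=None, overwrite=False):
    """Run shap4jconv; raises CalledProcessError on a non-zero exit code.

    Assumes shap4jconv is installed and available on the PATH.
    """
    command = build_shap4jconv_command(input_file, output_file, overwrite)
    subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file names contain spaces.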

Convert a .pkl file programmatically

You can also use shap4j-data-converter from your own Python program by importing the Shap4jDataConverter class from the shap4jconv package, for example:

from shap4jconv import Shap4jDataConverter

# Convert the pickled model to a .shap4j data file, replacing any existing output
converter = Shap4jDataConverter()
converter.convert("boston.pkl", output_file="dumped_boston.shap4j", overwrite=True)
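
The programmatic interface also makes it easy to convert several pickled models in one go. The following is a sketch under stated assumptions: `convert_directory` is a hypothetical helper, the `models` directory name is made up, and the converter is called only with the arguments shown in the example above.

```python
from pathlib import Path


def convert_directory(converter, directory, overwrite=False):
    """Convert every .pkl file in `directory` to a sibling .shap4j file.

    `converter` is any object with a
    convert(input, output_file=..., overwrite=...) method,
    such as a Shap4jDataConverter instance.
    Returns the list of output paths, in sorted order.
    """
    outputs = []
    for pkl_path in sorted(Path(directory).glob("*.pkl")):
        out_path = pkl_path.with_suffix(".shap4j")
        converter.convert(str(pkl_path), output_file=str(out_path), overwrite=overwrite)
        outputs.append(out_path)
    return outputs


# Intended usage (requires shap4jconv to be installed):
#   from shap4jconv import Shap4jDataConverter
#   convert_directory(Shap4jDataConverter(), "models", overwrite=True)
```

Taking the converter as an argument keeps the helper independent of the package, so it can be exercised with a stand-in object in tests.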

Use the .shap4j data file in Java

See the Java example of shap4j for how to integrate SHAP into your JVM projects using the data files generated above.

License: MIT License

