skops-dev / skops

skops is a Python library helping you share your scikit-learn based models and put them in production

Home Page: https://skops.readthedocs.io/en/stable/

Make sure xgboost.XGBClassifier and xgboost.XGBRegressor can be persisted

adrinjalali opened this issue

To increase reach, we should make sure the sklearn compatible xgboost estimators can be persisted.

This issue is to:

  • check if we can persist them out of the box
  • maybe add a test to the CI to make sure they can be persisted
  • if needed, figure out what we'd need to implement on our side, or on the xgb side, to make this happen

Okay, I did a quick check with the 3 big ones:

  1. XGBoost fails.
  2. Catboost fails.
  3. LightGBM: actually works for classifier and regressor!

Obviously, these estimators have so many options that a quick test can't cover all of them, so other configurations may well behave differently.

Here is the code I used for testing LightGBM. Similar code was used for the others.

# python -m pip install lightgbm
from lightgbm.sklearn import LGBMClassifier, LGBMRegressor
import numpy as np
from skops import io

# Tiny toy dataset; y takes the integer values 6, 8, 9, 11
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3

# Regressor: dump to bytes, load back with an explicit list of trusted types
regr = LGBMRegressor().fit(X, y)
lr = io.loads(io.dumps(regr), trusted=[
    'collections.defaultdict', 'lightgbm.basic.Booster', 'lightgbm.sklearn.LGBMRegressor'])
lr.predict(X)

# Classifier: the same y is reused as four class labels, so two more types
# (numpy.int64 and the LabelEncoder) have to be trusted
clf = LGBMClassifier().fit(X, y)
lc = io.loads(io.dumps(clf), trusted=[
    'collections.defaultdict', 'lightgbm.basic.Booster', 'lightgbm.sklearn.LGBMClassifier',
    'numpy.int64', 'sklearn.preprocessing._label.LabelEncoder'])
lc.predict_proba(X)
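
One way to arrive at `trusted` lists like the ones above is to ask skops which types it found in the dump and compare them with what you expect. The sketch below assumes `skops.io.get_untrusted_types` is available in the installed version; `unexpected_types` and `report_untrusted` are hypothetical helper names used only for illustration.

```python
def unexpected_types(untrusted, allowed):
    """Return the reported type names that are not on the allow-list, sorted."""
    return sorted(set(untrusted) - set(allowed))


def report_untrusted(dumped, allowed=()):
    """List the types in a skops dump that still need to be reviewed.

    `dumped` is the bytes object returned by skops.io.dumps.
    """
    # Imported lazily so unexpected_types stays usable without skops installed
    from skops import io

    return unexpected_types(io.get_untrusted_types(data=dumped), allowed)
```

For example, `report_untrusted(io.dumps(clf), allowed=['collections.defaultdict'])` would return the remaining type names to inspect before adding them to `trusted`.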