Make sure xgboost.XGBClassifier and xgboost.XGBRegressor can be persisted
adrinjalali opened this issue · comments
To increase reach, we should make sure the sklearn-compatible xgboost estimators can be persisted.
This issue is to:
- check if we can persist them out of the box
- maybe add a test to the CI to make sure they can be persisted
- if needed, figure out what we'd need to implement on our side, or on the xgb side, to make this happen
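A CI test for this could be a simple round-trip check: serialize the fitted estimator, load it back, and compare predictions. Below is a minimal sketch of that idea, not an existing skops test. The helper name `assert_roundtrip_predictions`, the `MeanRegressor` stand-in, and the JSON-based placeholder serializer are all hypothetical; they are only there so the sketch runs without xgboost installed.

```python
import json


def assert_roundtrip_predictions(estimator, X, dumps, loads):
    """Serialize and restore `estimator`, then check predictions are unchanged."""
    expected = list(estimator.predict(X))
    restored = loads(dumps(estimator))
    actual = list(restored.predict(X))
    assert actual == expected, f"predictions changed: {expected} -> {actual}"
    return restored


class MeanRegressor:
    """Hypothetical stand-in estimator (assumption: not part of any library)."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def placeholder_dumps(est):
    # Placeholder serializer: stores the fitted attributes as JSON bytes.
    return json.dumps(vars(est)).encode("utf-8")


def placeholder_loads(data):
    # Placeholder loader: rebuilds the stand-in estimator from the JSON bytes.
    est = MeanRegressor()
    vars(est).update(json.loads(data.decode("utf-8")))
    return est


X = [[1, 1], [1, 2], [2, 2], [2, 3]]
y = [6, 8, 9, 11]
model = MeanRegressor().fit(X, y)
restored = assert_roundtrip_predictions(model, X, placeholder_dumps, placeholder_loads)
```

In an actual test, the placeholder functions would be swapped for `io.dumps` and a small wrapper around `io.loads(..., trusted=[...])` from skops, and the stand-in for `XGBClassifier` / `XGBRegressor`. For float predictions from real estimators, a tolerance-based comparison may be more appropriate than exact equality.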
Okay, I did a quick check with the 3 big ones:
- XGBoost fails.
- Catboost fails.
- LightGBM: actually works for classifier and regressor!
Obviously, these estimators have so many options that a quick test can't cover all of them, so results may differ for other configurations.
Here is the code I used for testing LightGBM. Similar code was used for the others.
# python -m pip install lightgbm
import numpy as np
from lightgbm.sklearn import LGBMClassifier, LGBMRegressor
from skops import io

X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3

# Regressor: dump to bytes, load back with the types it needs trusted, then predict.
regr = LGBMRegressor().fit(X, y)
lr = io.loads(io.dumps(regr), trusted=[
    'collections.defaultdict', 'lightgbm.basic.Booster', 'lightgbm.sklearn.LGBMRegressor'])
lr.predict(X)

# Classifier: same round trip; it needs two additional trusted types.
clf = LGBMClassifier().fit(X, y)
lc = io.loads(io.dumps(clf), trusted=[
    'collections.defaultdict', 'lightgbm.basic.Booster', 'lightgbm.sklearn.LGBMClassifier',
    'numpy.int64', 'sklearn.preprocessing._label.LabelEncoder'])
lc.predict_proba(X)