Ibotta / sk-dist

Distributed scikit-learn meta-estimators in PySpark


cannot get answer with LGBMClassifier

jiahengqi opened this issue · comments

I changed the estimator from xgb to lgb (LGBMClassifier), but the distributed grid search never returns.

With GridSearchCV, the fit takes about 1 second:

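from lightgbm import LGBMClassifier
from sklearn.model_selection import GridSearchCV
from tqdm import trange

# X and y are the training features and labels, defined earlier
grid = dict(num_leaves=[8, 15, 31],
            n_estimators=[100, 200, 300])
for _ in trange(1):
    model_lgb = GridSearchCV(
        LGBMClassifier(),
        grid, n_jobs=4, cv=3
    )
    model_lgb.fit(X, y)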

With DistGridSearchCV, the same search has not returned after 10 minutes:

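from lightgbm import LGBMClassifier
from skdist.distribute.search import DistGridSearchCV
from tqdm import trange

# X and y are the training data; sc is an existing SparkContext
grid = dict(num_leaves=[8, 15, 31],
            n_estimators=[100, 200, 300],
            n_jobs=[1])  # grid values must be lists, so 1 becomes [1]
for _ in trange(1):
    model_lgb = DistGridSearchCV(
        LGBMClassifier(),
        grid, sc, cv=3, n_jobs=1
    )
    model_lgb.fit(X, y)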

Do you have LightGBM installed on all of the nodes of the cluster, including the required bindings (https://pypi.org/project/glibc/)? All of this needs to be installed on every node using a node bootstrap.
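One rough way to check whether LightGBM is importable on the worker nodes is to run a small import probe as a Spark job. The sketch below is only an illustration: `probe_import` and `probe_cluster` are hypothetical helper names, not part of sk-dist, and `sc` is assumed to be an existing SparkContext. Spark does not guarantee that every executor receives a task, so the number of tasks is set well above the expected executor count.

```python
import importlib
import socket


def probe_import(module_name):
    """Try to import module_name; return (hostname, success flag, detail)."""
    host = socket.gethostname()
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # e.g. ImportError, or OSError from a missing shared library
        return (host, False, repr(exc))
    return (host, True, getattr(module, "__version__", "unknown"))


def probe_cluster(sc, module_name="lightgbm", num_tasks=32):
    """Run probe_import in num_tasks Spark tasks and return the distinct results.

    Hypothetical helper: sc is assumed to be a live SparkContext.
    """
    rdd = sc.parallelize(range(num_tasks), num_tasks)
    results = rdd.map(lambda _: probe_import(module_name)).collect()
    return sorted(set(results))
```

Calling `probe_cluster(sc)` from the driver returns one tuple per distinct host and outcome. Any entry whose second element is `False` points to a node where the import fails, and the third element carries the error text.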

We've never tested LightGBM with sk-dist. It could work in theory but sk-dist doesn't formally support it.

We've added to the documentation around LightGBM: https://github.com/Ibotta/sk-dist#gradient-boosting