stanfordmlgroup / ngboost

Natural Gradient Boosting for Probabilistic Prediction

Error when tuning NGBSurvival with GridSearchCV

rhjohnstone opened this issue

Minimal example:

from ngboost import NGBSurvival
import numpy.random as npr
from sklearn.model_selection import GridSearchCV

X = npr.randn(100, 5)
T = npr.rand(100)
E = npr.randint(2, size=100)

param_grid = {"learning_rate": [0.01, 0.1]}

ngb = NGBSurvival(n_estimators=50)

clf = GridSearchCV(ngb, param_grid=param_grid, cv=3)

clf.fit(X, fit_params={"T": T, "E": E})

(I'm not actually sure about the last line, since a standard sklearn estimator just takes (X, y) when fitting, but the current error occurs before that anyway.)

This raises the error:
RuntimeError: Cannot clone object NGBSurvival(Dist=<class 'ngboost.distns.utils.SurvivalDistnClass.<locals>.SurvivalDistn'>, n_estimators=50, random_state=RandomState(MT19937) at 0x7FA5683B1940), as the constructor either does not set or modifies parameter Dist

It does however work if I use NGBRegressor instead of NGBSurvival (and remove E).
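
For context, the error message comes from the check that sklearn.base.clone runs after rebuilding an estimator from its own parameters: each parameter read back from the new object must be the identical object that was passed to the constructor. The sketch below is a simplified, standard-library stand-in for that check, not scikit-learn's actual code, and every name in it (wrap_dist, Normal, WrappingEstimator, PlainEstimator, simplified_clone) is hypothetical. It assumes, as the error message suggests but the traceback does not prove, that the failing estimator stores a freshly built wrapper class instead of the Dist argument it was given.

```python
import copy


def wrap_dist(dist):
    """Hypothetical factory that builds a brand-new subclass on every call."""
    class Wrapped(dist):
        pass
    return Wrapped


class Normal:
    """Placeholder distribution class, for illustration only."""


class PlainEstimator:
    """Stores its constructor argument unchanged."""

    def __init__(self, dist=Normal):
        self.dist = dist

    def get_params(self):
        return {"dist": self.dist}


class WrappingEstimator:
    """Stores a transformed version of its constructor argument."""

    def __init__(self, dist=Normal):
        self.dist = wrap_dist(dist)  # the stored value is not the value passed in

    def get_params(self):
        return {"dist": self.dist}


def simplified_clone(estimator):
    """Rebuild an estimator from its parameters, then verify them by identity."""
    # Class objects are returned unchanged by deepcopy, so identity is preserved here.
    params = {k: copy.deepcopy(v) for k, v in estimator.get_params().items()}
    new_object = type(estimator)(**params)
    for name, passed in params.items():
        if new_object.get_params()[name] is not passed:
            raise RuntimeError(
                "Cannot clone object, as the constructor either does not "
                f"set or modifies parameter {name}"
            )
    return new_object


simplified_clone(PlainEstimator())  # succeeds

try:
    simplified_clone(WrappingEstimator())
except RuntimeError as err:
    print(err)
```

Under that assumption, an estimator that keeps its argument as-is (the PlainEstimator case) clones cleanly, while one that wraps the argument inside the constructor fails on every clone, which is what GridSearchCV does before each fit.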

Is there a way for me to fix this, or is this a problem with the NGBSurvival class? And if the latter, is it possible to fix?

Hmm, this is a problem with the way NGBSurvival is implemented, specifically with how the abstractions for scores with and without survival data are designed. It's not an easy fix, unfortunately. I welcome suggestions, though. Related: #217
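
Until the class itself is changed, one possible workaround is to skip GridSearchCV and loop over the grid by hand, building a fresh model for every parameter combination and fold so that clone is never called. The following is a minimal sketch of that idea using only the standard library. The helpers kfold_indices and manual_grid_search, and the fit_fn / score_fn callbacks, are hypothetical names introduced here and are not part of ngboost or scikit-learn. The folds are contiguous and unshuffled, and a higher score is assumed to be better.

```python
import itertools


def kfold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def manual_grid_search(fit_fn, score_fn, param_grid, n_samples, n_splits=3):
    """Return (best_params, best_mean_score) over all grid combinations.

    fit_fn(params, train_idx) must build and fit a new model.
    score_fn(model, test_idx) must return a float, higher meaning better.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = []
        for train_idx, test_idx in kfold_indices(n_samples, n_splits):
            model = fit_fn(params, train_idx)  # fresh model, no clone() involved
            scores.append(score_fn(model, test_idx))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score
```

With the data from the minimal example, fit_fn could plausibly be something like lambda p, idx: NGBSurvival(n_estimators=50, **p).fit(X[idx], T[idx], E[idx]), assuming fit accepts (X, T, E) and returns the fitted model. The choice of score_fn (for example a held-out negative log-likelihood with its sign flipped, or a concordance index) is left open here because the thread does not specify one.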