yubin-park / palobst

PaloBoost is an overfitting-robust Gradient Boosting algorithm.

PaloBst

PaloBst is an overfitting-robust Gradient Boosting Decision Tree algorithm. The algorithm is described in "Tackling Overfitting in Boosting for Noisy Healthcare Data" (IEEE TKDE).

To use the package, you need to install numba >= 0.46.

To install the package, clone the repository, and run:

$ cd palobst
$ python setup.py develop
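
Before running the examples, it can help to confirm that the numba requirement above is met. The following is a minimal sketch using only the Python standard library; the helper names (version_tuple, meets_minimum, numba_status) are made up for this illustration and are not part of palobst.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '0.46.0' into (0, 46, 0); parsing stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:  # e.g. '0rc1': keep the 0, ignore the suffix
            break
    return tuple(parts)


def meets_minimum(installed, minimum="0.46"):
    """Return True if the installed version is at least the minimum version."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so (0, 46) compares equal to (0, 46, 0)
    b += (0,) * (width - len(b))
    return a >= b


def numba_status(minimum="0.46"):
    """Describe whether a suitable numba is installed, without importing it."""
    try:
        installed = metadata.version("numba")
    except metadata.PackageNotFoundError:
        return "numba is not installed"
    verdict = "ok" if meets_minimum(installed, minimum) else "too old"
    return f"numba {installed}: {verdict} (need >= {minimum})"


print(numba_status())
```

The check reads the installed package metadata rather than importing numba, so it is cheap to run and does not fail if numba is missing.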

Regression Example

from sklearn.datasets import make_friedman1
from sklearn.model_selection import train_test_split

X, y = make_friedman1(100)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

from palobst import PaloBst

model = PaloBst(distribution="gaussian")
model.warmup()  # triggers JIT compilation before the first fit

model.fit(X_train, y_train)

y_pred = model.predict(X_test)

# feature importances
print(model.feature_importances_)

Please see tests/test_regression.py for more details.
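
The regression example stops at y_pred without scoring it. As a rough sketch of one way to judge the predictions, the helpers below compute the root mean squared error and compare it against a baseline that always predicts the mean. The function names are hypothetical, the toy numbers are made up, and nothing here is part of palobst; in practice y_test and y_pred from the example would take the place of the demo lists.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def baseline_rmse(y_true):
    """RMSE of a trivial model that always predicts the mean of y_true."""
    mean = sum(y_true) / len(y_true)
    return rmse(y_true, [mean] * len(y_true))


# Toy values standing in for y_test and model.predict(X_test).
y_true_demo = [3.0, 5.0, 7.0]
y_pred_demo = [2.0, 5.0, 9.0]

print("model RMSE:   ", rmse(y_true_demo, y_pred_demo))
print("baseline RMSE:", baseline_rmse(y_true_demo))
```

A model RMSE well below the baseline suggests the fit has learned something beyond the target mean; with only 20 held-out samples, as in the example above, the estimate is noisy.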

Classification Example

from sklearn.datasets import make_hastie_10_2
from sklearn.model_selection import train_test_split

X, y = make_hastie_10_2(100)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

from palobst import PaloBst

model = PaloBst(distribution="bernoulli")
model.warmup()  # triggers JIT compilation before the first fit

model.fit(X_train, y_train)

# predicted probabilities on the held-out split (second column of predict_proba)
y_pred = model.predict_proba(X_test)[:, 1]

# feature importances
print(model.feature_importances_)

Please see tests/test_classification.py for more details.
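
The predicted probabilities can be scored with log loss and thresholded accuracy. The sketch below uses only the standard library; the helper names and demo values are made up for illustration. Note that make_hastie_10_2 returns labels in {-1, 1}, so the helpers map them to {0, 1} before scoring. Whether PaloBst itself expects 0/1 labels is not stated here, so check tests/test_classification.py before relying on that.

```python
import math


def to_binary(labels):
    """Map labels in {-1, 1} or {0, 1} to {0, 1}."""
    return [1 if label > 0 else 0 for label in labels]


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of the labels under the predicted probabilities."""
    y = to_binary(y_true)
    total = 0.0
    for yi, pi in zip(y, p_pred):
        pi = min(max(pi, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return total / len(y)


def accuracy(y_true, p_pred, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    y = to_binary(y_true)
    hits = sum(1 for yi, pi in zip(y, p_pred) if (pi >= threshold) == (yi == 1))
    return hits / len(y)


# Toy values standing in for y_test and model.predict_proba(X_test)[:, 1].
labels_demo = [-1, 1, 1, -1]
probs_demo = [0.1, 0.9, 0.4, 0.2]

print("log loss:", log_loss(labels_demo, probs_demo))
print("accuracy:", accuracy(labels_demo, probs_demo))
```

Log loss rewards well-calibrated probabilities, while accuracy only looks at which side of the threshold each prediction falls on, so the two can disagree on which of two models is better.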

Reference
