metric=auc
qiaohaoforever opened this issue
qiaohaoforever commented
Great job!
Can I set `metric=auc` when I want to use a classifier?
Igor Ivanov commented
Thanks!
You can define any metric you want in the form `def my_metric(y_true, y_pred)`.
If your metric needs class labels in `y_pred`, call the `stacking` function with `needs_proba=False`.
If your metric needs probabilities in `y_pred`, call the `stacking` function with `needs_proba=True`.
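For instance, a label-based metric such as accuracy (a minimal sketch of my own, simply wrapping scikit-learn's `accuracy_score`) would expect class labels and therefore `needs_proba=False`:

```python
from sklearn.metrics import accuracy_score

# Label-based metric: y_pred holds class labels,
# so the stacking call should use needs_proba=False
def my_accuracy(y_true, y_pred):
    return accuracy_score(y_true, y_pred)
```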
Below I'll show how to define a ROC AUC metric which works for both binary and multiclass classification. The easiest way is to use `roc_auc_score` from the scikit-learn package, but to make it work we need to transform the true class labels into one-hot encoding.
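To make the transformation concrete, here is what one-hot encoding does to a small label vector (a quick sketch; note that `sparse=False` is the spelling used in older scikit-learn versions, while releases from 1.2 onward use `sparse_output=False`):

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

labels = np.array([0, 2, 1]).reshape(-1, 1)
ohe = OneHotEncoder(sparse=False)  # sparse_output=False in scikit-learn >= 1.2
print(ohe.fit_transform(labels))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```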
Please look at the complete example:
```python
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import OneHotEncoder
from vecstack import stacking

# Define ROC AUC metric
def auc(y_true, y_pred):
    """ROC AUC metric for both binary and multiclass classification.

    Parameters
    ----------
    y_true : 1d numpy array
        True class labels
    y_pred : 2d numpy array
        Predicted probabilities for each class
    """
    ohe = OneHotEncoder(sparse=False)  # sparse_output=False in scikit-learn >= 1.2
    y_true = ohe.fit_transform(y_true.reshape(-1, 1))
    auc_score = roc_auc_score(y_true, y_pred)
    return auc_score

# Create data: 500 examples, 5 features, 3 classes
X, y = make_classification(n_samples=500, n_features=5,
                           n_informative=3, n_redundant=1,
                           n_classes=3, flip_y=0,
                           random_state=0)

# Make train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.2,
                                                    random_state=0)

# Init 1st level models
models = [
    RandomForestClassifier(random_state=0, n_jobs=-1,
                           n_estimators=100, max_depth=3),
]

# Perform stacking
S_train, S_test = stacking(models,
                           X_train, y_train, X_test,
                           regression=False,  # classification task
                           needs_proba=True,  # predict probabilities
                           metric=auc,        # metric
                           verbose=2)
```
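To complete the pipeline (this final step is my own illustration, not part of the original answer), one could fit a 2nd-level model on the stacked predictions and score it with the same metric:

```python
# Hypothetical 2nd-level model: any classifier with predict_proba works here
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(random_state=0)
model.fit(S_train, y_train)
y_pred = model.predict_proba(S_test)
print('Final ROC AUC: %.4f' % auc(y_test, y_pred))
```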