recommenders-team / recommenders

Best Practices on Recommendation Systems

Home Page: https://recommenders-team.github.io/recommenders/intro.html


[ASK] MIND test dataset doesn't work for run_eval

ubergonmx opened this issue · comments

Description

The following code:

label = [0 for i in impr.split()]

This essentially marks every news ID in the impression list as non-clicked (label 0), since the MIND test behaviors file carries no click labels.
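For illustration, here is how the impression column parses in each split (a small standalone sketch; the news IDs are made-up examples):

# Train/valid behaviors carry a click label on each news ID, e.g. "N712-1 N231-0":
impr_with_labels = "N712-1 N231-0"
labels = [int(item.split("-")[1]) for item in impr_with_labels.split()]
print(labels)  # [1, 0]

# The test behaviors file has no labels, e.g. "N712 N231", so the fallback above
# produces all zeros, i.e. a single class per impression:
impr_without_labels = "N712 N231"
labels = [0 for i in impr_without_labels.split()]
print(labels)  # [0, 0]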

Instead of modifying the code, I modified the test behaviors file and added -0 to each news ID in the impression list (e.g., N712-0 N231-0).
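For reference, a one-off sketch of that edit (file names are placeholders; it assumes the usual tab-separated MIND behaviors format with the impression list in the last column):

with open("behaviors.tsv") as fin, open("behaviors_patched.tsv", "w") as fout:
    for line in fin:
        cols = line.rstrip("\n").split("\t")
        # Append "-0" to every news ID in the impression column.
        cols[-1] = " ".join(nid + "-0" for nid in cols[-1].split())
        fout.write("\t".join(cols) + "\n")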
Now I get the following error after running run_eval:

model.run_eval(test_news_file, test_behaviors_file)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File <timed exec>:1

File ~/.conda/envs/recommenders/lib/python3.9/site-packages/recommenders/models/newsrec/models/base_model.py:335, in BaseModel.run_eval(self, news_filename, behaviors_file)
    331 else:
    332     _, group_labels, group_preds = self.run_slow_eval(
    333         news_filename, behaviors_file
    334     )
--> 335 res = cal_metric(group_labels, group_preds, self.hparams.metrics)
    336 return res

File ~/.conda/envs/recommenders/lib/python3.9/site-packages/recommenders/models/deeprec/deeprec_utils.py:594, in cal_metric(labels, preds, metrics)
    591         res["hit@{0}".format(k)] = round(hit_temp, 4)
    592 elif metric == "group_auc":
    593     group_auc = np.mean(
--> 594         [
    595             roc_auc_score(each_labels, each_preds)
    596             for each_labels, each_preds in zip(labels, preds)
    597         ]
    598     )
    599     res["group_auc"] = round(group_auc, 4)
    600 else:

File ~/.conda/envs/recommenders/lib/python3.9/site-packages/recommenders/models/deeprec/deeprec_utils.py:595, in <listcomp>(.0)
    591         res["hit@{0}".format(k)] = round(hit_temp, 4)
    592 elif metric == "group_auc":
    593     group_auc = np.mean(
    594         [
--> 595             roc_auc_score(each_labels, each_preds)
    596             for each_labels, each_preds in zip(labels, preds)
    597         ]
    598     )
    599     res["group_auc"] = round(group_auc, 4)
    600 else:

File ~/.conda/envs/recommenders/lib/python3.9/site-packages/sklearn/metrics/_ranking.py:567, in roc_auc_score(y_true, y_score, average, sample_weight, max_fpr, multi_class, labels)
    565     labels = np.unique(y_true)
    566     y_true = label_binarize(y_true, classes=labels)[:, 0]
--> 567     return _average_binary_score(
    568         partial(_binary_roc_auc_score, max_fpr=max_fpr),
    569         y_true,
    570         y_score,
    571         average,
    572         sample_weight=sample_weight,
    573     )
    574 else:  # multilabel-indicator
    575     return _average_binary_score(
    576         partial(_binary_roc_auc_score, max_fpr=max_fpr),
    577         y_true,
   (...)
    580         sample_weight=sample_weight,
    581     )

File ~/.conda/envs/recommenders/lib/python3.9/site-packages/sklearn/metrics/_base.py:75, in _average_binary_score(binary_metric, y_true, y_score, average, sample_weight)
     72     raise ValueError("{0} format is not supported".format(y_type))
     74 if y_type == "binary":
---> 75     return binary_metric(y_true, y_score, sample_weight=sample_weight)
     77 check_consistent_length(y_true, y_score, sample_weight)
     78 y_true = check_array(y_true)

File ~/.conda/envs/recommenders/lib/python3.9/site-packages/sklearn/metrics/_ranking.py:337, in _binary_roc_auc_score(y_true, y_score, sample_weight, max_fpr)
    335 """Binary roc auc score."""
    336 if len(np.unique(y_true)) != 2:
--> 337     raise ValueError(
    338         "Only one class present in y_true. ROC AUC score "
    339         "is not defined in that case."
    340     )
    342 fpr, tpr, _ = roc_curve(y_true, y_score, sample_weight=sample_weight)
    343 if max_fpr is None or max_fpr == 1:

ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.

Is there any fix or workaround for this? How do I get the scores?

Other Comments

Originally posted by @ubergonmx in #1673 (comment)

I am trying to train the NAML model with the valid + test set.

It seems there is an error with AUC because there is just one class; it looks like all your labels belong to a single class.

I would look into the data and make sure you have both positive and negative classes.
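For example, a quick data check along those lines (just a sketch; group_labels is the per-impression label list that run_slow_eval returns in the traceback above):

import numpy as np

# group_labels can be obtained with, e.g.:
#   _, group_labels, _ = model.run_slow_eval(test_news_file, test_behaviors_file)
# roc_auc_score needs at least one clicked (1) and one non-clicked (0) item per impression.
single_class = [lab for lab in group_labels if len(np.unique(lab)) < 2]
print(f"{len(single_class)} of {len(group_labels)} impressions have only one class")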


Thank you. I think this was also raised as an issue before, but it was closed because there is no labeled test set.

Sounds good
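A related workaround sketch for getting the scores without the metric step, using only the methods visible in the traceback above (the file variables are the same ones passed to run_eval at the top of the issue): run_eval calls run_slow_eval and then cal_metric, so the raw predictions can be pulled out before the metric step.

# With the MIND test split, group_labels is all zeros, so only group_preds is meaningful;
# cal_metric is simply not called here.
_, group_labels, group_preds = model.run_slow_eval(test_news_file, test_behaviors_file)
print(len(group_preds))   # number of impressions scored
print(group_preds[0])     # model scores for the candidate news IDs of the first impression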