AmenRa / ranx

⚡️A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion 🐍

Home Page: https://amenra.github.io/ranx

[BUG] Misleading exception message on dataframe types

efung opened this issue

Describe the bug
I'm using the library for the first time with a pandas DataFrame and ran into a misleading exception message.

To Reproduce
Steps to reproduce the behavior:

  1. Create a DataFrame whose id column has dtype int64, e.g. df['id'] = df.index + 1
  2. Create the Qrels like this:
qrels = Qrels.from_df(
    df=df,
    q_id_col="id",
    doc_id_col="best_document",
    score_col="score",
)
  3. Observe this error:
/usr/local/lib/python3.10/dist-packages/ranx/data_structures/qrels.py in from_df(df, q_id_col, doc_id_col, score_col)
    293         """
    294         assert (
--> 295             df[q_id_col].dtype == "O"
    296         ), "DataFrame scores column dtype must be `object` (string)"
    297         assert (

AssertionError: DataFrame scores column dtype must be `object` (string)

Expected behavior
The assertion message should name the column that actually failed the check. In this case, the query ID column has the wrong dtype, yet the message blames the scores column. From inspecting the code, the message is also wrong when the document ID column has the wrong dtype.
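
To make the expected behavior concrete, here is a minimal sketch of column-specific checks. It is not the library's actual code: `check_id_columns` is a hypothetical helper, and the message wording is only one possible phrasing. It uses the same `dtype == "O"` test shown in the traceback and names the offending column in the error.

```python
import pandas as pd


def check_id_columns(df: pd.DataFrame, q_id_col: str, doc_id_col: str) -> None:
    """Hypothetical helper: raise an error that names the offending column."""
    for role, col in (("query ID", q_id_col), ("document ID", doc_id_col)):
        # Same dtype test as in the traceback, applied per column.
        if df[col].dtype != "O":
            raise TypeError(
                f"DataFrame {role} column `{col}` must have dtype `object` "
                f"(string), got `{df[col].dtype}`"
            )


# Illustrative data: document IDs are strings stored with dtype object.
df = pd.DataFrame({"best_document": pd.Series(["d1", "d2"], dtype=object)})
df["id"] = df.index + 1  # int64, as in the reproduction steps

try:
    check_id_columns(df, q_id_col="id", doc_id_col="best_document")
except TypeError as err:
    message = str(err)
    print(message)
```

With this shape, an int64 query ID column produces a message about the query ID column rather than the scores column.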

Hi, sorry for that!
I probably copy-pasted or duplicated lines there.
I will fix it in the next release.

Fixed in v0.3.17.
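
Since the assertion in the traceback requires the ID column to have dtype `object`, a caller with integer IDs would presumably still need to convert them before calling `Qrels.from_df`. The following is a sketch of that conversion using only pandas. The sample data is made up, and the `Qrels.from_df` call is left as a comment because it needs ranx installed and may enforce further checks not shown in this thread.

```python
import pandas as pd

# Illustrative data shaped like the report: string document IDs, integer scores.
df = pd.DataFrame({
    "best_document": pd.Series(["d1", "d2"], dtype=object),
    "score": [1, 1],
})
df["id"] = df.index + 1  # int64 query IDs, as in the reproduction steps

# Cast the IDs to strings. The extra astype(object) keeps the dtype `object`
# on pandas versions that give strings a dedicated dtype.
df["id"] = df["id"].astype(str).astype(object)

print(df["id"].dtype)  # object

# With ranx installed, the call from the report could then be attempted:
# from ranx import Qrels
# qrels = Qrels.from_df(
#     df=df, q_id_col="id", doc_id_col="best_document", score_col="score"
# )
```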

Please, give ranx a star if you haven't yet.