jma127 / pyltr

Python learning to rank (LTR) toolkit

Dataset format

Jason-WT opened this issue

What data format does the model need for training and testing? Is the LETOR format the only one that can be used?

Using the following seems to work for me:

  • TX: a pandas dataframe containing all features (except the target values and the group ids)
  • Ty: a pandas series containing the target values
  • Tqids: a pandas series (same length as Ty) containing the group id for each instance

Using NumPy arrays instead of dataframes probably works as well.
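
As a rough sketch of the layout described above, the three inputs can be built from a single dataframe. The column names (`qid`, `label`, `feat_a`, `feat_b`) and the toy values are made up for illustration. Sorting by group id is an assumption on my part: as far as I can tell, pyltr expects rows of the same query to be contiguous. The `fit_lambdamart` helper is hypothetical, follows the calls shown in the pyltr README, and converts to NumPy arrays to be on the safe side.

```python
import pandas as pd

# Hypothetical toy data: two queries (qid 1 and 2), two features per document.
df = pd.DataFrame({
    "qid":    [2, 1, 1, 2, 1],
    "label":  [0, 2, 1, 1, 0],
    "feat_a": [0.1, 0.9, 0.5, 0.4, 0.2],
    "feat_b": [1.0, 3.0, 2.0, 2.5, 0.5],
})

# Assumption: rows belonging to the same query should sit next to each other.
# mergesort is stable, so the original order within each query is preserved.
df = df.sort_values("qid", kind="mergesort").reset_index(drop=True)

TX = df.drop(columns=["qid", "label"])  # features only
Ty = df["label"]                        # target values
Tqids = df["qid"]                       # group id per row, same length as Ty


def fit_lambdamart(TX, Ty, Tqids):
    """Hypothetical helper: fit a small LambdaMART model on the inputs above."""
    import pyltr  # imported here so the data preparation runs without pyltr

    metric = pyltr.metrics.NDCG(k=10)
    model = pyltr.models.LambdaMART(
        metric=metric, n_estimators=50, learning_rate=0.1
    )
    # Passing NumPy arrays avoids relying on dataframe support.
    model.fit(TX.to_numpy(), Ty.to_numpy(), Tqids.to_numpy())
    return model
```

Calling `fit_lambdamart(TX, Ty, Tqids)` should then train on the prepared data. The same split (features, targets, group ids) would apply to the validation and test sets.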