Final code for FULTR.

FULTR: Policy-Gradient Training of Fair and Unbiased Ranking Functions

The implementation for the SIGIR 2021 paper (arXiv):

Policy-Gradient Training of Fair and Unbiased Ranking Functions

Himank Yadav*, Zhengxiao Du*, Thorsten Joachims (*: equal contribution)

Installation

Clone the repo

git clone https://github.com/him229/fultr
cd fultr

Please first install PyTorch, then install the remaining dependencies with

pip install -r requirements.txt

Getting Started

The script main.sh contains commands for running the various experiments in the paper (based on Slurm).

Data

The datasets folder contains links to download the datasets used in the experiments, along with the code we used to transform them into a form suitable for training.

The transformed_datasets folder contains the final versions of the transformed datasets that we use directly for training.

We use the MSLR and German Credit datasets for training. They can be downloaded from the links below:

MSLR-WEB30K (Fold 1) - https://www.microsoft.com/en-us/research/project/mslr/

German Credit - https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)

Preprocess

To reproduce the preprocessing, please download the raw datasets and save the files in transformed_datasets/german/raw and transformed_datasets/mslr/raw, respectively. (All of the commands below should be executed from the datasets directory.)

German Credit Dataset

This dataset contains information about 1000 individuals, which we randomly split into train, validation, and test sets with a 1:1:1 ratio. We convert it to an LTR dataset by sampling, for each query, 20 individuals from each set with a 9:1 ratio of non-creditworthy to creditworthy individuals.

Group attribute - A binary feature indicating whether the purpose is radio/television (attribute id A43)
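The per-query sampling described above can be sketched as follows. This is an illustrative reconstruction, not the repo's actual code: the function and argument names are hypothetical, and only the 20-item, 9:1 split is taken from the description.

```python
import numpy as np

def sample_query(noncredit_pool, credit_pool, rng, n_items=20, n_credit=2):
    """Sample one synthetic query: 18 non-creditworthy and 2 creditworthy
    individuals (a 9:1 ratio), drawn without replacement from one split's
    pools. Names are illustrative, not the repository's API."""
    neg = rng.choice(len(noncredit_pool), size=n_items - n_credit, replace=False)
    pos = rng.choice(len(credit_pool), size=n_credit, replace=False)
    items = [noncredit_pool[i] for i in neg] + [credit_pool[i] for i in pos]
    labels = [0] * (n_items - n_credit) + [1] * n_credit  # creditworthy = 1
    return items, labels

rng = np.random.default_rng(0)
items, labels = sample_query(list(range(500)), list(range(100)), rng)
# 20 items per query, 2 of which are creditworthy
```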

Command

python preprocess-german.py

MSLR

We adopt the train, validation, and test split provided with the dataset. We binarize relevances by assigning 1 to items judged 3 or 4 and 0 to items judged 0, 1, or 2. Next, we remove queries with fewer than 20 candidates (to better compare different methods and amplify differences). For the remaining queries, we sample 20 candidate items per query, with at most 3 relevant items.

Group attribute - QualityScore (feature id 133) with 40th percentile as the threshold
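The binarization and filtering steps above can be sketched like this. It is a hedged reconstruction: the function names and the rejection-sampling strategy for enforcing "at most 3 relevant items" are assumptions, not taken from preprocess-mslr.py.

```python
import numpy as np

def binarize(relevances):
    # Judgments 3 and 4 become relevant (1); judgments 0, 1, 2 become 0.
    return (np.asarray(relevances) >= 3).astype(int)

def sample_candidates(relevances, n_items=20, max_relevant=3,
                      rng=None, max_tries=1000):
    """Return indices of a 20-item candidate sample for one query, or None
    if the query is dropped. Illustrative sketch; the repo's script may
    select candidates differently."""
    rel = binarize(relevances)
    if len(rel) < n_items:
        return None  # drop queries with fewer than 20 candidates
    rng = rng or np.random.default_rng(0)
    for _ in range(max_tries):  # resample until at most 3 items are relevant
        idx = rng.choice(len(rel), size=n_items, replace=False)
        if rel[idx].sum() <= max_relevant:
            return idx
    return None  # infeasible query (too many relevant candidates)

idx = sample_candidates([3, 4, 3] + [0] * 19)  # 22 candidates, 3 relevant
```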

Command

python preprocess-mslr.py

Using Your Own Dataset

First save your dataset in the same format as MSLR, then run the following command:

python preprocess-mslr.py --raw_directory <data directory> --output_directory <output directory> --no_log_features

Click Data

We first train a conventional Ranking SVM on 1 percent of the full-information training data to serve as the logging policy. This logging policy is then used to generate the rankings for which click data is logged.

The click data is generated by simulating a position-based examination model. We use a position bias that decays with the presented rank k of the item as v(k) = (1/k)^n, with n = 1 as the default (see generate_clicks_for_dataset.py).
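A minimal sketch of this position-based click model, assuming the standard formulation (an item at rank k is examined with probability v(k) = (1/k)^n, and an examined relevant item is clicked). The function name is illustrative; generate_clicks_for_dataset.py may differ in details such as click noise.

```python
import numpy as np

def simulate_clicks(ranking_relevance, n=1.0, rng=None):
    """Simulate clicks for one presented ranking under the position-based
    examination model: examine rank k (1-indexed) with probability
    v(k) = (1/k)**n, and click iff the item is examined and relevant.
    A sketch of the model described above, not the repo's exact code."""
    rng = rng or np.random.default_rng()
    clicks = []
    for k, rel in enumerate(ranking_relevance, start=1):
        examined = rng.random() < (1.0 / k) ** n
        clicks.append(int(examined and rel == 1))
    return clicks

clicks = simulate_clicks([1, 0, 1, 1, 0])  # binarized relevances by rank
```

Note that with n = 1 the top item is always examined, so a relevant item at rank 1 is always clicked; larger n makes the bias decay faster.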

Command

To train the production ranker, first download SVM-Rank (http://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html) and Propensity SVM-Rank (http://www.cs.cornell.edu/people/tj/svm_light/svm_proprank.html) into svm_rank/ and svm_proprank/, then compile the software according to their instructions.

Then run the following commands:

python production_ranker.py --dataset german
python production_ranker.py --dataset mslr

When using your own dataset:

python production_ranker.py --dataset mslr --data_directory <output directory for previous preprocessing>
