This is the source code for the paper "A Comparative Study of Training Objectives for Clarification Facet Generation" (SIGIR-AP 2023).
- Python version >= 3.8
- PyTorch version >= 1.11.0
- Transformers version >= 4.27.2
- MIMICS: microsoft/MIMICS, "MIMICS: A Large-Scale Data Collection for Search Clarification" (github.com)
- You should download
- The Bing API's Search Results for MIMICS Queries
- You should download
We provide only the model weights at the links below (seq-default contains both the model weights and the tokenizer). Every model uses the same tokenizer as BART-base (https://huggingface.co/facebook/bart-base).
- seq-default: https://huggingface.co/algoprog/mimics-bart-base
- seq-min-perm: https://huggingface.co/Shiyunee/seq-min-perm
- seq-avg-perm: https://huggingface.co/Shiyunee/seq-avg-perm
- set-pred: https://huggingface.co/Shiyunee/set-pred
- seq-set-pred: https://huggingface.co/Shiyunee/seq-set-pred
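Because most of the repositories above ship weights without a tokenizer, loading a model means pairing its weights with the BART-base tokenizer. The following is a minimal sketch of one way to do that, not the loading code used in this repository. It assumes the checkpoints load through `AutoModelForSeq2SeqLM`; the names `MODEL_REPOS`, `resolve_repo`, and `load_model` are hypothetical helpers introduced for illustration.

```python
# Tokenizer shared by all models, as stated above.
TOKENIZER_REPO = "facebook/bart-base"

# Hugging Face repository IDs taken from the links listed above.
MODEL_REPOS = {
    "seq-default": "algoprog/mimics-bart-base",
    "seq-min-perm": "Shiyunee/seq-min-perm",
    "seq-avg-perm": "Shiyunee/seq-avg-perm",
    "set-pred": "Shiyunee/set-pred",
    "seq-set-pred": "Shiyunee/seq-set-pred",
}


def resolve_repo(name: str) -> str:
    """Map a model name from the list above to its repository ID."""
    try:
        return MODEL_REPOS[name]
    except KeyError:
        known = ", ".join(sorted(MODEL_REPOS))
        raise ValueError(f"Unknown model '{name}'. Expected one of: {known}") from None


def load_model(name: str):
    """Return (tokenizer, model) for one of the released checkpoints.

    Assumption: the weights are compatible with AutoModelForSeq2SeqLM.
    The import is local so the name mapping works without transformers.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_REPO)
    model = AutoModelForSeq2SeqLM.from_pretrained(resolve_repo(name))
    model.eval()  # inference only
    return tokenizer, model


# Example (downloads weights on first use):
# tokenizer, model = load_model("seq-min-perm")
```

Loading the tokenizer from `facebook/bart-base` for every model keeps the call uniform, since only seq-default bundles its own tokenizer.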
- data_process.py: prepare data for inference
- inference.py: generate facets for the given data
- evaluation.py: evaluate the results
- score.py: replace score.py in bert_score (package) with this file so we can load the model and tokenizer only once
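Replacing score.py requires knowing where the bert_score package is installed. The sketch below shows one possible way to locate an installed package and copy a file into it while keeping a backup of the original. The helper names `package_dir` and `replace_score_file` are hypothetical and not part of this repository or of bert_score.

```python
import importlib.util
import shutil
from pathlib import Path


def package_dir(package: str) -> Path:
    """Return the directory of an installed package without importing it."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError(f"{package} is not installed")
    return Path(spec.origin).parent


def replace_score_file(src: Path, target_dir: Path) -> Path:
    """Copy src over target_dir/score.py, saving any existing file as score.py.bak."""
    target = target_dir / "score.py"
    if target.exists():
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(src, target)
    return target


# Example (run from the repository root, with bert_score installed):
# replace_score_file(Path("score.py"), package_dir("bert_score"))
```

Keeping a `.bak` copy makes it easy to restore the stock bert_score behavior later.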