dnanhkhoa/multi-hop-analysis

Dataset Information

We use two datasets in our experiments: 2WikiMultihopQA and HotpotQA-small.

We follow the steps in https://github.com/yuwfan/HGN to produce the .gz data files from the raw data.

bash install_packages.sh

Download the bigbird-roberta-base model from https://huggingface.co/google/bigbird-roberta-base
Edit the variables data_dir, pretrained_model_dir, and data_file
Run: python3 preprocess.py
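
The model download can also be scripted instead of done by hand. The sketch below is one possible way, assuming the huggingface_hub package is installed (this README does not say whether install_packages.sh provides it). The function name download_model and the target directory are hypothetical; only the model ID comes from the link above.

```python
from pathlib import Path

# Model ID taken from the Hugging Face link in this README.
MODEL_ID = "google/bigbird-roberta-base"


def download_model(target_dir):
    """Download the model snapshot into target_dir and return the local path."""
    # Imported lazily so this module still loads when huggingface_hub is absent.
    from huggingface_hub import snapshot_download

    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    return snapshot_download(repo_id=MODEL_ID, local_dir=str(target))


# Example call (hypothetical directory; requires network access):
# local_path = download_model("pretrained/bigbird-roberta-base")
```

The returned path is what pretrained_model_dir would then point to.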

python3 main.py

python3 predictor.py $checkpoint $data_file

python3 postprocess.py $prediction_file $processed_data_file $original_data_file
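
The prediction and postprocessing commands above can be chained from a small driver script. This is a minimal sketch, not part of the repository: the helper names build_commands and run_pipeline and all file paths are hypothetical, and where predictor.py writes its output is an assumption you need to check against your own run.

```python
import subprocess


def build_commands(checkpoint, data_file, prediction_file,
                   processed_data_file, original_data_file):
    """Return the predict and postprocess commands as argument lists."""
    return [
        ["python3", "predictor.py", str(checkpoint), str(data_file)],
        ["python3", "postprocess.py", str(prediction_file),
         str(processed_data_file), str(original_data_file)],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline as soon as one step fails.
            subprocess.run(cmd, check=True)


# Hypothetical paths, shown as a dry run so that nothing is executed.
commands = build_commands(
    checkpoint="checkpoints/model.ckpt",
    data_file="data/dev.processed.gz",
    prediction_file="outputs/dev_predictions.json",
    processed_data_file="data/dev.processed.gz",
    original_data_file="data/dev.json",
)
run_pipeline(commands, dry_run=True)
```

Passing argument lists rather than one shell string avoids quoting problems when a path contains spaces.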

python3 official_evaluation.py path/to/prediction path/to/gold
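
For orientation, the sketch below shows the answer-level exact match (EM) and token F1 metrics commonly used for HotpotQA-style question answering. It is an illustration under that assumption, not a copy of official_evaluation.py, which may normalize answers differently and report additional metrics. The function names are chosen for this example.

```python
import re
import string
from collections import Counter

PUNCTUATION = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction, gold):
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    num_same = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, exact_match("The Eiffel Tower.", "eiffel tower") gives 1.0 after normalization, while f1_score("Paris France", "Paris") gives 2/3 because precision is 0.5 and recall is 1.0.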

Download our checkpoints
Run predict_dev_all_settings.sh (Note: to use this script on the 2Wiki test set, comment out line #25, which runs the evaluation)

Our data preprocessing is based on HGN.
We reuse the Example class from the HGN model, updated to work with our dataset.