ruijie-wang-uzh / QAGCN

This repository includes the code for our paper QAGCN: Answering Multi-Relation Questions via Single-Step Implicit Reasoning over Knowledge Graphs.


QAGCN: Answering Multi-Relation Questions via Single-Step Implicit Reasoning over Knowledge Graphs

Ruijie Wang, Luca Rossetto, Michael Cochez, and Abraham Bernstein

To appear at the 21st European Semantic Web Conference (ESWC 2024).

arXiv preprint: QAGCN: Answering Multi-Relation Questions via Single-Step Implicit Reasoning over Knowledge Graphs.


Environment Setup

Please set up a Python environment with PyTorch, PyTorch Geometric, PyTorch Scatter, Transformers, NetworkX, and graph-tool installed.
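To quickly verify that the environment is complete, the following optional Python snippet (not part of the repository) tries to import each required package and prints its version:

import importlib

# Required packages, listed by their import names.
for name in ["torch", "torch_geometric", "torch_scatter",
             "transformers", "networkx", "graph_tool"]:
    try:
        module = importlib.import_module(name)
        print(f"{name}: {getattr(module, '__version__', 'unknown version')}")
    except ImportError as err:
        print(f"{name}: NOT INSTALLED ({err})")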


Data and Models

Please download the prepared data and our pre-trained models for MetaQA, PathQuestion (PQ), and PathQuestion-Large (PQL) from OSF. (Unzip data.zip and move the extracted data directory to the repository root.)
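As an optional sanity check (not part of the repository), the snippet below verifies that a data directory exists at the repository root after unzipping and lists its dataset subfolders; the exact subfolder names are not specified here, apart from data/pql-3hop, which appears in the training commands further down:

from pathlib import Path

# Expected location of the unzipped data at the repository root.
data_dir = Path("data")
if not data_dir.is_dir():
    raise SystemExit("data/ not found -- unzip data.zip and move it to the repository root first")
for sub in sorted(p.name for p in data_dir.iterdir() if p.is_dir()):
    print("found dataset folder:", sub)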


Inference using Pre-trained Models

The following commands can be used to load and evaluate the pre-trained models.

MetaQA

cd qa_metaqa
# evaluate the pre-trained model on MetaQA-1hop
touch MetaQA_1hop_eval.log
python -u main.py --num_epochs 0 --path_align_timestamp 2022.02.12.20.00 --timestamp 2022.02.13.15.09 >> MetaQA_1hop_eval.log
# evaluate the pre-trained model on MetaQA-2hop
touch MetaQA_2hop_eval.log
python -u main.py --qa_type 2-hop --in_dims 768 512 --out_dims 512 256 --dropouts 0.1 0. --rerank_top 200 --num_epochs 0 --path_align_timestamp 2022.02.12.20.18 --timestamp 2022.02.15.21.17 >> MetaQA_2hop_eval.log
# evaluate the pre-trained model on MetaQA-3hop
touch MetaQA_3hop_eval.log
python -u infer.py --path_align_timestamp 2022.02.15.19.30 --timestamp 2022.02.19.12.28 >> MetaQA_3hop_eval.log

PathQuestion

cd qa_pq
# evaluate the pre-trained model on PQ-2hop
touch PathQuestion_2hop_eval.log
python -u main.py --num_epochs 0 --path_align_timestamp 2022.02.20.20.06 --timestamp 2022.02.20.20.55 >> PathQuestion_2hop_eval.log
# evaluate the pre-trained model on PQ-3hop
touch PathQuestion_3hop_eval.log
python -u main.py --qa_type 3-hop --num_epochs 0 --in_dims 768 512 256 --out_dims 512 256 128 --dropouts 0.1 0.1 0. --rerank_top 50 --path_align_timestamp 2022.03.13.18.11 --timestamp 2022.02.19.23.45 >> PathQuestion_3hop_eval.log

PathQuestion-Large

cd qa_pql
# evaluate the pre-trained model on PQL-2hop
touch PathQuestionLarge_2hop_eval.log
python -u main.py --num_epochs 0 --path_align_timestamp 2023.12.08.02.04 --timestamp 2023.12.08.03.10 >> PathQuestionLarge_2hop_eval.log
# evaluate the pre-trained model on PQL-3hop
touch PathQuestionLarge_3hop_eval.log
python -u main3.py --num_epochs 0 --path_align_timestamp 2023.12.08.06.07 --timestamp 2023.12.08.06.45 >> PathQuestionLarge_3hop_eval.log
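If you prefer to drive these evaluations from Python rather than a shell, a minimal helper along the following lines should work. It is not part of the repository; the working directories, log file names, and commands are copied verbatim from the sections above, and the list shows only two entries as an example:

import subprocess

# (working directory, log file, evaluation command) triples taken from this README.
JOBS = [
    ("qa_metaqa", "MetaQA_1hop_eval.log",
     "python -u main.py --num_epochs 0 --path_align_timestamp 2022.02.12.20.00 "
     "--timestamp 2022.02.13.15.09"),
    ("qa_pq", "PathQuestion_2hop_eval.log",
     "python -u main.py --num_epochs 0 --path_align_timestamp 2022.02.20.20.06 "
     "--timestamp 2022.02.20.20.55"),
    # add the remaining commands from the sections above in the same way
]

for workdir, log_file, command in JOBS:
    with open(f"{workdir}/{log_file}", "a") as log:   # mirrors `touch` + `>>`
        subprocess.run(command, shell=True, cwd=workdir, stdout=log, check=True)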

Training New Models

The following commands can be used to train new models.

Please take note of the timestamp reported by path_train.py and use it to fill in the [timestamp] placeholder below when running main.py.
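If path_train.py writes its timestamp to the training log in the same YYYY.MM.DD.HH.MM format used by the pre-trained checkpoints above (an assumption, not guaranteed by the repository), a small helper like the following can extract it instead of copying it by hand:

import re

def last_timestamp(log_path: str) -> str:
    # Return the last YYYY.MM.DD.HH.MM timestamp found in the given log file.
    pattern = re.compile(r"\d{4}\.\d{2}\.\d{2}\.\d{2}\.\d{2}")
    with open(log_path) as f:
        matches = pattern.findall(f.read())
    if not matches:
        raise ValueError(f"no timestamp found in {log_path}")
    return matches[-1]

# e.g. pass the result as --path_align_timestamp when invoking main.py
print(last_timestamp("MetaQA_1hop_train.log"))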

MetaQA

cd qa_metaqa
# train the model on MetaQA-1hop (including data preprocessing and training)
touch MetaQA_1hop_train.log
python -u kg_prep.py >> MetaQA_1hop_train.log
python -u que_prep.py --qa_type 1-hop >> MetaQA_1hop_train.log
python -u path_train.py >> MetaQA_1hop_train.log
python -u main.py --path_align_timestamp [timestamp] >> MetaQA_1hop_train.log
# train the model on MetaQA-2hop (including data preprocessing and training)
touch MetaQA_2hop_train.log
python -u que_prep.py --qa_type 2-hop >> MetaQA_2hop_train.log
python -u path_train.py --qa_type 2-hop --lr 5e-4 >> MetaQA_2hop_train.log
python -u main.py --qa_type 2-hop --path_align_timestamp [timestamp] --in_dims 768 512 --out_dims 512 256 --dropouts 0.1 0. --rerank_top 200 --lr 5e-4 >> MetaQA_2hop_train.log
# train the model on MetaQA-3hop (including data preprocessing and training)
touch MetaQA_3hop_train.log
# Note: the path extraction for 3-hop questions in que_prep.py is memory-intensive; it is recommended to run it on a machine with at least 500 GB of RAM.
# (Due to Linux's memory-swapping behavior, running it with insufficient RAM may render the machine unresponsive. A small RAM-check sketch follows this block.)
python -u que_prep.py --qa_type 3-hop >> MetaQA_3hop_train.log
python -u path_train.py --qa_type 3-hop --out_dim 128 --lr 1e-3 >> MetaQA_3hop_train.log
python -u main.py --qa_type 3-hop --path_align_timestamp [timestamp] --in_dims 768 512 256 --out_dims 512 256 128 --dropouts 0.1 0.1 0. --rerank_top 8000 --lr 1e-3 >> MetaQA_3hop_train.log
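As referenced in the comment above, here is a small Linux-only sketch (not part of the repository) for checking total physical RAM before launching the 3-hop path extraction:

import os

# Total physical memory = page size * number of physical pages (Linux only).
page_size = os.sysconf("SC_PAGE_SIZE")
num_pages = os.sysconf("SC_PHYS_PAGES")
total_gb = page_size * num_pages / 1024 ** 3

print(f"Total physical RAM: {total_gb:.1f} GB")
if total_gb < 500:
    print("Warning: below the recommended 500 GB for que_prep.py --qa_type 3-hop on MetaQA; "
          "the machine may become unresponsive.")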

PathQuestion

cd qa_pq
# train the model on PQ-2hop (including data preprocessing and training)
touch PathQuestion_2hop_train.log
python -u kg_prep.py >> PathQuestion_2hop_train.log
python -u que_prep.py >> PathQuestion_2hop_train.log
python -u path_train.py >> PathQuestion_2hop_train.log
python -u main.py --path_align_timestamp [timestamp] >> PathQuestion_2hop_train.log
# train the model on PQ-3hop (including data preprocessing and training)
touch PathQuestion_3hop_train.log
python -u kg_prep.py --qa_type 3-hop >> PathQuestion_3hop_train.log
python -u que_prep.py --qa_type 3-hop >> PathQuestion_3hop_train.log
python -u path_train.py --qa_type 3-hop --in_dim 768 --out_dim 128 --lr 1e-3 >> PathQuestion_3hop_train.log
python -u main.py --qa_type 3-hop --path_align_timestamp [timestamp] --in_dims 768 512 256 --out_dims 512 256 128 --dropouts 0.1 0.1 0. --lr 1e-3 --rerank_top 50 >> PathQuestion_3hop_train.log

PathQuestion-Large

cd qa_pql
# train the model on PQL-2hop (including data preprocessing and training)
touch PathQuestionLarge_2hop_train.log
python -u kg_prep.py >> PathQuestionLarge_2hop_train.log
python -u que_prep.py >> PathQuestionLarge_2hop_train.log
python -u path_train.py >> PathQuestionLarge_2hop_train.log
python -u main.py --path_align_timestamp [timestamp] >> PathQuestionLarge_2hop_train.log
# train the model on PQL-3hop (including data preprocessing and training)
touch PathQuestionLarge_3hop_train.log
python -u kg_prep.py --qa_type 3-hop >> PathQuestionLarge_3hop_train.log
python -u que_prep.py --qa_type 3-hop >> PathQuestionLarge_3hop_train.log
python -u path_train.py --qa_type 3-hop --out_path ../data/pql-3hop/output --out_dim 128 --lr 5e-4 >> PathQuestionLarge_3hop_train.log
python -u main3.py --path_align_timestamp [timestamp] >> PathQuestionLarge_3hop_train.log

Question Classification

The following commands can be used to train the question classifier that predicts the complexity of questions (i.e., 1, 2, or 3 hops).

cd question_classifier
# train and evaluate the classifier on MetaQA
touch MetaQA_classifier_train.log
python hop_pred_m.py >> MetaQA_classifier_train.log

# train and evaluate the classifier on PathQuestion
touch PathQuestion_classifier_train.log
python hop_pred_pq.py >> PathQuestion_classifier_train.log

# train and evaluate the classifier on PathQuestionLarge
touch PathQuestionLarge_classifier_train.log
python hop_pred_pql.py >> PathQuestionLarge_classifier_train.log
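For illustration only, the sketch below shows what a 3-class (1/2/3-hop) question classifier could look like when built on Hugging Face Transformers; it is a hypothetical example with a toy dataset and an assumed DistilBERT encoder, not the implementation in hop_pred_m.py, hop_pred_pq.py, or hop_pred_pql.py:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "distilbert-base-uncased"  # assumed encoder, not taken from the repository
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=3)

# Toy questions; labels 0/1/2 stand for 1-hop/2-hop/3-hop.
questions = [
    "who directed [Inception]",
    "what genres are the films directed by [Nolan]",
    "who acted in films that share a director with [Memento]",
]
labels = torch.tensor([0, 1, 2])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few toy epochs
    batch = tokenizer(questions, padding=True, truncation=True, return_tensors="pt")
    out = model(**batch, labels=labels)  # cross-entropy loss over the 3 classes
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Predict the number of hops for a new question.
model.eval()
with torch.no_grad():
    enc = tokenizer(["which movies share actors with [Titanic]"], return_tensors="pt")
    pred = model(**enc).logits.argmax(dim=-1).item()
print(f"predicted hops: {pred + 1}")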

Citation

@misc{wang2023qagcn,
      title={QAGCN: Answering Multi-Relation Questions via Single-Step Implicit Reasoning over Knowledge Graphs}, 
      author={Ruijie Wang and Luca Rossetto and Michael Cochez and Abraham Bernstein},
      year={2023},
      eprint={2206.01818},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}


License: MIT License

