Path-based Graph Neural Network Explanation for Heterogeneous Link Prediction (PaGE-Link)

Code for the WWW 2023 paper PaGE-Link: Path-based Graph Neural Network Explanation for Heterogeneous Link Prediction by Shichang Zhang, Jiani Zhang, Xiang Song, Soji Adeshina, Da Zheng, Christos Faloutsos, and Yizhou Sun.

Getting Started

Requirements

Please follow the links below to install PyTorch and DGL builds that match your CUDA version:
- PyTorch https://pytorch.org/
- DGL https://www.dgl.ai/pages/start.html
Then install the required packages by running the line below:

pip install -r requirements.txt

Our code has been tested with
- Python = 3.10.6
- PyTorch = 1.12.1
- DGL = 0.9.1

The citation dataset used in the paper is under datasets/. It has already been augmented, so edges of type likes have been added. The same holds for the synthetic dataset. For details of the datasets, please refer to the paper.

You may also add your favourite datasets by modifying the load_dataset function in dataset_processing.py.
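As a starting point for a custom dataset, the sketch below groups typed edges into a dictionary keyed by (source type, edge type, destination type), which is the layout that dgl.heterograph takes as its data_dict once the ID lists are converted to tensors. The signature of load_dataset is not shown here, so the helper name build_edge_dict and the toy node and edge types are assumptions for illustration only.

```python
from collections import defaultdict


def build_edge_dict(edges):
    """Group typed edges into {(src_type, edge_type, dst_type): (src_ids, dst_ids)}."""
    grouped = defaultdict(lambda: ([], []))
    for src_type, src_id, edge_type, dst_type, dst_id in edges:
        src_ids, dst_ids = grouped[(src_type, edge_type, dst_type)]
        src_ids.append(src_id)
        dst_ids.append(dst_id)
    return dict(grouped)


# Hypothetical toy edges; node IDs are assumed to be 0-based within each node type.
edges = [
    ("author", 0, "writes", "paper", 0),
    ("author", 1, "writes", "paper", 0),
    ("author", 0, "likes", "paper", 1),
]
edge_dict = build_edge_dict(edges)
print(edge_dict)
```

How the resulting graph should be returned depends on what load_dataset in dataset_processing.py expects, so check that function before wiring this in.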

GNN Model

We implement the RGCN model for heterogeneous graphs in model.py. A pre-trained model checkpoint is stored in saved_models/.

Explainer Usage

Run PaGE-Link to explain trained GNN models
- A simple example is shown below
```
  python pagelink.py --dataset_name=aug_citation --save_explanation
```
- Hyperparameters may be specified in a .yaml file and passed to the script using the --config_path argument.
```
  python pagelink.py --dataset_name=synthetic --config_path=config.yaml --save_explanation
```
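One common way to combine a config file with command-line arguments is to parse the arguments first and then overwrite them with the values from the file. The sketch below illustrates that pattern only; it is not taken from pagelink.py, the hyperparameter names lr and num_epochs are hypothetical, and the precedence (file values win over defaults) is an assumption. A plain dict stands in for the mapping that yaml.safe_load from PyYAML would return.

```python
import argparse


def merge_config(args, config):
    """Overwrite parsed argument values with entries from a config mapping."""
    for key, value in config.items():
        if not hasattr(args, key):
            raise KeyError(f"Unknown hyperparameter in config: {key}")
        setattr(args, key, value)
    return args


parser = argparse.ArgumentParser()
parser.add_argument("--dataset_name", default="synthetic")
parser.add_argument("--lr", type=float, default=0.01)      # hypothetical hyperparameter
parser.add_argument("--num_epochs", type=int, default=20)  # hypothetical hyperparameter
args = parser.parse_args(["--dataset_name=synthetic"])

# Stand-in for a parsed YAML file containing:
#   lr: 0.005
#   num_epochs: 50
config = {"lr": 0.005, "num_epochs": 50}
args = merge_config(args, config)
print(vars(args))
```

Rejecting unknown keys catches typos in the config file early instead of silently ignoring them.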

Train new GNNs for explanation

Run train_linkpred.py as in the examples below

python train_linkpred.py --dataset_name=aug_citation --save_model --emb_dim=128 --hidden_dim=128 --out_dim=128

python train_linkpred.py --dataset_name=synthetic --save_model --emb_dim=64 --hidden_dim=64 --out_dim=64

Run baselines
- A simple example is shown below. Replace {method} with gnnexplainer-link or pgexplainer-link.
```
python baselines/{method}.py --dataset_name=aug_citation
```

Results

Quantitative

Evaluate saved PaGE-Link explanations

python eval_explanations.py --dataset_name=synthetic --emb_dim=64 --hidden_dim=64 --out_dim=64 --eval_explainer_names=pagelink

Note: Exact reproducibility is not guaranteed with PyTorch even with an identical random seed (see https://pytorch.org/docs/stable/notes/randomness.html), so the results may differ slightly from those in the paper.
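Run-to-run variation can be reduced, though not eliminated, by seeding every random number generator up front. The helper below is a minimal sketch: seed_everything is a hypothetical name that is not part of this repository, and it seeds NumPy and PyTorch only when they are installed.

```python
import random


def seed_everything(seed):
    """Seed Python's RNG, plus NumPy and PyTorch if they can be imported."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        # Silently ignored when CUDA is not available.
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass


seed_everything(0)
first = random.random()
seed_everything(0)
second = random.random()
print(first == second)
```

Even with all generators seeded, some GPU kernels are nondeterministic, which is why small differences can remain.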

Qualitative

Example of path explanations output by PaGE-Link. Node information is shown on the right. The figure shows the top three paths (green arrows) selected by PaGE-Link to explain the predicted link (𝑎328, 𝑝5670) (dashed red). The selected paths are short and do not go through a generic field of study like “Computer Science”.

Please refer to plotting_example.ipynb for an example of generating plots like this.
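To build intuition for why short paths that avoid generic, high-degree nodes make appealing explanations, here is a toy sketch. It is not PaGE-Link's actual procedure (see the paper for that); it runs Dijkstra's algorithm on a small undirected graph where stepping onto a node costs 1 plus the log of its degree. The graph, the node names, and the cost function are all assumptions made for illustration.

```python
import heapq
import math


def degree_penalized_path(adj, source, target):
    """Dijkstra where stepping onto node v costs 1 + log(degree(v))."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v in adj[u]:
            nd = d + 1.0 + math.log(len(adj[v]))
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical graph: "cs" is a generic hub, "ml" is a more specific field.
edges = [
    ("a", "p1"), ("a", "p2"), ("p1", "cs"), ("p2", "ml"),
    ("cs", "p"), ("ml", "p"),
    ("cs", "x1"), ("cs", "x2"), ("cs", "x3"), ("cs", "x4"),
]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

path = degree_penalized_path(adj, "a", "p")
print(path)  # ['a', 'p2', 'ml', 'p']
```

Both routes from a to p have three hops, but the degree penalty steers the search away from the hub node cs and through the more specific node ml.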

Cite

Please cite our paper if you find this code useful. Thank you.

@inproceedings{zhang2023page,
  title={PaGE-Link: Path-based graph neural network explanation for heterogeneous link prediction},
  author={Zhang, Shichang and Zhang, Jiani and Song, Xiang and Adeshina, Soji and Zheng, Da and Faloutsos, Christos and Sun, Yizhou},
  booktitle={Proceedings of the web conference 2023},
  year={2023}
}

Contact Us

Please open an issue or contact shichang@cs.ucla.edu if you have any questions.
