@misc{he2023explanations,
title={Explanations as Features: LLM-Based Features for Text-Attributed Graphs},
author={Xiaoxin He and Xavier Bresson and Thomas Laurent and Bryan Hooi},
year={2023},
eprint={2305.19523},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
conda create --name TAPE python=3.8
conda activate TAPE
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
conda install -c pyg pytorch-sparse
conda install -c pyg pytorch-scatter
conda install -c pyg pytorch-cluster
conda install -c pyg pyg
pip install ogb
conda install -c dglteam/label/cu113 dgl
pip install yacs
pip install transformers
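
After installation, it can help to confirm that the packages are visible inside the `TAPE` environment. The snippet below is a minimal sketch and not part of the repository. The import names it lists (for example `torch_geometric` for `pyg`) are the usual module names for these packages and are assumptions here.

```python
from importlib.util import find_spec

# Import names assumed to correspond to the packages installed above.
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio",
    "torch_sparse", "torch_scatter", "torch_cluster", "torch_geometric",
    "ogb", "dgl", "yacs", "transformers",
]


def check_modules(names=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be located."""
    status = {}
    for name in names:
        try:
            # find_spec locates the module without importing it.
            status[name] = find_spec(name) is not None
        except (ImportError, ValueError):
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in check_modules().items():
        print(f"{name:16s} {'ok' if ok else 'MISSING'}")
```

A module reported as `MISSING` usually means the corresponding install command above did not complete in the active environment.
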
Dataset | Description |
---|---|
ogbn-arxiv | OGB provides a mapping from MAG paper IDs to the raw text of titles and abstracts. Download the dataset here, unzip it, and move it to `dataset/ogbn_arxiv_orig`. The dataset size is 200 MB. |
Cora | Download the dataset here and move it to `dataset/cora_orig`. The dataset size is 2.6 GB. |
PubMed | Download the dataset here and move it to `dataset/PubMed_orig`. The dataset size is 115 MB. |
Dataset | Description |
---|---|
ogbn-arxiv | Download the GPT responses here and move them to `gpt_responses/ogbn_arxiv`. The download size is 662 MB. |
Cora | Download the GPT responses here and move them to `gpt_responses/cora`. The download size is 11 MB. |
PubMed | Download the GPT responses here and move them to `gpt_responses/PubMed`. The download size is 77 MB. |
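
Before training, the directory layout described in the two tables above can be checked with a short script. This is a sketch that assumes it is run from the repository root. The helper name `missing_dirs` is hypothetical and does not come from the repository.

```python
from pathlib import Path

# Directories named in the two tables above, relative to the repository root.
EXPECTED_DIRS = [
    "dataset/ogbn_arxiv_orig",
    "dataset/cora_orig",
    "dataset/PubMed_orig",
    "gpt_responses/ogbn_arxiv",
    "gpt_responses/cora",
    "gpt_responses/PubMed",
]


def missing_dirs(root="."):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:")
        for d in missing:
            print("  " + d)
    else:
        print("All expected directories are present.")
```

Only the datasets you plan to use need to be present, so a missing entry for an unused dataset can be ignored.
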
WANDB_DISABLED=True TOKENIZERS_PARALLELISM=False CUDA_VISIBLE_DEVICES=0,1,2,3 python -m core.trainLM dataset ogbn-arxiv
WANDB_DISABLED=True TOKENIZERS_PARALLELISM=False CUDA_VISIBLE_DEVICES=0,1,2,3 python -m core.trainLM dataset ogbn-arxiv use_gpt True
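
The two fine-tuning commands above differ only in the `use_gpt True` override, so they can be launched back to back from Python. The following is a minimal sketch using only the standard library. The function names are hypothetical, and reading `use_gpt True` as "train on the GPT responses instead of the original text" is an assumption based on the flag name.

```python
import os
import subprocess
import sys


def build_lm_command(dataset, use_gpt=False):
    """Return the argv list for one core.trainLM run."""
    cmd = [sys.executable, "-m", "core.trainLM", "dataset", dataset]
    if use_gpt:
        cmd += ["use_gpt", "True"]
    return cmd


def build_env(gpus="0,1,2,3"):
    """Copy the current environment and set the variables used above."""
    env = dict(os.environ)
    env.update({
        "WANDB_DISABLED": "True",
        "TOKENIZERS_PARALLELISM": "False",
        "CUDA_VISIBLE_DEVICES": gpus,
    })
    return env


def run_both(dataset="ogbn-arxiv", gpus="0,1,2,3"):
    """Run the command without use_gpt first, then with use_gpt True."""
    env = build_env(gpus)
    for use_gpt in (False, True):
        # check=True stops the sequence if a run exits with an error.
        subprocess.run(build_lm_command(dataset, use_gpt), env=env, check=True)
```

Calling `run_both()` from the repository root runs both jobs in sequence. Pass a shorter `gpus` string, such as `"0"`, on a machine with fewer devices.
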
python -m core.trainEnsemble gnn.model.name GCN
python -m core.trainEnsemble gnn.model.name SAGE
python -m core.trainEnsemble gnn.model.name RevGAT gnn.train.use_dgl True gnn.train.lr 0.002 gnn.train.dropout 0.75
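
The commands above pass options as alternating key/value pairs with dotted keys, such as `gnn.train.lr 0.002`. This resembles the override lists accepted by yacs, which is installed above, but how the repository actually parses them is not shown here. The sketch below uses only the standard library to illustrate how such a list could map onto a nested configuration. `parse_overrides` is a hypothetical helper, not the repository's code.

```python
import ast


def parse_overrides(args):
    """Turn ['a.b', '1', 'c', 'x'] into {'a': {'b': 1}, 'c': 'x'}."""
    if len(args) % 2 != 0:
        raise ValueError("overrides must come in key/value pairs")
    config = {}
    for key, raw in zip(args[0::2], args[1::2]):
        # Interpret numbers and booleans; leave anything else as a string.
        try:
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw
        # Walk (and create) the nested dicts named by the dotted key.
        node = config
        *parents, leaf = key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


if __name__ == "__main__":
    overrides = (
        "gnn.model.name RevGAT gnn.train.use_dgl True "
        "gnn.train.lr 0.002 gnn.train.dropout 0.75"
    ).split()
    print(parse_overrides(overrides))
```

Under this reading, the RevGAT command sets one value under `gnn.model` and three under `gnn.train`, leaving every other option at its default.
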
# Our enriched features
python -m core.trainEnsemble gnn.train.feature_type TA_P_E
# Our individual features
python -m core.trainGNN gnn.train.feature_type TA
python -m core.trainGNN gnn.train.feature_type E
python -m core.trainGNN gnn.train.feature_type P
# OGB features
python -m core.trainGNN gnn.train.feature_type ogb
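
The feature-type commands above can be run as a small sweep. The sketch below only assembles the same command lines shown above and, by default, prints them instead of executing them. The function names and the `dry_run` parameter are hypothetical.

```python
import subprocess
import sys

# Feature types taken from the core.trainGNN commands above.
FEATURE_TYPES = ["TA", "E", "P", "ogb"]


def gnn_command(feature_type, module="core.trainGNN"):
    """Return the argv list for one run with the given feature type."""
    return [sys.executable, "-m", module, "gnn.train.feature_type", feature_type]


def run_sweep(dry_run=True):
    """Print or run one job per individual feature type, then the ensemble."""
    commands = [gnn_command(ft) for ft in FEATURE_TYPES]
    commands.append(gnn_command("TA_P_E", module="core.trainEnsemble"))
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sweep if a run exits with an error.
            subprocess.run(cmd, check=True)
    return commands
```

Call `run_sweep(dry_run=False)` from the repository root to execute the jobs one after another.
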
Use `run.sh` to run the code and reproduce the published results.