CogKR

Cognitive Knowledge Graph Reasoning for One-shot Relational Learning

Zhengxiao Du, Chang Zhou, Ming Ding, Hongxia Yang, Jie Tang

Under construction.

Prerequisites

  • Python 3
  • PyTorch >= 1.1.0
  • NVIDIA GPU + CUDA + cuDNN
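To quickly verify the environment, you can run a minimal check with standard PyTorch calls (a sanity-check sketch, not part of this repo):

import sys
import torch

# Python should be 3.x and PyTorch >= 1.1.0
print("Python:", sys.version.split()[0])
print("PyTorch:", torch.__version__)
# should print True on a machine with a working GPU + CUDA + cuDNN setup
print("CUDA available:", torch.cuda.is_available())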

Getting Started

Installation

Clone this repo

git clone https://github.com/THUDM/CogKR
cd CogKR

Then install the dependencies with

pip install -r requirements.txt

Dataset

Two public datasets, NELL-One and Wiki-One (slightly modified), are used in our experiments. The original datasets can be downloaded from here.

You can download the preprocessed datasets from the link in OneDrive. If you're in a region where OneDrive is not available (e.g. Mainland China), try the link in Tsinghua Cloud.

After downloading the dataset, please unzip it into the datasets folder.

To use your own dataset, see the "Use Your Own Dataset" section below.

Training

For training, simply run

python src/main.py --directory {dataset_path} --gpu {gpu_id} --config {config_file} --load_embed DistMult --comment {experiment_name}

Use dataset_path to specify the path to the dataset, which in our experiments should be datasets/NELL or datasets/Wiki.

Use gpu_id to specify the id of the GPU to use.

config_file specifies the configuration file for experimental settings and hyperparameters. The configurations for the two datasets in the paper are stored under the configs/ folder: config-nell.json and config-wiki.json train the complete model, while config-nell-onlyr.json and config-wiki-onlyr.json train the CogKR-onlyR model for the ablation study.

experiment_name specifies the name of the experiment.

If you run into out-of-memory errors when running experiments on Wiki-One, try running the code with --sparse_embed to use sparse gradients for the embedding layer.

Evaluation

For evaluation, simply run

python src/main.py --inference --directory {dataset_path} --gpu {gpu_id} --config {config_file} --load_embed DistMult --load_state {state_file}

Use state_file to specify the model checkpoint saved during training.

Use Your Own Dataset

To use your own dataset, please put the dataset files under datasets/ with the following structure:

-{dataset_name}
  -data
    -train.txt
    -valid_support.txt
    -valid_eval.txt
    -test_support.txt
    -test_eval.txt
    -ent2id.txt (optional)
    -relation2id.txt (optional)
    -entity2vec.{embed_name}
    -relation2vec.{embed_name}
    -rel2candidates.json (optional)
train.txt contains the facts of the training relations. valid_support.txt and valid_eval.txt contain the support facts and evaluation facts of the validation relations, and test_support.txt and test_eval.txt contain the support facts and evaluation facts of the test relations. Each line is in the format {head}\t{relation}\t{tail}\n.
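For reference, here is a minimal sketch of parsing such a file (load_triples is an illustrative helper, not part of this repo):

def load_triples(path):
    # each line holds one fact: {head}\t{relation}\t{tail}\n
    triples = []
    with open(path) as f:
        for line in f:
            head, relation, tail = line.rstrip("\n").split("\t")
            triples.append((head, relation, tail))
    return triples

# e.g. the facts of the training relations
train_triples = load_triples("datasets/NELL/data/train.txt")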

ent2id.txt, relation2id.txt, entity2vec.{embed_name} and relation2vec.{embed_name} are used for pretrained KG embeddings. Using pretrained embeddings is not required but highly recommended. Each line of ent2id.txt or relation2id.txt is an entity/relation name whose id is the line number (starting from 0). Each line of entity2vec.{embed_name} or relation2vec.{embed_name} is the vector of the entity/relation whose id is the line number.
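To illustrate the line-number-as-id convention, here is a hedged sketch using NumPy, assuming each vector line holds whitespace-separated floats (the actual loading code in src/ may differ):

import numpy as np

def load_ids(path):
    # the id of each entity/relation name is its 0-based line number
    with open(path) as f:
        return {line.strip(): idx for idx, line in enumerate(f)}

ent2id = load_ids("datasets/NELL/data/ent2id.txt")
# row i is the pretrained vector of the entity with id i
entity_vecs = np.loadtxt("datasets/NELL/data/entity2vec.DistMult")
assert len(entity_vecs) == len(ent2id)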

rel2candidates.json lists the candidate entities of the test and validation relations.
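A sketch of reading it, assuming the format of the original NELL-One/Wiki-One release (a JSON object mapping each relation name to a list of candidate entity names):

import json

with open("datasets/NELL/data/rel2candidates.json") as f:
    rel2candidates = json.load(f)

# assumed structure: {relation_name: [candidate_entity_name, ...]}
for relation, candidates in rel2candidates.items():
    print(relation, "has", len(candidates), "candidate entities")
    break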

First, preprocess the data

python src/main.py --directory datasets/{dataset_name} --process_data

Then you can train the model according to the "Training" part.

Our preprocessed datasets also contain two files, evaluate_graphs and fact_dist. fact_dist is used to skip some facts in the training set. To generate the file, please run

python src/main.py --directory datasets/{dataset_name} --config {config_file} --get_fact_dist

evaluate_graphs is generated from rel2candidates.json and is used to limit the expansion of the cognitive graph to the candidate entities provided by rel2candidates.json.

To generate evaluate_graphs, please run

python src/main.py --directory datasets/{dataset} --search_evaluate_graph

Or on the Wiki-One dataset only:

python src/main.py --directory datasets/Wiki --search_evaluate_graph --wiki

Cite

Please cite our paper if you use the code or datasets in your own work:

@article{du2019cogkr,
  author    = {Zhengxiao Du and
               Chang Zhou and
               Ming Ding and
               Hongxia Yang and
               Jie Tang},
  title     = {Cognitive Knowledge Graph Reasoning for One-shot Relational Learning},
  journal   = {CoRR},
  volume    = {abs/1906.05489},
  year      = {2019}
}
