MCLEA

This repository contains the code and datasets for the paper Multi-modal Contrastive Representation Learning for Entity Alignment [arxiv] [acl], published in the Proceedings of COLING 2022 (oral).

Dataset

Bilingual datasets

The multi-modal version of the DBP15K dataset comes from the EVA repository. Download the pkls folder of DBP15K image features by following the instructions in the EVA repository, and place it in the data directory of this repository.

We use the glove-6B word embeddings. Download them from glove and unzip them into the data/embedding directory.
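Before training, it can help to confirm that the two directories described above are in place. The following is a minimal sketch, not part of the repository: the function name check_data_layout is hypothetical, and it only checks that data/pkls and data/embedding exist, since the exact file names inside them are not specified here.

```shell
# Hypothetical helper (not shipped with MCLEA): verify that the directories
# named in the instructions above exist under the data directory.
# Usage: check_data_layout [DATA_DIR]   (DATA_DIR defaults to ./data)
check_data_layout() {
  data_dir="${1:-data}"
  missing=0
  for sub in pkls embedding; do
    if [ -d "$data_dir/$sub" ]; then
      echo "found:   $data_dir/$sub"
    else
      echo "missing: $data_dir/$sub"
      missing=1
    fi
  done
  # Return non-zero if any expected directory is absent.
  return "$missing"
}
```

Run check_data_layout data from the repository root; a non-zero exit status means at least one of the two directories still needs to be downloaded.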

Cross-KG datasets

The original cross-KG datasets (FB15K-DB15K and FB15K-YAGO15K) come from MMKB, in which the image embeddings are extracted with a pre-trained VGG16. We use the image embeddings provided by MMKB and convert the data into a format consistent with DBP15K. The converted datasets can be downloaded from BaiduDisk (the password is stdt) and should be placed in the data directory.

Training MCLEA

Bilingual datasets

Here is an example of training MCLEA on DBP15K.

bash run_dbp15k.sh 0 42 zh_en
bash run_dbp15k.sh 0 42 ja_en
bash run_dbp15k.sh 0 42 fr_en
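The three commands above differ only in the language pair, so they can be wrapped in a loop. This is a sketch under assumptions: the function name run_dbp15k_all and the DRY_RUN variable are hypothetical conveniences, and the first two arguments (0 and 42) are passed through unchanged because their meaning is not documented here.

```shell
# Hypothetical wrapper: one run_dbp15k.sh invocation per DBP15K language pair.
# By default it only prints the commands; set DRY_RUN=0 to execute them.
run_dbp15k_all() {
  for pair in zh_en ja_en fr_en; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "bash run_dbp15k.sh 0 42 $pair"
    else
      # Stop at the first failing run instead of continuing silently.
      bash run_dbp15k.sh 0 42 "$pair" || return 1
    fi
  done
}
```

Calling run_dbp15k_all prints the three commands for inspection; DRY_RUN=0 run_dbp15k_all launches them one after another from the repository root.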

Cross-KG datasets

Here is an example of training MCLEA on FB15K_DB15K with different seed ratios. Similarly, you can replace the parameter FB15K_DB15K with FB15K_YAGO15K to train on the FB15K-YAGO15K dataset.

bash run_mmkb.sh 0 42 FB15K_DB15K 0.2
bash run_mmkb.sh 0 42 FB15K_DB15K 0.5
bash run_mmkb.sh 0 42 FB15K_DB15K 0.8
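To cover both cross-KG datasets and all three ratios in one go, the commands can be generated with a nested loop. As a sketch only: run_mmkb_all and DRY_RUN are hypothetical names introduced for illustration, and the leading arguments 0 and 42 are copied verbatim from the commands above.

```shell
# Hypothetical wrapper: run_mmkb.sh for each dataset and each ratio value.
# By default it only prints the commands; set DRY_RUN=0 to execute them.
run_mmkb_all() {
  for dataset in FB15K_DB15K FB15K_YAGO15K; do
    for ratio in 0.2 0.5 0.8; do
      if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "bash run_mmkb.sh 0 42 $dataset $ratio"
      else
        # Abort the sweep if any single run fails.
        bash run_mmkb.sh 0 42 "$dataset" "$ratio" || return 1
      fi
    done
  done
}
```

The dry run lists six commands (two datasets, three ratios each), which makes it easy to check the sweep before committing to the full training time.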

Citation

If you use this model or code, please cite it as follows:

@inproceedings{lin2022multi, 
  title = {Multi-modal Contrastive Representation Learning for Entity Alignment},
  author = {Lin, Zhenxi and Zhang, Ziheng and Wang, Meng and Shi, Yinghui and Wu, Xian and Zheng, Yefeng}, 
  booktitle = {Proceedings of the 29th International Conference on Computational Linguistics},
  url = {https://aclanthology.org/2022.coling-1.227},
  year = {2022},
  pages = {2572--2584},
}

Acknowledgement

Our code is adapted from EVA, and we thank the authors for open-sourcing their work.
