Explainable Integration of Knowledge Graphs using Large Language Models
It is continuous work from the NMV-LS system for translating complex link specifications into natural language that leverages a large language model.
Environment and Dependencies
Ubuntu 10.04.2 LTS
python 3.6+
torch 1.7.0
Datasets for the standard Encoder-Decoder architectures (in data folder)
There are three different datasets with following size:
- 107k of pairs (English)
- 1m of pairs (English)
- 73k of pairs (German)
We provide splitted dataset of each dataset in the data folder. Unzip all of zip files, which are each of dataset consists of train, dev, and test sets.
Datasets for few-shot learning scenarios (in datasets folder)
There are four different datasets as follows:
- LIMES silver
- Human annotated LIMES silver
- Human annotated LIMES LS
- Human annotated SILK LS
Installation
Download NVM-LS repository:
https://github.com/u2018/NMV-LS-T5.git
Install dependencies:
pip install -r requirements.txt
Usage
Standard encoder-decoder architecture
Configuration
mode: all # Mode all denotes that you will execute the code on training and testing consecutively. You can choose train or test mode if you want to run separately.
max_length: 107 # The max length of sentence is 107
directory: data/107K/ # Directory denotes where is the path of splitted dataset (train, dev, and test sets)
n_epoch: 100 # n epoch shows how many epochs to train the model
bidirectional: True # Bidirectional parameter is used for NMV-LS with Bi/LSTM model. If the value of bidirectional is true that shows BiLSTM model is used on training the model.
NMV-LS with GRU
To run NMV-LS with GRU model
$ python main.py --directory data/107K/ --max_length 107 --mode all --n_epoch 100
NMV-LS with Bi/LSTM
To run NMV-LS with BiLSTM model
$ python nmv-ls_bilstm.py --directory data/107K/ --max_length 107 --mode all --n_epoch 100 --bidirectional True
NMV-LS with Transformers
To run NMV-LS with Transformer model
$ python nmv-ls_transformer.py --directory data/107K/ --max_length 107 --mode all --n_epoch 30
Few-shot learning using T5 model
Run NMVLS_few_shot_learning_using_T5_model.ipynb on Google colab
How to Cite
@article{asep2023nmvls,
author = {Ahmed, Abdullah Fathi and Firmansyah, Asep Fajar and Sherif, Mohamed Ahmed and Mousallem, Diego and Ngomo, Axel-Cyrille Ngonga},
biburl = {https://www.bibsonomy.org/bibtex/24fd6060de085e75628f880e6a316a098/dice-research},
title = {Explainable Integration of Knowledge Graphs using Large Language Models},
url = {https://svn.dice-research.org/open/papers/2023/NLDB_NMVLS/public.pdf},
year = 2023
}
Contact
If you have any questions or feedbacks, feel free to contact us at asep.fajar.firmansyah@upb.de