Kformer

Code for our NLPCC 2022 paper "Kformer: Knowledge Injection in Transformer Feed-Forward Layers"

The project is based on Fairseq.

Requirements

To install requirements:

cd fairseq
./setup.sh

Download Model

mkdir models
cd models
wget https://dl.fbaipublicfiles.com/fairseq/models/roberta.base.tar.gz
tar -zxvf roberta.base.tar.gz
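
To sanity-check the download, you can load the extracted checkpoint with fairseq's standard RoBERTa hub interface (the snippet below assumes the directory layout created by the commands above, run from the repo root):

from fairseq.models.roberta import RobertaModel

# Load the checkpoint extracted from roberta.base.tar.gz
roberta = RobertaModel.from_pretrained('models/roberta.base', checkpoint_file='model.pt')
roberta.eval()  # disable dropout

tokens = roberta.encode('Hello world!')
print(roberta.decode(tokens))  # should round-trip back to the input string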

Data

You can download the data from ZJU Cloud and put it under ./data/. The data we provide pairs each question with knowledge retrieved using BM25.
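
If you want to build similar inputs from your own knowledge corpus, a minimal BM25 sketch using the third-party rank_bm25 package is shown below (an assumption for illustration only; it is not necessarily the retrieval code used to produce the released data):

# pip install rank-bm25
# Illustrative only: the corpus and query here are made up, and this is
# not necessarily the pipeline used to build the released data.
from rank_bm25 import BM25Okapi

corpus = [
    'A library is a quiet place where people read and borrow books.',
    'Social norms guide how people behave in public spaces.',
    'Aspirin is commonly used to relieve mild pain.',
]
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])

question = 'Why do people whisper in a library?'
# Retrieve the top-2 knowledge snippets to pair with the question
print(bm25.get_top_n(question.lower().split(), corpus, n=2))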

Run the experiments

Finetuning

Social IQA

Use the command below to finetune Kformer on Social IQA. You can change the layers to inject knowledge into via the --knowledge_layer argument, which takes two values a b denoting the half-open interval [a, b) of RoBERTa layers (see the sketch after the command). To change the amount of knowledge used for infusion, edit the corresponding line in the training script.

./fairseq/run_social.sh
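
For intuition, --knowledge_layer 9 12 means knowledge is injected into layers 9, 10, and 11 of the 12-layer RoBERTa encoder. The sketch below illustrates the half-open interval semantics only; the function and argument names are hypothetical, not the actual code in this repo:

# Hypothetical sketch of the [a, b) semantics of --knowledge_layer.
# The injection-related names are made up; the real logic lives inside
# the modified fairseq Transformer layers in this repo.
knowledge_layer = (9, 12)  # inject into layers 9, 10, 11

def forward_layers(layers, hidden, knowledge):
    a, b = knowledge_layer
    for i, layer in enumerate(layers):
        if a <= i < b:
            # fuse retrieved knowledge into this layer's feed-forward block
            hidden = layer(hidden, knowledge=knowledge)
        else:
            hidden = layer(hidden)
    return hidden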

MedQA

Use the command below to finetune Kformer on MedQA.

./fairseq/run_med.sh

Evaluation

Use the following command to evaluate the finetuned model. Set --knowledge_layer to the same value used during finetuning.

export ModelPath=/path/to/finetuned/checkpoint
export DataPath=/path/to/eval/data
python fairseq/test_social.py --model_path $ModelPath --knowledge_layer 9 12 --data_file $DataPath

Replace fairseq/test_social.py with fairseq/test_med.py to evaluate MedQA.

If you find this repo helpful...

Please give us a ⭐ and cite our paper as:

@article{Yao2022KformerKI,
  title={Kformer: Knowledge Injection in Transformer Feed-Forward Layers},
  author={Yunzhi Yao and Shaohan Huang and Li Dong and Furu Wei and Huajun Chen and Ningyu Zhang},
  journal={ArXiv},
  year={2022},
  volume={abs/2201.05742}
}

License: MIT
