Code for our COLING 2020 paper
The code runs with Python 3.6. All dependencies are listed in requirements.txt:
pip install -r requirements.txt
Download the QA dataset from CommonsenseQA and place it in a folder named 'dataset'.
Download Artifacts:
./scripts/download_artifacts.sh
Download the preprocessed ConceptNet from here and place it in the conceptnet folder.
Convert the QA dataset into a fake AMR dataset format (please check the script before you run it):
cd dataset
python preprocess.py
Then, place train.txt, dev.txt, and test.txt into the ./dataset/csqa/train, ./dataset/csqa/dev, and ./dataset/csqa/test folders, respectively.
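The placement step above can be scripted. The following is a minimal sketch, assuming that preprocess.py writes train.txt, dev.txt, and test.txt into a single directory; the helper name place_splits and its arguments are hypothetical and not part of this repository.

```python
import shutil
from pathlib import Path

# Split names come from this README; the helper itself is illustrative only.
SPLITS = ("train", "dev", "test")


def place_splits(src_dir, dst_root):
    """Copy <src_dir>/<split>.txt to <dst_root>/<split>/<split>.txt for each split."""
    src_dir, dst_root = Path(src_dir), Path(dst_root)
    placed = []
    for split in SPLITS:
        src = src_dir / f"{split}.txt"
        if not src.is_file():
            # Fail early rather than silently leaving a split folder empty.
            raise FileNotFoundError(f"missing preprocessed file: {src}")
        dst_dir = dst_root / split
        dst_dir.mkdir(parents=True, exist_ok=True)
        placed.append(Path(shutil.copy2(str(src), str(dst_dir / src.name))))
    return placed

# Example, assuming it is run from the project root:
#   place_splits("dataset", "dataset/csqa")
```

The files are copied rather than moved, so the preprocessing output stays available if a later step has to be repeated.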
Prepare train/dev/test data:
cd ..
./scripts/prepare_data.sh -v 2 -p [project_path]
We use Stanford CoreNLP (version 3.9.2) for tokenizing.
First, start a CoreNLP server.
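As a rough sketch of this step, the launch command from the Stanford CoreNLP server documentation can be wrapped in Python. The function names below are hypothetical, and the port is an assumption: 9000 is the CoreNLP default, but this README does not say which port the annotation script expects, so check the script and adjust accordingly.

```python
import subprocess


def corenlp_server_command(port=9000, memory="4g", timeout_ms=15000):
    """Build the CoreNLP server launch command as an argument list."""
    return [
        "java", f"-mx{memory}",
        "-cp", "*",  # java itself expands "*" to every jar in the working directory
        "edu.stanford.nlp.pipeline.StanfordCoreNLPServer",
        "-port", str(port),          # assumption: must match the port the annotator uses
        "-timeout", str(timeout_ms),
    ]


def start_corenlp_server(corenlp_dir, **kwargs):
    """Start the server in the background from the unpacked CoreNLP directory."""
    return subprocess.Popen(corenlp_server_command(**kwargs), cwd=corenlp_dir)

# Example, assuming CoreNLP 3.9.2 is unpacked in ./stanford-corenlp-full-2018-10-05:
#   server = start_corenlp_server("stanford-corenlp-full-2018-10-05")
#   ... run the annotation step ...
#   server.terminate()
```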
Then, annotate QA sentences:
./scripts/annotate_features.sh amr_data/amr_2.0/csqa
Data Preprocessing
./scripts/preprocess_2.0.sh
Then, generate AMR graphs for the QA dataset sentences using stog's pretrained model.
Before you run this command, please make sure you have modified the config file.
python -u -m stog.commands.predict \
--archive-file ckpt-amr-2.0 \
--weights-file ckpt-amr-2.0/best.th \
--input-file data/AMR/amr_2.0/train.txt.features.preproc \
--batch-size 2 \
--use-dataset-reader \
--cuda-device 0 \
--output-file amr_data/amr_2.0/train.pred.txt \
--silent \
--beam-size 5 \
--predictor STOG
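The command above covers the train split only. A minimal sketch for building the same command for each split follows; it assumes that the dev and test files use the same naming pattern as train, which this README does not confirm. The helper names are hypothetical, and all flags are copied from the command above.

```python
import shlex


def stog_predict_command(split, input_dir="data/AMR/amr_2.0",
                         output_dir="amr_data/amr_2.0", archive="ckpt-amr-2.0"):
    """Build the stog predict command for one split, using the flags shown above."""
    return [
        "python", "-u", "-m", "stog.commands.predict",
        "--archive-file", archive,
        "--weights-file", f"{archive}/best.th",
        # Assumption: dev/test inputs follow the train naming pattern.
        "--input-file", f"{input_dir}/{split}.txt.features.preproc",
        "--batch-size", "2",
        "--use-dataset-reader",
        "--cuda-device", "0",
        "--output-file", f"{output_dir}/{split}.pred.txt",
        "--silent",
        "--beam-size", "5",
        "--predictor", "STOG",
    ]


def format_command(args):
    """Render an argument list as a single shell-safe line for inspection."""
    return " ".join(shlex.quote(arg) for arg in args)

# Example: print the three commands before running them by hand.
#   for split in ("train", "dev", "test"):
#       print(format_command(stog_predict_command(split)))
```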
Then, prepare the vocab dataset:
sh ./scripts/prepare.sh
This will take some time, as we write out all the paths of the ACP graph. You will need enough disk space to save the data (the mnt folder would be a fine choice).
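Before starting the extraction, it may help to check the free space at the target location. The sketch below uses only the standard library. The required size is not stated in this README, so the threshold in the example is a placeholder, and the function names are hypothetical.

```python
import shutil


def free_space_gb(path):
    """Return the free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / 1024 ** 3


def has_enough_space(path, required_gb):
    """Check whether at least required_gb GiB are free at path."""
    return free_space_gb(path) >= required_gb

# Example with a placeholder threshold (the real requirement is not documented here):
#   if not has_enough_space("/mnt", 100):
#       print("Not enough free space: %.1f GiB available" % free_space_gb("/mnt"))
```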
cd prepare
python generate_batch.py 50 10 10 # train/dev/test
python generate_prepare.py generate_bash 50 10 10 AMR_CN_PRUNE
sh cmd_extract_train.sh
sh cmd_extract_dev.sh
python generate_prepare.py combine 50 10 10 train/dev/test
python divide_inhouse_data.py
Train our model
sh train.sh
Evaluate our model
sh evaluate.sh
We adopted some modules and code snippets from AllenNLP, sheng-z/stog, and jcyk/gtos. Thanks to these open-source projects!
For any questions, please email Jungwoo Lim (wjddn803@korea.ac.kr).