yzhangcs / ctc-copy

[EMNLP'23] Code for "Non-autoregressive Text Editing with Copy-aware Latent Alignments".

Home Page: https://aclanthology.org/2023.emnlp-main.437/

Non-autoregressive Text Editing with Copy-aware Latent Alignments

1Soochow University, Suzhou, China
2Tencent AI Lab

Citation

If you are interested in our work, please cite:

@inproceedings{zhang-etal-2023-ctc,
  title     = {Non-autoregressive Text Editing with Copy-aware Latent Alignments},
  author    = {Zhang, Yu  and
               Zhang, Yue  and
               Cui, Leyang  and
               Fu, Guohong},
  booktitle = {Proceedings of EMNLP},
  year      = {2023},
  address   = {Singapore}
}

Setup

The following packages should be installed:

Clone this repo recursively:

git clone https://github.com/yzhangcs/ctc-copy.git --recursive

You can follow this repo to obtain the 3-stage train/dev/test data for training an English GEC model. The multilingual datasets are available here.

Before running, preprocess each sentence pair into the format SRC:\t[src]\nTGT:\t[tgt]\n, where src and tgt are the source and target sentences, respectively. Consecutive sentence pairs are separated by a blank line. See data/clang8.toy for examples.
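The format described above can be sketched in a few lines of Python. This is a minimal illustration, not the preprocessing script used by the authors: the helper names `format_pairs` and `parse_pairs` and the toy sentences are hypothetical, and details such as whitespace stripping are assumptions that should be checked against data/clang8.toy.

```python
from typing import Iterable, List, Tuple


def format_pairs(pairs: Iterable[Tuple[str, str]]) -> str:
    """Render (src, tgt) pairs as SRC:/TGT: records separated by blank lines."""
    records = []
    for src, tgt in pairs:
        # Each record is two tab-separated lines, each ending in a newline.
        # Stripping surrounding whitespace is an assumption, not a documented rule.
        records.append(f"SRC:\t{src.strip()}\nTGT:\t{tgt.strip()}\n")
    # Joining with "\n" leaves exactly one blank line between records.
    return "\n".join(records)


def parse_pairs(text: str) -> List[Tuple[str, str]]:
    """Read the records back, mainly to sanity-check a converted file."""
    pairs = []
    src = None
    for line in text.splitlines():
        if line.startswith("SRC:\t"):
            src = line[len("SRC:\t"):]
        elif line.startswith("TGT:\t") and src is not None:
            pairs.append((src, line[len("TGT:\t"):]))
            src = None
    return pairs


if __name__ == "__main__":
    # Hypothetical toy data, not taken from any of the datasets.
    toy = [
        ("She go to school .", "She goes to school ."),
        ("I has a pen .", "I have a pen ."),
    ]
    text = format_pairs(toy)
    print(text, end="")
    # Round-trip check: parsing the output recovers the input pairs.
    assert parse_pairs(text) == toy
```

Writing the returned string to a file gives one record per sentence pair; running `parse_pairs` over the result is a cheap way to confirm that no pair was dropped or misaligned before training.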

Run

Run the following command to train a 3-stage English model:

bash train.sh

To make predictions and evaluate them:

bash pred.sh

Contact

If you have any questions, please feel free to email me.


Languages

Python 97.4%, Shell 2.6%