CTCResources

Resources for Chinese text correction (CTC). The list is maintained mainly by Baoxin Wang and Honghong Zhao from HFL (Joint Laboratory of HIT and iFLYTEK Research, 哈工大讯飞联合实验室).

Definition

Chinese Spelling Check (CSC)

Chinese spelling check (CSC) is a task to detect and correct spelling errors in Chinese text.
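
To make the detect-and-correct framing concrete, here is a minimal sketch that compares an erroneous sentence with its corrected version and lists the misspelled characters. It assumes substitution-only errors, so the two sentences have equal length; that assumption, the function name `csc_edits`, and the sentence pair are illustrative and do not come from any listed paper or dataset.

```python
from typing import List, Tuple


def csc_edits(source: str, corrected: str) -> List[Tuple[int, str, str]]:
    """Return (position, wrong_char, right_char) for each substituted character."""
    # Assumption of this sketch: errors are character substitutions only,
    # so the source and its correction must have the same length.
    if len(source) != len(corrected):
        raise ValueError("sketch assumes equal-length source and correction")
    return [
        (i, s, c)
        for i, (s, c) in enumerate(zip(source, corrected))
        if s != c
    ]


# Hypothetical sentence pair: 平 is misspelled for its homophone 苹.
source = "我喜欢吃平果"
corrected = "我喜欢吃苹果"
print(csc_edits(source, corrected))  # [(4, '平', '苹')]
```

The positions in the result correspond to the detection step, and the replacement characters to the correction step.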

Grammatical Error Correction (GEC)

Grammatical Error Correction (GEC) is the task of correcting different kinds of errors in text such as spelling, punctuation, grammatical, and word choice errors.
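
Unlike the spelling-only case, GEC corrections may insert, delete, or replace tokens. The sketch below extracts such edits from a sentence pair with Python's standard `difflib.SequenceMatcher`. The function name `gec_edits` and the example sentences are hypothetical, and this plain token diff only illustrates the idea; it is not the alignment method used by any paper or evaluation toolkit listed here.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def gec_edits(
    source_tokens: List[str], corrected_tokens: List[str]
) -> List[Tuple[str, List[str], List[str]]]:
    """Return (operation, source_span, corrected_span) for each non-matching span."""
    matcher = SequenceMatcher(a=source_tokens, b=corrected_tokens, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Skip spans that are identical in both sentences.
        if tag != "equal":
            edits.append((tag, source_tokens[i1:i2], corrected_tokens[j1:j2]))
    return edits


# Hypothetical sentence pair: one verb-form error and one missing full stop.
source = "She go to school yesterday".split()
corrected = "She went to school yesterday .".split()
print(gec_edits(source, corrected))
# [('replace', ['go'], ['went']), ('insert', [], ['.'])]
```

Each tuple names the operation (`replace`, `insert`, or `delete`) together with the affected tokens on both sides.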

Papers

CSC Papers

2021

PLOME: Pre-training with Misspelled Knowledge for Chinese Spelling Correction. ACL 2021.
Shulin Liu, Tao Yang, Tianchi Yue, Feng Zhang and Di Wang. [code].

PHMOSpell: Phonological and Morphological Knowledge Guided Chinese Spelling Check. ACL 2021.
Li Huang, Junjie Li, Weiwei Jiang, Zhiyu Zhang, Minchuan Chen, Shaojun Wang and Jing Xiao.

Exploration and Exploitation: Two Ways to Improve Chinese Spelling Correction Models. ACL 2021.
Chong Li, Cenyuan Zhang, Xiaoqing Zheng and Xuanjing Huang. [pdf], [code].

Dynamic Connected Networks for Chinese Spelling Check. Findings of ACL 2021.
Baoxin Wang, Wanxiang Che, Dayong Wu, Shijin Wang, Guoping Hu and Ting Liu.

Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking. Findings of ACL 2021.
Heng-Da Xu, Zhongli Li, Qingyu Zhou, Chao Li, Zizhen Wang, Yunbo Cao, Heyan Huang and Xian-Ling Mao. [pdf], [code].

Global Attention Decoder for Chinese Spelling Error Correction. Findings of ACL 2021.
Zhao Guo, Yuan Ni, Keqiang Wang, Wei Zhu and Guotong Xie.

Correcting Chinese Spelling Errors with Phonetic Pre-training. Findings of ACL 2021.
Ruiqing Zhang, Chao Pang, Chuanqiang Zhang, Shuohuan Wang, Zhongjun He, Yu Sun, Hua Wu and Haifeng Wang.

DCSpell: A Detector-Corrector Framework for Chinese Spelling Error Correction. SIGIR 2021.
Jing Li, Gaosheng Wu, Dafei Yin, Haozhao Wang, Yonggang Wang. [pdf].

2020

Chunk-based Chinese Spelling Check with Global Optimization. Findings of EMNLP 2020.
Zuyi Bao, Chen Li and Rui Wang. [pdf].

SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check. ACL 2020.
Xingyi Cheng, Weidi Xu, Kunlong Chen, Shaohua Jiang, Feng Wang, Taifeng Wang, Wei Chu and Yuan Qi. [pdf], [code].

Spelling Error Correction with Soft-Masked BERT. ACL 2020.
Shaohua Zhang, Haoran Huang, Jicong Liu and Hang Li. [pdf].

2019

FASPell: A Fast, Adaptable, Simple, Powerful Chinese Spell Checker Based On DAE-Decoder Paradigm. EMNLP 2019 Workshop W-NUT.
Yuzhong Hong, Xianguo Yu, Neng He, Nan Liu, Junhui Liu. [pdf], [code].

Confusionset-guided Pointer Networks for Chinese Spelling Check. ACL 2019.
Dingmin Wang, Yi Tay, Li Zhong. [pdf].

GEC Papers

2021

Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding. ACL 2021.
Xin Sun, Tao Ge, Furu Wei, Houfeng Wang.[pdf].[code].

Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction. ACL 2021.
Piji Li, Shuming Shi.[pdf].[code].

A Simple Recipe for Multilingual Grammatical Error Correction. ACL 2021 short.
Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn.[pdf].

Synthetic Data Generation for Grammatical Error Correction with Tagged Corruption Models. EACL-BEA 2021.
Felix Stahlberg, Shankar Kumar.[pdf].

Document-level grammatical error correction. EACL-BEA 2021.
Zheng Yuan, Christopher Bryant.[pdf].

Data Strategies for Low-Resource Grammatical Error Correction. EACL-BEA 2021.
Simon Flachs, Felix Stahlberg, Shankar Kumar.[pdf].

Assessing Grammatical Correctness in Language Learning. EACL-BEA 2021.
Anisia Katinskaia, Roman Yangarber.[pdf].

Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction. NAACL 2021.
Zhenghao Liu, Xiaoyuan Yi, Maosong Sun, Liner Yang, Tat-Seng Chua.[pdf].

Comparison of Grammatical Error Correction Using Back-Translation Models. NAACL 2021 workshop.
Aomi Koyama, Kengo Hotate, Masahiro Kaneko, Mamoru Komachi.[pdf].

2020

On the Robustness of Language Encoders against Grammatical Errors. ACL 2020.
Fan Yin, Quanyu Long, Tao Meng, Kai-Wei Chang.[pdf].

Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction. ACL 2020.
Masahiro Kaneko, Masato Mita, Shun Kiyono, Jun Suzuki, Kentaro Inui.[pdf].

Grammatical Error Correction Using Pseudo Learner Corpus Considering Learner’s Error Tendency. ACL 2020 workshop.
Yujin Takahashi, Satoru Katsumata, Mamoru Komachi.[pdf].

GECToR – Grammatical Error Correction: Tag, Not Rewrite. ACL-BEA 2020.
Kostiantyn Omelianchuk, Vitaliy Atrasevych, Artem Chernodub, Oleksandr Skurzhanskyi.[pdf].

A Comparative Study of Synthetic Data Generation Methods for Grammatical Error Correction. ACL-BEA 2020.
Max White, Alla Rozovskaya.[pdf].

Improving Grammatical Error Correction Models with Purpose-Built Adversarial Examples. EMNLP 2020.
Lihao Wang, Xiaoqing Zheng.[pdf].

Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction. EMNLP 2020.
Mengyun Chen, Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou.[pdf].

Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses. EMNLP 2020.
Simon Flachs, Ophélie Lacroix, Helen Yannakoudakis, Marek Rei, Anders Søgaard.[pdf].

Adversarial Grammatical Error Correction. Findings of EMNLP 2020.
Vipul Raheja, Dimitris Alikaniotis.[pdf].

A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction. Findings of EMNLP 2020.
Masato Mita, Shun Kiyono, Masahiro Kaneko, Jun Suzuki, Kentaro Inui.[pdf].

Improving Grammatical Error Correction with Machine Translation Pairs. Findings of EMNLP 2020.
Wangchunshu Zhou, Tao Ge, Chang Mu, Ke Xu, Furu Wei, Ming Zhou.[pdf].

Chinese Grammatical Correction Using BERT-based Pre-trained Model. AACL 2020.
Hongfei Wang, Michiki Kurosawa, Satoru Katsumata, Mamoru Komachi.[pdf].

Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model. AACL 2020.
Satoru Katsumata, Mamoru Komachi.[pdf].

Generating Diverse Corrections with Local Beam Search for Grammatical Error Correction. COLING 2020.
Kengo Hotate, Masahiro Kaneko, Mamoru Komachi.[pdf].

Heterogeneous Recycle Generation for Chinese Grammatical Error Correction. COLING 2020.
Charles Hinson, Hen-Hsen Huang, Hsin-Hsi Chen.[pdf].

Improving Grammatical Error Correction with Data Augmentation by Editing Latent Representation. COLING 2020.
Zhaohong Wan, Xiaojun Wan, Wenguang Wang.[pdf].

Cross-lingual Transfer Learning for Grammatical Error Correction. COLING 2020.
Ikumi Yamashita, Satoru Katsumata, Masahiro Kaneko, Aizhan Imankulova, Mamoru Komachi.[pdf].

2019

Cross-Sentence Grammatical Error Correction. ACL 2019.
Shamil Chollampatt, Weiqi Wang, Hwee Tou Ng.[pdf].

Automatic Grammatical Error Correction for Sequence-to-sequence Text Generation: An Empirical Study. ACL 2019.
Tao Ge, Xingxing Zhang, Furu Wei, Ming Zhou.[pdf].

Controlling Grammatical Error Correction Using Word Edit Rate. ACL 2019.
Kengo Hotate, Masahiro Kaneko, Satoru Katsumata, Mamoru Komachi.[pdf].

Context is Key: Grammatical Error Detection with Contextual Word Representations. ACL-BEA 2019.
Samuel Bell, Helen Yannakoudakis, Marek Rei.[pdf].

The Unreasonable Effectiveness of Transformer Language Models in Grammatical Error Correction. ACL-BEA 2019.
Dimitris Alikaniotis, Vipul Raheja.[pdf].

(Almost) Unsupervised Grammatical Error Correction using Synthetic Comparable Corpus. ACL-BEA 2019.
Satoru Katsumata, Mamoru Komachi.[pdf].

Learning to combine Grammatical Error Corrections. ACL-BEA 2019.
Yoav Kantor, Yoav Katz, Leshem Choshen, Edo Cohen-Karlik, Naftali Liberman, Assaf Toledo, Amir Menczel, Noam Slonim.[pdf].

Erroneous data generation for Grammatical Error Correction. ACL-BEA 2019.
Shuyao Xu, Jiehao Zhang, Jin Chen, Long Qin.[pdf].

The CUED’s Grammatical Error Correction Systems for BEA-2019. ACL-BEA 2019.
Felix Stahlberg, Bill Byrne.[pdf].

CUNI System for the Building Educational Applications 2019 Shared Task: Grammatical Error Correction. ACL-BEA 2019.
Jakub Náplava, Milan Straka.[pdf].

Noisy Channel for Low Resource Grammatical Error Correction. ACL-BEA 2019.
Simon Flachs, Ophélie Lacroix, Anders Søgaard.[pdf].

TMU Transformer System Using BERT for Re-ranking at BEA 2019 Grammatical Error Correction on Restricted Track. ACL-BEA 2019.
Masahiro Kaneko, Kengo Hotate, Satoru Katsumata, Mamoru Komachi.[pdf].

A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning. ACL-BEA 2019.
Yo Joong Choe, Jiyeon Ham, Kyubyong Park, Yeoil Yoon.[pdf].

Neural and FST-based approaches to grammatical error correction. ACL-BEA 2019.
Zheng Yuan, Felix Stahlberg, Marek Rei, Bill Byrne, Helen Yannakoudakis.[pdf].

Improving Precision of Grammatical Error Correction with a Cheat Sheet. ACL-BEA 2019.
Mengyang Qiu, Xuejiao Chen, Maggie Liu, Krishna Parvathala, Apurva Patil, Jungyeul Park.[pdf].

Multi-headed Architecture Based on BERT for Grammatical Errors Correction. ACL-BEA 2019.
Bohdan Didenko, Julia Shaptala.[pdf].

Neural Grammatical Error Correction Systems with Unsupervised Pre-training on Synthetic Data. ACL-BEA 2019.
Roman Grundkiewicz, Marcin Junczys-Dowmunt, Kenneth Heafield.[pdf].

The Unbearable Weight of Generating Artificial Errors for Grammatical Error Correction. ACL-BEA 2019.
Phu Mon Htut, Joel Tetreault.[pdf].

An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction. EMNLP 2019.
Shun Kiyono, Jun Suzuki, Masato Mita, Tomoya Mizumoto, Kentaro Inui.[pdf].

Encode, Tag, Realize: High-Precision Text Editing. EMNLP 2019.
Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn.[pdf].

Personalizing Grammatical Error Correction: Adaptation to Proficiency Level and L1. EMNLP 2019.
Maria Nadejde, Joel Tetreault.[pdf].

Grammatical Error Correction in Low-Resource Scenarios. EMNLP 2019.
Jakub Náplava, Milan Straka.[pdf].

Minimally-Augmented Grammatical Error Correction. EMNLP 2019.
Roman Grundkiewicz, Marcin Junczys-Dowmunt.[pdf].

Parallel Iterative Edit Models for Local Sequence Transduction. EMNLP 2019.
Abhijeet Awasthi, Sunita Sarawagi, Rasna Goyal, Sabyasachi Ghosh, Vihari Piratla.[pdf].

Learning to Copy for Automatic Post-Editing. EMNLP 2019.
Xuancheng Huang, Yang Liu, Huanbo Luan, Jingfang Xu, Maosong Sun.[pdf].

Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data. NAACL 2019.
Wei Zhao, Liang Wang, Kewei Shen, Ruoyu Jia, Jingming Liu.[pdf].

Cross-Corpora Evaluation and Analysis of Grammatical Error Correction Models — Is Single-Corpus Evaluation Enough?. NAACL 2019.
Masato Mita, Tomoya Mizumoto, Masahiro Kaneko, Ryo Nagata, Kentaro Inui.[pdf].

Corpora Generation for Grammatical Error Correction. NAACL 2019.
Jared Lichtarge, Chris Alberti, Shankar Kumar, Noam Shazeer, Niki Parmar, Simon Tong.[pdf].

Neural Grammatical Error Correction with Finite State Transducers. NAACL 2019.
Felix Stahlberg, Christopher Bryant, Bill Byrne.[pdf].

2018

Inherent Biases in Reference-based Evaluation for Grammatical Error Correction. ACL 2018.
Leshem Choshen, Omri Abend.[pdf].

Fluency Boost Learning and Inference for Neural Grammatical Error Correction. ACL 2018.
Tao Ge, Furu Wei, Ming Zhou.[pdf].

Automatic Metric Validation for Grammatical Error Correction. ACL 2018.
Leshem Choshen, Omri Abend.[pdf].

Overview of NLPTEA-2018 Share Task Chinese Grammatical Error Diagnosis. ACL 2018 NLPTEA.
Gaoqi Rao, Qi Gong, Baolin Zhang, Endong Xun.[pdf].

Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task. NAACL 2018.
Marcin Junczys-Dowmunt, Roman Grundkiewicz, Shubha Guha, Kenneth Heafield.[pdf].

Noising and Denoising Natural Language: Diverse Backtranslation for Grammar Correction. NAACL 2018.
Ziang Xie, Guillaume Genthial, Stanley Xie, Andrew Ng, Dan Jurafsky.[pdf].

Reference-less Measure of Faithfulness for Grammatical Error Correction. NAACL 2018 short.
Leshem Choshen, Omri Abend.[pdf].

Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation. NAACL 2018 short.
Roman Grundkiewicz, Marcin Junczys-Dowmunt.[pdf].

Language Model Based Grammatical Error Correction without Annotated Training Data. NAACL 2018 BEA.
Christopher Bryant, Ted Briscoe.[pdf].

A Multilayer Convolutional Encoder-Decoder Neural Network for Grammatical Error Correction. AAAI 2018.
Shamil Chollampatt, Hwee Tou Ng.[pdf].

Neural Quality Estimation of Grammatical Error Correction. EMNLP 2018.
Shamil Chollampatt, Hwee Tou Ng.[pdf].

Wronging a Right: Generating Better Errors to Improve Grammatical Error Detection. EMNLP 2018.
Sudhanshu Kasewa, Pontus Stenetorp, Sebastian Riedel.[pdf].

Using Wikipedia Edits in Low Resource Grammatical Error Correction. EMNLP 2018.
Adriane Boyd.[pdf].

Cool English: a Grammatical Error Correction System Based on Large Learner Corpora. COLING 2018.
Yu-Chun Lo, Jhih-Jie Chen, Chingyu Yang, Jason Chang.[pdf].

A Reassessment of Reference-Based Grammatical Error Correction Metrics. COLING 2018.
Shamil Chollampatt, Hwee Tou Ng.[pdf].

Earlier

A Nested Attention Neural Hybrid Model for Grammatical Error Correction. ACL 2017.
Jianshu Ji, Qinlong Wang, Kristina Toutanova, Yongen Gong, Steven Truong, Jianfeng Gao.[pdf].

Automatic Annotation and Evaluation of Error Types for Grammatical Error Correction. ACL 2017.
Christopher Bryant, Mariano Felice, Ted Briscoe.[pdf].

Neural Sequence-Labelling Models for Grammatical Error Correction. EMNLP 2017.
Helen Yannakoudakis, Marek Rei, Øistein E. Andersen, Zheng Yuan.[pdf].

JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction. EACL 2017.
Courtney Napoles, Keisuke Sakaguchi, Joel Tetreault.[pdf].

Grammatical Error Detection Using Error- and Grammaticality-Specific Word Embeddings. IJCNLP 2017.
Masahiro Kaneko, Yuya Sakaizawa, Mamoru Komachi.[pdf].

Reference-based Metrics can be Replaced with Reference-less Metrics in Evaluating Grammatical Error Correction Systems. IJCNLP 2017.
Hiroki Asano, Tomoya Mizumoto, Kentaro Inui.[pdf].

Grammatical Error Correction with Neural Reinforcement Learning. IJCNLP 2017.
Keisuke Sakaguchi, Matt Post, Benjamin Van Durme.[pdf].

Grammatical Error Correction: Machine Translation and Classifiers. ACL 2016.
Alla Rozovskaya, Dan Roth.[pdf].

Compositional Sequence Labeling Models for Error Detection in Learner Writing. ACL 2016.
Marek Rei, Helen Yannakoudakis.[pdf].

Grammatical error correction using neural machine translation. NAACL 2016 short.
Zheng Yuan, Ted Briscoe.[pdf].

Discriminative Reranking for Grammatical Error Correction with Statistical Machine Translation. NAACL 2016 short.
Tomoya Mizumoto, Yuji Matsumoto.[pdf].

Phrase-based Machine Translation is State-of-the-Art for Automatic Grammatical Error Correction. EMNLP 2016.
Marcin Junczys-Dowmunt, Roman Grundkiewicz.[pdf].

Adapting Grammatical Error Correction Based on the Native Language of Writers with Neural Network Joint Models. EMNLP 2016.
Shamil Chollampatt, Duc Tam Hoang, Hwee Tou Ng.[pdf].

There’s No Comparison: Reference-less Evaluation Metrics in Grammatical Error Correction. EMNLP 2016.
Courtney Napoles, Keisuke Sakaguchi, Joel Tetreault.[pdf].

Chinese Preposition Selection for Grammatical Error Diagnosis. COLING 2016.
Hen-Hsen Huang, Yen-Chi Shao, Hsin-Hsi Chen.[pdf].

Neural Network Translation Models for Grammatical Error Correction. IJCAI 2016.
S Chollampatt, K Taghipour, HT Ng.[pdf].

Exploiting N-Best Hypotheses to Improve an SMT Approach to Grammatical Error Correction. IJCAI 2016.
DT Hoang, S Chollampatt, HT Ng.[pdf].

How Far are We from Fully Automatic High Quality Grammatical Error Correction?. ACL 2015.
Christopher Bryant, Hwee Tou Ng.[pdf].

Ground Truth for Grammatical Error Correction Metrics. ACL 2015 short.
Courtney Napoles, Keisuke Sakaguchi, Matt Post, Joel Tetreault.[pdf].

Towards a standard evaluation method for grammatical error detection and correction. NAACL 2015.
Mariano Felice, Ted Briscoe.[pdf].

Human Evaluation of Grammatical Error Correction Systems. EMNLP 2015.
Roman Grundkiewicz, Marcin Junczys-Dowmunt, Edward Gillian.[pdf].

Ground Truth for Grammatical Error Correction Metrics. IJCNLP 2015.
Courtney Napoles, Keisuke Sakaguchi, Matt Post, Joel Tetreault.[pdf].

Go Climb a Dependency Tree and Correct the Grammatical Errors. EMNLP 2014.
Longkai Zhang, Houfeng Wang.[pdf].

System Combination for Grammatical Error Correction. EMNLP 2014.
Raymond Hendy Susanto, Peter Phandi, Hwee Tou Ng.[pdf].

Data Driven Grammatical Error Detection in Transcripts of Children’s Speech. EMNLP 2014.
Eric Morley, Anna Eva Hallin, Brian Roark.[pdf].

Generating artificial errors for grammatical error correction. EACL 2014.
Mariano Felice, Zheng Yuan.[pdf].

Correcting Grammatical Verb Errors. EACL 2014.
Alla Rozovskaya, Dan Roth, Vivek Srikumar.[pdf].

Detecting Learner Errors in the Choice of Content Words Using Compositional Distributional Semantics. COLING 2014.
Ekaterina Kochmar, Ted Briscoe.[pdf].

A Sentence Judgment System for Grammatical Error Detection. COLING 2014.
Lung-Hao Lee, Liang-Chih Yu, Kuei-Ching Lee, Yuen-Hsien Tseng, Li-Ping Chang, Hsin-Hsi Chen.[pdf].

Automated Grammatical Error Correction for Language Learners. COLING 2014.
Joel Tetreault, Claudia Leacock.[pdf].

Grammatical Error Correction Using Integer Linear Programming. ACL 2013.
Yuanbin Wu, Hwee Tou Ng.[pdf].

Joint Learning and Inference for Grammatical Error Correction. EMNLP 2013.
Alla Rozovskaya, Dan Roth.[pdf].

Automated Grammar Correction Using Hierarchical Phrase-Based Statistical Machine Translation. IJCNLP 2013.
Bibek Behera, Pushpak Bhattacharyya.[pdf].

Grammatical Error Correction Using Feature Selection and Confidence Tuning. IJCNLP 2013.
Yang Xiang, Yaoyun Zhang, Xiaolong Wang, Chongqiang Wei, Wen Zheng, Xiaoqiang Zhou, Yuxiu Hu, Yang Qin.[pdf].

A Meta Learning Approach to Grammatical Error Correction. ACL 2012.
Hongsuck Seo, Jonghoon Lee, Seokhwan Kim, Kyusong Lee, Sechun Kang, Gary Geunbae Lee.[pdf].

Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation. ACL 2012.
Kenji Imamura, Kuniko Saito, Kugatsu Sadamitsu, Hitoshi Nishikawa.[pdf].

Better Evaluation for Grammatical Error Correction. NAACL 2012 short.
Daniel Dahlmeier, Hwee Tou Ng.[pdf].

A Beam-Search Decoder for Grammatical Error Correction. EMNLP 2012.
Daniel Dahlmeier, Hwee Tou Ng.[pdf].

Problems in Evaluating Grammatical Error Detection Systems. COLING 2012.
Martin Chodorow, Markus Dickinson, Ross Israel, Joel Tetreault.[pdf].

The Effect of Learner Corpus Size in Grammatical Error Correction of ESL Writings. COLING 2012.
Tomoya Mizumoto, Yuta Hayashibe, Mamoru Komachi, Masaaki Nagata, Yuji Matsumoto.[pdf].

They Can Help: Using Crowdsourcing to Improve the Evaluation of Grammatical Error Detection Systems. ACL 2011.
Nitin Madnani, Martin Chodorow, Joel Tetreault, Alla Rozovskaya.[pdf].

Grammatical Error Correction with Alternating Structure Optimization. ACL 2011.
Daniel Dahlmeier, Hwee Tou Ng.[pdf].

Automated Whole Sentence Grammar Correction Using a Noisy Channel Model. ACL 2011.
Y. Albert Park, Roger Levy.[pdf].

Grammatical Error Detection for Corrective Feedback Provision in Oral Conversations. AAAI 2011.
Sungjin Lee, Hyungjong Noh, Kyusong Lee, Gary Geunbae Lee.[pdf].

Evaluating performance of grammatical error detection to maximize learning effect. COLING 2010.
Ryo Nagata, Kazuhide Nakatani.[pdf].

Datasets

| Dataset | Task | # Sents | Source | Language |
| --- | --- | --- | --- | --- |
| SIGHAN 2013 | CSC | 350 & 974 | SIGHAN | Zh |
| SIGHAN 2014 | CSC | 6,526 & 526 | SIGHAN | Zh |
| SIGHAN 2015 | CSC | 3,174 & 550 | SIGHAN | Zh |
| OCR dataset | CSC | 4,575 | FASPell (iqiyi) | Zh |
| HybridSet | CSC | 270K | - | Zh |
| NLPCC 2018 GEC | GEC | - | NLPCC | Zh |
| CGED | GED | - | HSK | Zh |
| CoNLL 2013 | GEC | 1,381 | CoNLL | En |
| CoNLL 2014 | GEC | 1,312 | CoNLL | En |
| JFLEG | GEC | 747 | JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction | En |
| NUCLE | GEC | 57k | Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English | En |
| Lang-8 | GEC | 1M+ | Lang-8 | En |
| Write&Improve+LOCNESS | GEC | 63,683 & 7,632 | - | En |
| MMC+PsyTAR | (medica) | 512 & 79 | - | En |
| birkbeck+holbrook-tagged+holbrook-missp+aspell+wikipedia | (Misspelling word) | 36133/6136 & 1791/1200 & 531/450 & 2455/1922 | BBK | En |
| TOEFL-Spell | - | - | A Benchmark Corpus of English Misspellings and a Minimally-supervised Model for Spelling Correction | En |
| NUC-GEC | GEC | 500 essays | How Far are We from Fully Automatic High Quality Grammatical Error Correction? | En |
| BEA2019 | GEC | 34,308 | BEA | En |
| PIE-synthetic | GEC | 9,000,000 | Parallel iterative edit models for local sequence transduction | En |
| clang8 | GEC | 2,372,119 & 114,405 & 44,830 | - | En, GE, RU |

Systems & API

Feiying System: http://check.hfl-rc.com/
Feiying API: https://www.xfyun.cn/services/textCorrection

Other Resources

Related Articles


The resources above are for academic research only. If any of them infringes copyright, please contact us and we will remove it.

About

License: Apache License 2.0