raymondhs / lang-8-process

Lang-8 preprocessing scripts

Lang-8 Preprocessing

This repo contains preprocessing scripts for extracting an English correction corpus from the Lang-8 Learner Corpora (https://sites.google.com/site/naistlang8corpora/). The scripts require Python 3 and the following dependencies:

pip install joblib langid nltk tqdm
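
Before running the scripts, it can help to confirm that the environment matches the requirements above. The following is a minimal sketch, not part of this repo: the helper names `missing_dependencies` and `environment_report` are hypothetical, and the check only tests whether each package can be located on the import path, not whether a particular version is installed.

```python
import importlib.util
import sys

# Package names taken from the pip command above.
REQUIRED = ["joblib", "langid", "nltk", "tqdm"]


def missing_dependencies(names=REQUIRED):
    """Return the names in `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def environment_report(names=REQUIRED):
    """Build a short human-readable summary of the environment (hypothetical helper)."""
    if sys.version_info < (3,):
        return "Python 3 is required"
    missing = missing_dependencies(names)
    if missing:
        return "Missing packages: " + ", ".join(missing)
    return "All dependencies found"
```

Calling `print(environment_report())` in the same interpreter that will run the scripts reports any package that still needs to be installed.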

Languages

Python 81.5%, Shell 18.5%