- Reference Paper: Spelling Error Correction with Soft-Masked BERT
- Dataset: This project uses 20 popular books from Project Gutenberg.
- Install Dependencies:
pip install -r requirements.txt
The length of each sentence is between 4 and 200, so the following limits are set:
- max_len = 32
- min_len = 2
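
As a rough illustration of how such limits might be applied, the sketch below keeps only the sentences whose length falls within the bounds. It is not taken from the project's scripts: the function name `filter_by_length`, the whitespace tokenization, and the inclusive bounds are all assumptions made for the example.

```python
from typing import Iterable, List

MAX_LEN = 32  # upper limit listed above
MIN_LEN = 2   # lower limit listed above


def filter_by_length(
    sentences: Iterable[str],
    min_len: int = MIN_LEN,
    max_len: int = MAX_LEN,
) -> List[str]:
    """Keep sentences whose word count lies in [min_len, max_len].

    Hypothetical helper: length is counted in whitespace-separated
    words and both bounds are inclusive, which is an assumption.
    """
    kept = []
    for sentence in sentences:
        n_words = len(sentence.split())
        if min_len <= n_words <= max_len:
            kept.append(sentence)
    return kept


if __name__ == "__main__":
    sample = ["Hello", "It was a dark night", "word " * 40]
    print(filter_by_length(sample))
```

If the project measures length in characters or subword tokens instead of words, only the `n_words` line would need to change.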
- Prepare Data:
python data_prepare.py
- Process Data:
python data_process.py
- Train Models:
python train.py
- Test Models:
python test.py