Mandarin to Taiwanese Southern Min Translation
Environment Setup
- Create the conda environment from env.yml, which also installs my personal fork of allennlp and allennlp-models:
conda env create -f env.yml
- You might need to run pip uninstall -y dataclasses due to this issue regarding installing AllenNLP with Python version >= 3.7.
Data Setup
- At the root directory of this repo, run:
bash preprocess.sh
bash download_data.sh
Training
bash train.sh training_config/default.jsonnet <model_dir>
Predict
bash predict.sh /path/to/your/model.tar.gz <input_file> <output_file>
- input_file: should contain one sentence per line.
- model.tar.gz: generated by allennlp at the end of training. A pretrained model can be downloaded by running bash download_model.sh.
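Because input_file must contain one sentence per line, raw text may need to be split before prediction. Below is a minimal sketch, assuming sentences end with one of 。！？!? or a line break. The helpers split_sentences and write_input_file are hypothetical and not part of this repo; adjust the punctuation set to match your data.

```python
import re
from pathlib import Path
from typing import List

# Assumption: a sentence is a run of characters that ends at one of these
# punctuation marks or at a line break. Adjust the set for your corpus.
_SENTENCE = re.compile(r"[^。！？!?\n]+[。！？!?]*")


def split_sentences(text: str) -> List[str]:
    """Split raw text into sentences, keeping the ending punctuation."""
    pieces = (match.strip() for match in _SENTENCE.findall(text))
    return [piece for piece in pieces if piece]


def write_input_file(text: str, path: str) -> int:
    """Write one sentence per line to `path`; return the sentence count."""
    sentences = split_sentences(text)
    Path(path).write_text(
        "".join(sentence + "\n" for sentence in sentences), encoding="utf-8"
    )
    return len(sentences)
```

For example, write_input_file(raw_text, "input.txt") produces a file that can be passed as <input_file> to predict.sh.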
Serving
First, install allennlp-server:
git clone https://github.com/allenai/allennlp-server.git
cd allennlp-server
pip install .
Then run allennlp serve:
allennlp serve --archive-path model.tar.gz --predictor seq2seq --field-name source
Example HTTP POST request:
curl --header "Content-Type: application/json" \
--request POST \
--data '{"source":"你好嗎"}' \
http://127.0.0.1:8000/predict
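The same request can be sent from Python using only the standard library. This is a sketch, assuming the server from the allennlp serve command above is listening on 127.0.0.1:8000. The names build_request and predict are illustrative helpers, not part of allennlp or this repo. The response is returned as parsed JSON without assuming its keys, since those depend on the predictor.

```python
import json
import urllib.request

# Assumption: the server started by `allennlp serve` is reachable here.
PREDICT_URL = "http://127.0.0.1:8000/predict"


def build_request(sentence: str, url: str = PREDICT_URL) -> urllib.request.Request:
    """Build a JSON POST request carrying the sentence in the `source` field."""
    payload = json.dumps({"source": sentence}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(sentence: str, url: str = PREDICT_URL, timeout: float = 30.0) -> dict:
    """Send the sentence to the server and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(sentence, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, print(predict("你好嗎")) shows the full response, which you can inspect to find the field holding the translation.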