Chung-I / mandarin_to_tsm

Mandarin to Taiwanese Southern Min Translation

Environment Setup

  • Create the conda environment from env.yml, which also installs my personal forks of allennlp and allennlp-models:
    • conda env create -f env.yml
  • You might need to run pip uninstall -y dataclasses because of a known issue when installing AllenNLP with Python >= 3.7.

Data Setup

  • From the root directory of this repo, run:
bash preprocess.sh
bash download_data.sh

Training

bash train.sh training_config/default.jsonnet <model_dir>

Predict

bash predict.sh /path/to/your/model.tar.gz <input_file> <output_file>
  • input_file: a text file with one sentence per line.
  • model.tar.gz: the archive generated by allennlp at the end of training. A pretrained model can be downloaded by running bash download_model.sh.
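
Because predict.sh expects one sentence per line, it can help to normalize the input before running it. The following is a minimal sketch, not part of this repo: the helper name write_input_file and the file name input.txt are hypothetical, and dropping internal line breaks without inserting a space is an assumption that suits Mandarin text.

```python
from pathlib import Path


def write_input_file(sentences, path):
    """Write sentences to `path`, one per line, and return how many were written.

    Hypothetical helper: blank entries are skipped, and line breaks inside a
    sentence are removed so that every sentence occupies exactly one line.
    """
    cleaned = []
    for sentence in sentences:
        # Remove embedded line breaks, then trim surrounding whitespace.
        text = sentence.replace("\r", "").replace("\n", "").strip()
        if text:
            cleaned.append(text)
    # Terminate every line with a newline; an empty list yields an empty file.
    Path(path).write_text("".join(line + "\n" for line in cleaned), encoding="utf-8")
    return len(cleaned)
```

For example, after calling write_input_file(["你好嗎", "今天天氣很好"], "input.txt"), the resulting file can be passed as <input_file> to predict.sh.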

Serving

First, install allennlp-server:

git clone https://github.com/allenai/allennlp-server.git
cd allennlp-server
pip install .

Then run allennlp serve:

allennlp serve --archive-path model.tar.gz --predictor seq2seq --field-name source

Example HTTP POST request:

curl --header "Content-Type: application/json" \
  --request POST \
  --data '{"source":"你好嗎"}' \
  http://127.0.0.1:8000/predict
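
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl call above: the function names build_predict_request and translate are hypothetical, the URL and the "source" field are taken from the commands in this section, and the shape of the JSON response is not assumed.

```python
import json
import urllib.request

# Same endpoint as the curl example; adjust it if the server runs elsewhere.
DEFAULT_URL = "http://127.0.0.1:8000/predict"


def build_predict_request(sentence, url=DEFAULT_URL):
    """Build a JSON POST request whose body is {"source": sentence}."""
    payload = json.dumps({"source": sentence}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(sentence, url=DEFAULT_URL, timeout=30):
    """Send the sentence to a running server and return the parsed JSON reply."""
    request = build_predict_request(sentence, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from the previous step running, translate("你好嗎") returns the parsed JSON response as a Python object. Its exact fields depend on the predictor, so inspect the result before relying on a particular key.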
