How to support multilingual models?
tripbnb66 opened this issue · comments
tripbnb66 commented
Currently, this model supports Chinese only.
Is it possible to support other languages?
If so, can anyone tell me how to do it?
(If you can provide detailed steps for using a multilingual pre-trained model, I would much appreciate it.)
Thank you. Best regards.
tripbnb66 commented
I found the solution and am sharing it for anyone who has the same problem.
- Download the pre-trained model files from the BERT repository: https://github.com/google-research/bert
- Unzip the archive
- Install pytorch-pretrained-bert with pip or pip3
- Convert the TensorFlow checkpoint to PyTorch format with the `pytorch_pretrained_bert convert_tf_checkpoint_to_pytorch ...` command
A complete example is listed below:
- wget "https://storage.googleapis.com/bert_models/2018_11_23/multi_cased_L-12_H-768_A-12.zip"
- unzip multi_cased_L-12_H-768_A-12.zip
- pip3 install pytorch-pretrained-bert
- export BERT_BASE_DIR=/home/david/bot/cron/bert/multi_cased_L-12_H-768_A-12
- pytorch_pretrained_bert convert_tf_checkpoint_to_pytorch $BERT_BASE_DIR/bert_model.ckpt $BERT_BASE_DIR/bert_config.json $BERT_BASE_DIR/pytorch_model.bin
- cp -f $BERT_BASE_DIR/pytorch_model.bin bert_pytorch/pybert/pretrain/bert/base-uncased/pytorch_model.bin
- cp -f $BERT_BASE_DIR/vocab.txt bert_pytorch/pybert/pretrain/bert/base-uncased/bert_vocab.txt
- cp -f $BERT_BASE_DIR/bert_config.json bert_pytorch/pybert/pretrain/bert/base-uncased/config.json
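After copying the files, a quick sanity check can catch a missed or misnamed file before training. This is a minimal sketch; `check_bert_dir` is a hypothetical helper written for this comment, not part of pytorch-pretrained-bert, and it assumes the directory layout used in the `cp` commands above (with `bert_config.json` still present, as in `$BERT_BASE_DIR`):

```python
import json
import os

def check_bert_dir(path):
    """Verify a converted BERT directory has the expected files and
    that the config parses; return the key model dimensions."""
    required = ["pytorch_model.bin", "bert_config.json", "vocab.txt"]
    missing = [f for f in required if not os.path.exists(os.path.join(path, f))]
    if missing:
        raise FileNotFoundError(f"missing files in {path}: {missing}")
    with open(os.path.join(path, "bert_config.json")) as fh:
        config = json.load(fh)
    # For BERT-base checkpoints (including the multilingual one),
    # hidden_size should be 768 and num_hidden_layers should be 12.
    return {k: config[k] for k in ("vocab_size", "hidden_size", "num_hidden_layers")}

if __name__ == "__main__":
    print(check_bert_dir(os.environ["BERT_BASE_DIR"]))
```

If this raises `FileNotFoundError` or a `KeyError`, re-check the conversion step and the copy destinations before pointing the training code at the directory.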