n-waves / multifit

Code to reproduce the results from the paper "MultiFiT: Efficient Multi-lingual Language Model Fine-tuning" https://arxiv.org/abs/1909.04761

Test Bert Multilingual on MLDoc

eisenjulian opened this issue

Results were added to the results markdown file.

The data needs to be in TSV format. The following commands run the test:

# Assumes MLDOC_LANG, GCP_BUCKET, MODEL_DIR and COLAB_TPU_ADDR are already set in the environment.
git clone https://github.com/eisenjulian/bert.git
python bert/run_classifier_with_tfhub.py --task_name=MLDOC \
    --data_dir=data/mldoc/$MLDOC_LANG \
    --bert_hub_module_handle=https://tfhub.dev/google/bert_multi_cased_L-12_H-768_A-12/1 \
    --output_dir=gs://$GCP_BUCKET/$MODEL_DIR/ \
    --use_tpu=True \
    --tpu_name=grpc://$COLAB_TPU_ADDR \
    --max_seq_length=128 \
    --train_batch_size=32 \
    --learning_rate=5e-5 \
    --num_train_epochs=10.0 \
    --do_train=True \
    --do_eval=True
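
Regarding the TSV requirement mentioned above: if the MLDoc splits are stored as comma-separated files, something like the following could convert them. This is only a rough sketch. The function name `csv_to_tsv` is made up for illustration, and it assumes the column order in the source file already matches what the MLDOC task in the fork expects, which is not confirmed here. Tabs and newlines inside a field are collapsed to single spaces so that each record stays on one line.

```python
import csv
import io


def csv_to_tsv(src, dst):
    """Copy rows from a comma-separated stream to a tab-separated one.

    Hypothetical helper: column order is passed through unchanged.
    Returns the number of rows written.
    """
    rows = 0
    for row in csv.reader(src):
        # Collapse tabs, newlines and repeated spaces inside each field,
        # so that a plain tab-split reader sees exactly one record per line.
        fields = [" ".join(field.split()) for field in row]
        dst.write("\t".join(fields) + "\n")
        rows += 1
    return rows


if __name__ == "__main__":
    # Tiny in-memory demo; the sample row is invented, not real MLDoc data.
    sample = io.StringIO('label,text\nC,"Hello, ""world""\tfoo"\n')
    out = io.StringIO()
    csv_to_tsv(sample, out)
    print(out.getvalue(), end="")
```

With real files, the same function can be called on two handles opened with `open(path, newline="", encoding="utf-8")`, writing one TSV per split into `data/mldoc/$MLDOC_LANG`. The file names the task looks for in that directory depend on the fork and are not shown in this issue.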