This is a work in progress.
The following table shows BETO's results on the Spanish version of each task/benchmark, with links to the hyperparameter settings used in each experiment. We compare BETO (cased and uncased) with the best Multilingual BERT result that we found in the literature (as of October 2019), highlighting the results whenever BETO outperforms Multilingual BERT. The table also shows some alternative methods for the same tasks (not necessarily BERT-based). References for all methods are included below.
Task | BETO-cased | BETO-uncased | Best Multilingual BERT | Other results |
---|---|---|---|---|
XNLI | ----- | 80.15 | 78.50 [2] | 80.80 [5], 77.80 [1], 73.15 [4] |
POS | 98.97 | 98.44 | 97.10 [2] | 98.91 [6], 96.71 [3] |
PAWS-X | 89.05 | 89.55 | 90.70 [8] | |
NER-C | 87.24 | 82.67 | 87.38 [2] | 87.18 [3] |
NER-W | ----- | ----- | 92.50 [7] | |
MLDoc | 95.27 | 95.25 | 95.70 [2] | 88.75 [4] |
DepPar | ----- | ----- | 92.3/86.5 [2] | |
MLQA | ----- | ----- | 64.3/46.6 [9] | 68.0/49.8 [10] |
XQuAD | ----- | ----- | 74.30 [11] | |
- [1] Original Multilingual BERT
- [2] Multilingual BERT on "Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT"
- [3] Multilingual BERT on "How Multilingual is Multilingual BERT?"
- [4] LASER
- [5] XLM (MLM+TLM)
- [6] UDPipe on "75 Languages, 1 Model: Parsing Universal Dependencies Universally"
- [7] Multilingual BERT on "Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation"
- [8] Multilingual BERT on "PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification"
- [9] Multilingual BERT on "MLQA: Evaluating Cross-lingual Extractive Question Answering"
- [10] XLM on "MLQA: Evaluating Cross-lingual Extractive Question Answering"
- [11] JointMulti Multilingual BERT on "On the Cross-lingual Transferability of Monolingual Representations"
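
The comparison described above (flagging a task whenever the best BETO variant beats the best Multilingual BERT result) can be sketched in a few lines of Python. This is only an illustrative sketch: the scores are copied from the table, rows with no BETO result yet are left out, and the names `RESULTS`, `best_beto`, and `beto_wins` are hypothetical helpers, not part of the BETO code base.

```python
from typing import Dict, List, Optional, Tuple

# (BETO-cased, BETO-uncased, best Multilingual BERT), copied from the table.
# None marks a BETO result that has not been reported yet.
Row = Tuple[Optional[float], Optional[float], float]

RESULTS: Dict[str, Row] = {
    "XNLI": (None, 80.15, 78.50),
    "POS": (98.97, 98.44, 97.10),
    "PAWS-X": (89.05, 89.55, 90.70),
    "NER-C": (87.24, 82.67, 87.38),
    "MLDoc": (95.27, 95.25, 95.70),
}


def best_beto(cased: Optional[float], uncased: Optional[float]) -> Optional[float]:
    """Return the higher of the available BETO scores, or None if neither exists."""
    scores = [s for s in (cased, uncased) if s is not None]
    return max(scores) if scores else None


def beto_wins(results: Dict[str, Row]) -> List[str]:
    """List the tasks where the best BETO variant strictly beats Multilingual BERT."""
    wins = []
    for task, (cased, uncased, mbert) in results.items():
        best = best_beto(cased, uncased)
        if best is not None and best > mbert:
            wins.append(task)
    return wins


if __name__ == "__main__":
    for task in beto_wins(RESULTS):
        cased, uncased, mbert = RESULTS[task]
        print(f"{task}: BETO {best_beto(cased, uncased):.2f} > mBERT {mbert:.2f}")
```

The sketch compares single-number scores only; tasks reported as score pairs (DepPar, MLQA) would need a per-metric comparison.
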

XNLI: 73.15 LASER; 80.8 XLM (MLM+TLM). Best: 80.15. Detailed: experiments_XNLI.txt

POS: Best: 98.44. Detailed: experiments_POS.txt

NER: Best: 81.7. Detailed: experiments_NER.txt

NER-W: 92.5 Multilingual BERT on "Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation"

MLDoc: 88.75 LASER

DepPar: 92.3/86.5 Multilingual BERT on "Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT"

PAWS-X: 89 and 90.7 Multilingual BERT on "PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification". Best: 89.55. Detailed: experiments_PAWSX.txt