kermitt2 / delft

a Deep Learning Framework for Text


Support for Pre-trained ELMo Representations for Many Languages

kermitt2 opened this issue

ELMo embeddings give very good results for NER, usually much better than a simple RNN and better than or comparable to base transformers (BERT, RoBERTa, etc.). They also use notably less memory than transformers and handle 3,000-token sequences well. Using them is fast for both training and labeling.

However, the ELMo embeddings currently available in TF format cover very few languages; we could try to support the ELMoForManyLangs format to extend language coverage.

https://github.com/HIT-SCIR/ELMoForManyLangs
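For reference, the ELMoForManyLangs library exposes a small Python API for producing contextual token vectors from one of its pre-trained models. A minimal sketch (the model path and example sentence are hypothetical, and the wiring into DeLFT's embedding classes is not shown):

```python
from elmoformanylangs import Embedder

# a pre-trained model downloaded from the ELMoForManyLangs repository,
# e.g. the French one, unpacked into this directory (hypothetical path)
embedder = Embedder('/path/to/elmoformanylangs/fr.model')

sentences = [['Le', 'Monde', 'est', 'un', 'journal', 'français', '.']]

# sents2elmo returns one numpy array per sentence of shape
# (sequence_length, 1024); output_layer=-1 averages the 3 BiLM layers
elmo_vectors = embedder.sents2elmo(sentences, output_layer=-1)
print(elmo_vectors[0].shape)  # (7, 1024)
```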

Replacing the embeddings from ELMo BILM-TF with those generated by ELMoForManyLangs, concatenated with the GloVe embeddings, shows no improvement over the simple BidLSTM-CRF with GloVe alone.
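For clarity, the input construction being compared is roughly the following: for each token, the static vector (GloVe or fastText) is concatenated with the contextual ELMo vector before being fed to the BidLSTM-CRF. A minimal sketch with placeholder arrays, not the actual DeLFT code:

```python
import numpy as np

glove_dim, elmo_dim = 300, 1024
tokens = ['John', 'lives', 'in', 'London']

# hypothetical lookups standing in for the real embedding table / BiLM
static_vectors = np.random.rand(len(tokens), glove_dim)     # GloVe / fastText lookup
contextual_vectors = np.random.rand(len(tokens), elmo_dim)  # ELMo (BILM-TF or ELMoForManyLangs)

# token-wise concatenation -> shape (sequence_length, 300 + 1024)
model_input = np.concatenate([static_vectors, contextual_vectors], axis=-1)
print(model_input.shape)  # (4, 1324)
```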

- English CoNLL-2003:

| architecture | embeddings | F1-score (10 folds) |
|---|---|---|
| BidLSTM_CRF | GloVe | 91.03 |
| BidLSTM_CRF_ELMo | GloVe + ELMo BILM-TF | 92.57 |
| BidLSTM_CRF_ELMo | GloVe + ELMoForManyLangs | 91.10 |

Note: to try to stabilize the result, training with the ELMoForManyLangs embeddings was run 3 times (this improved the F1-score from 90.87 to 91.10).

- French Le Monde corpus (FTB):

| architecture | embeddings | F1-score (10 folds) |
|---|---|---|
| BidLSTM_CRF | wikifr (fastText) | 89.45 |
| BidLSTM_CRF_ELMo | wikifr + FrELMo (BILM-TF) | 90.96 |
| BidLSTM_CRF_ELMo | wikifr + ELMoForManyLangs | 88.65 |

These embeddings do not appear to be effective and bring no added value, so closing the issue.