Support for Pre-trained ELMo Representations for Many Languages
kermitt2 opened this issue
ELMo embeddings give very good results for NER, usually much better than a simple RNN and better than or comparable to BERT/RoBERTa/etc. base transformers. They also use notably less memory than transformers and handle sequences of 3000 tokens well. Using them is fast for both training and labeling.
However, the set of ELMo embeddings currently available in TF format is very limited, so we could try to support the ELMoForManyLangs format to extend language coverage.
This is implemented in the following branch: https://github.com/kermitt2/delft/tree/elmoformanylangs
Using embeddings generated by ELMoForManyLangs instead of those from the ELMo BILM-TF implementation, concatenated with GloVe embeddings, shows no improvement over the simple BidLSTM-CRF with GloVe alone.
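For reference, here is a minimal sketch of what this concatenation looks like when using the `elmoformanylangs` package directly; the model directory, the GloVe path, and the helper names are placeholders for illustration, not the actual code in the branch above:

```python
import numpy as np
from elmoformanylangs import Embedder

# Placeholder paths: point to a downloaded ELMoForManyLangs model
# and to the GloVe file actually used (e.g. glove.840B.300d.txt).
ELMO_MODEL_DIR = "/path/to/elmoformanylangs/english"
GLOVE_PATH = "/path/to/glove.840B.300d.txt"
GLOVE_DIM = 300

def load_glove(path, dim):
    """Load GloVe vectors into a simple token -> vector dict."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) != dim + 1:
                continue
            vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return vectors

def concat_features(sentences, glove, embedder):
    """Return one (n_tokens, GLOVE_DIM + 1024) matrix per sentence:
    static GloVe vectors concatenated with contextual ELMo vectors."""
    # sents2elmo returns one (n_tokens, 1024) array per sentence
    # (average of the 3 ELMo layers by default).
    elmo_vectors = embedder.sents2elmo(sentences)
    features = []
    for tokens, elmo in zip(sentences, elmo_vectors):
        # lower-cased lookup and zero vector for OOV, for simplicity
        glove_vecs = np.stack([
            glove.get(t.lower(), np.zeros(GLOVE_DIM, dtype="float32"))
            for t in tokens
        ])
        features.append(np.concatenate([glove_vecs, elmo], axis=-1))
    return features

if __name__ == "__main__":
    embedder = Embedder(ELMO_MODEL_DIR)
    glove = load_glove(GLOVE_PATH, GLOVE_DIM)
    sents = [["John", "lives", "in", "New", "York", "."]]
    feats = concat_features(sents, glove, embedder)
    print(feats[0].shape)  # (6, 1324)
```

The resulting per-token feature matrices are what would feed the BidLSTM-CRF input layer in place of the GloVe-only features.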
- English CoNLL-2003:

architecture | embeddings | F1-score (10-fold) |
---|---|---|
BidLSTM_CRF | GloVe | 91.03 |
BidLSTM_CRF_ELMo | GloVe + ELMo BILM-TF | 92.57 |
BidLSTM_CRF_ELMo | GloVe + ELMoForManyLangs | 91.10 |
Note: as a warm-up and to try to stabilize the results, the ELMoForManyLangs embedding pass was run 3 times (this improved the F1-score from 90.87 to 91.10).
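It is not completely clear from the note whether the repeated passes are averaged or only used as a warm-up before keeping the last output; as an illustration of the averaging reading (reusing the `Embedder` from the sketch above), a small helper could look like this:

```python
import numpy as np

def stabilized_sents2elmo(embedder, sentences, n_passes=3):
    """Run the ELMoForManyLangs forward pass several times and average
    the per-token vectors across passes, to reduce run-to-run variance
    of the contextual embeddings.
    Note: this is one possible reading of the warm-up note above,
    not necessarily the exact procedure used in the experiment."""
    runs = [embedder.sents2elmo(sentences) for _ in range(n_passes)]
    # runs[k][i] is the (n_tokens, 1024) array for sentence i in pass k
    return [np.mean([run[i] for run in runs], axis=0)
            for i in range(len(sentences))]
```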
- French Le Monde corpus (FTB):

architecture | embeddings | F1-score (10-fold) |
---|---|---|
BidLSTM_CRF | wikifr (fastText) | 89.45 |
BidLSTM_CRF_ELMo | wikifr + FrELMo (BILM-TF) | 90.96 |
BidLSTM_CRF_ELMo | wikifr + ELMoForManyLangs | 88.65 |
These embeddings are not effective; they apparently add no value here, so I am closing the issue.