glample / tagger

Named Entity Recognition Tool

replicate experiments in "Neural Architectures for Named Entity Recognition"

boliangz opened this issue

Thank you for sharing the tagger with the community. I'm trying to replicate the training procedure from the paper. Using the provided English model, I get an F1 score of 93.86% on the CoNLL testa set and 90.43% on testb, which is almost identical to the performance reported in the paper. But I couldn't get the same results when training a model on my own. Following the provided model, I set the parameters as follows (a training-command sketch follows the list):

all_emb=True
cap_dim=0
char_bidirect=True
char_dim=25
char_lstm_dim=25
crf=True
dropout=0.5
lower=False
lr_method=sgd-lr_.005
pre_emb=glove.6B.100d.txt
tag_scheme=iobes
word_bidirect=True
word_dim=100
word_lstm_dim=100
zeros=True
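
For reference, this configuration should correspond to a train.py invocation along these lines (a sketch assuming the repo's long option names; the CoNLL file paths are placeholders for your local copies):

```
python train.py --train eng.train --dev eng.testa --test eng.testb \
    --tag_scheme iobes --zeros 1 --lower 0 --cap_dim 0 \
    --char_dim 25 --char_lstm_dim 25 --char_bidirect 1 \
    --word_dim 100 --word_lstm_dim 100 --word_bidirect 1 \
    --pre_emb glove.6B.100d.txt --all_emb 1 \
    --crf 1 --dropout 0.5 --lr_method sgd-lr_.005
```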

The number of training epochs is set to 100. The main difference from the provided model is the pre-trained embeddings, where I use the Stanford GloVe embeddings. With this setting, I get 91.45% F1 on testa and 88.23% on testb. Do you think the gap is due to the different pre-trained embeddings? Would you share your pre-trained embeddings if possible?
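
One quick way to test whether the embeddings explain the gap is to compare lexical coverage: how many word types in the training data actually have a pre-trained vector. A minimal sketch (the embedding and CoNLL file paths are placeholders; the digit-to-0 replacement mirrors the zeros=True preprocessing above):

```python
import re

def load_embedding_vocab(path):
    """Collect the word types listed in a whitespace-separated embedding file."""
    vocab = set()
    with open(path, encoding='utf-8') as f:
        for line in f:
            vocab.add(line.split(' ', 1)[0])
    return vocab

def conll_word_types(path, zeros=True):
    """Collect word types from a CoNLL file, replacing digits with 0 when zeros=True."""
    words = set()
    with open(path, encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('-DOCSTART-'):
                continue
            w = line.split()[0]
            words.add(re.sub(r'\d', '0', w) if zeros else w)
    return words

emb = load_embedding_vocab('glove.6B.100d.txt')  # embedding file: placeholder path
train_words = conll_word_types('eng.train')      # CoNLL train file: placeholder path
hits = sum(w in emb or w.lower() in emb for w in train_words)
print('embedding coverage: %.1f%% of %d word types'
      % (100.0 * hits / len(train_words), len(train_words)))
```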

Hi,

Yes, the pre-trained embeddings can make a significant difference. Here is a link to the embeddings we used for the four languages:
#44
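
(For anyone replicating this: pointing --pre_emb at the downloaded file, with --all_emb 1 so that all pre-trained words are added to the dictionary, should match the configuration listed above.)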

Thanks! I got the same results now.