timbmg / Sentence-VAE

PyTorch re-implementation of "Generating Sentences from a Continuous Space" by Bowman et al., 2015: https://arxiv.org/abs/1511.06349


word dropout or word embedding dropout?

mengxuehu opened this issue

According to the paper, "We do this by randomly replacing some fraction of the conditioned-on word tokens with the generic unknown word token unk", but it seems that this implementation is instead dropping out the word embeddings.
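For reference, here is a minimal sketch of the token-level word dropout the paper describes, applied to the decoder input ids before the embedding lookup. The names `unk_idx`, `pad_idx`, and `p` are assumptions for illustration, not this repo's API:

```python
import torch

def word_dropout(input_ids, unk_idx, p, pad_idx=0):
    # Sketch of the paper's word dropout: replace a fraction of the
    # conditioned-on tokens with <unk> *before* the embedding lookup.
    # `unk_idx`, `p`, and `pad_idx` are assumed names, not the repo's API.
    if p == 0:
        return input_ids
    mask = torch.rand_like(input_ids, dtype=torch.float) < p
    mask &= input_ids != pad_idx          # leave padding untouched
    dropped = input_ids.clone()
    dropped[mask] = unk_idx
    return dropped
```

The returned ids would then be fed to the embedding layer in place of the originals during training.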

According to this line, dropout is applied to the sequence of embeddings after they have been looked up in the embedding layer.

I'm not sure this is right: standard element-wise dropout randomly zeroes individual components of each token's embedding vector rather than dropping whole embedding vectors (post lookup). Instead, entire vectors along the sequence dimension should be dropped out.

Furthermore, I'm pretty sure the zero vector produced by dropout doesn't correspond to the unk token, since the embeddings are learned.
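To make the distinction concrete, a small sketch of my own (assuming a `[batch, seq_len, emb_dim]` embedding tensor) contrasting the two behaviours:

```python
import torch
import torch.nn as nn

emb = torch.randn(2, 5, 8)  # [batch, seq_len, emb_dim]

# Element-wise dropout (what nn.Dropout does here): zeroes individual
# scalars, so a token's embedding is only partially removed.
elementwise = nn.Dropout(p=0.3)(emb)

# Whole-vector dropout: one Bernoulli draw per token position, broadcast
# over the embedding dimension. Note the dropped positions become zero
# vectors, which is still not the learned <unk> embedding.
keep = (torch.rand(2, 5, 1) > 0.3).float()
whole_vector = emb * keep / (1 - 0.3)  # inverted-dropout scaling
```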

Right, this is different from the paper. Unfortunately, I don't have the time to fix this right now. But feel free to do a PR. If you do, please also keep the dropout version as an option.

What about the testing phase? We need to disable dropout there, right? With the current implementation, dropout is still active at test time.

This is done here:

```python
# Enable/Disable Dropout
if split == 'train':
    model.train()
else:
    model.eval()
```
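This works because `model.eval()` switches all `nn.Dropout` submodules to identity behaviour. A tiny standalone check:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(2, 4)

drop.train()    # training mode: elements zeroed, survivors scaled by 1/(1-p)
print(drop(x))  # mixture of 0.0 and 2.0

drop.eval()     # eval mode: dropout is the identity
print(drop(x))  # all ones
```

One caveat: this only disables dropout implemented as an `nn.Dropout` module (or a functional call that passes `training=self.training`); a bare `F.dropout(x, p)` defaults to `training=True` and would stay active even after `model.eval()`.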