deterministic-algorithms-lab / Back-2-Back-Translation

An attempt to make Back-Translation differentiable, using probability-weighted embeddings for the predicted translations in the nucleus of the predicted distribution over target-language sentences.

Use k-nucleus sampling or beam search when training on monolingual data?

Jeevesh8 opened this issue

We use a differentiable k-nucleus sampling. Should we also use this differentiable sampling when training on monolingual data, or keep using only the currently implemented beam search?
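
To make the trade-off concrete, here is a minimal sketch of the two options in plain Python. It is not the repository's code: the function names, the reading of "k-nucleus" as "the k highest-scoring candidates", and the toy inputs are all assumptions for illustration. A candidate can be a token or a whole translation; only its score and its embedding matter here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_nucleus_embedding(candidate_scores, candidate_embeddings, k):
    """Probability-weighted mix of the k most likely candidates.

    Assumption: the "nucleus" is taken to be the top-k candidates, with
    their probabilities renormalised to sum to 1. The output is a smooth
    function of the kept scores, so an autodiff framework could pass
    gradients through it to the model that produced the scores.
    """
    probs = softmax(candidate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    dim = len(candidate_embeddings[0])
    mixed = [0.0] * dim
    for i in top:
        weight = probs[i] / mass  # renormalised probability
        for d in range(dim):
            mixed[d] += weight * candidate_embeddings[i][d]
    return mixed


def hard_embedding(candidate_scores, candidate_embeddings):
    """Embedding of the single best candidate.

    This stands in for a discrete decoder such as beam search: the choice
    is an argmax, so no gradient reaches the scores.
    """
    best = max(range(len(candidate_scores)), key=lambda i: candidate_scores[i])
    return list(candidate_embeddings[best])


# Toy example: four candidates with 2-dimensional embeddings.
scores = [2.0, 1.0, 0.0, -1.0]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0]]

soft = soft_nucleus_embedding(scores, embeddings, k=2)
hard = hard_embedding(scores, embeddings)
```

With `k=1` the soft version reduces to the hard one, so in this sketch the choice comes down to whether the monolingual loss should send gradients back through the intermediate translation or treat it as a fixed input.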