Should we use k-nucleus sampling or beam search when training on monolingual data?

Question

Jeevesh8 opened this issue 5 years ago

We already use a differentiable k-nucleus sampling. Should we also use this differentiable sampling when training on monolingual data, or should we keep using only the currently implemented beam search for that step?
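For readers unfamiliar with the terms, here is a rough sketch of what a differentiable top-k / nucleus sampler could look like. The issue does not say how the repository implements it, so everything below is an assumption: the Gumbel-softmax relaxation is swapped in as one common way to make sampling differentiable, and the names `top_k_nucleus_mask`, `relaxed_sample`, `k`, `p`, and `tau` are hypothetical. It is written in plain Python for illustration, so it shows only the forward computation, not gradients.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_nucleus_mask(logits, k, p):
    """Return token ids kept after top-k, then nucleus (top-p) filtering."""
    probs = softmax(logits)
    # Rank token ids by probability, highest first, and keep at most k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        # Stop once the kept tokens cover probability mass p.
        if cumulative >= p:
            break
    return kept


def relaxed_sample(logits, k, p, tau, rng):
    """Gumbel-softmax sample restricted to the top-k / nucleus set.

    Assumption: this relaxation stands in for the repository's sampler.
    Returns a soft one-hot vector (sums to 1) instead of a hard token id.
    """
    kept = set(top_k_nucleus_mask(logits, k, p))
    noisy = []
    for i, logit in enumerate(logits):
        if i not in kept:
            noisy.append(-math.inf)  # filtered tokens get zero weight
            continue
        u = rng.random()
        # Gumbel(0, 1) noise; the small constants guard against log(0).
        g = -math.log(-math.log(u + 1e-20) + 1e-20)
        noisy.append((logit + g) / tau)
    return softmax(noisy)


logits = [2.0, 1.0, 0.5, -1.0, -3.0]
soft = relaxed_sample(logits, k=3, p=0.9, tau=1.0, rng=random.Random(0))
```

In this sketch the output is a weight vector over the vocabulary, which is what would let gradients flow back through the generated tokens in an autodiff framework; lowering `tau` pushes it toward a hard one-hot choice. Beam search, by contrast, returns discrete token ids, so no gradient passes through the decoding step.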