questions about the sampling strategy for baseline model
kelvinleen opened this issue · comments
In your paper, you said: "Also, to compare the diversity introduced by the stochasticity in the proposed latent variable versus the softmax of RNN at each decoding step, we generate N responses from the baseline by sampling from the softmax. For CVAE/kgCVAE, we sample N times from the latent z and only use greedy decoders so that the randomness comes entirely from the latent variable z."
The traditional beam search with beam size B has two steps: first, for each beam, generate the top-B words from the vocabulary softmax;
then select the top-B beams from the B*B candidate sequences using the average probability.
Does the sampling described above mean two multinomial steps: one for the inner vocabulary softmax and one for the outer average probability?
And is the inner sampling with or without replacement? Likewise, is the outer sampling with or without replacement?
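For reference, the standard two-step beam search described above can be sketched roughly as follows. This is a simplified illustration with a hypothetical `step(state, token)` decoder function (returning a softmax distribution and the next state), not code from the paper:

```python
import heapq
import numpy as np

def beam_search(step, init_state, eos_id, beam=3, max_len=20):
    """Deterministic beam search: at each step, expand every beam
    with its top-B next words, then keep the top-B candidate
    sequences by total log-probability. No sampling anywhere."""
    beams = [(0.0, [], init_state)]  # (log-prob, tokens, state)
    for _ in range(max_len):
        candidates = []
        for logp, toks, state in beams:
            if toks and toks[-1] == eos_id:
                # Finished sequences carry over unchanged.
                candidates.append((logp, toks, state))
                continue
            probs, nstate = step(state, toks[-1] if toks else None)
            # Inner step: top-B words from this beam's softmax.
            for w in np.argsort(probs)[-beam:]:
                candidates.append((logp + np.log(probs[w]),
                                   toks + [int(w)], nstate))
        # Outer step: keep the top-B of the B*B candidates.
        beams = heapq.nlargest(beam, candidates, key=lambda c: c[0])
    return beams[0][1]
```

Note that both the inner and the outer step here are deterministic top-B selections; the question is whether the paper's baseline replaces one or both of them with multinomial draws.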
"we generate N responses from the baseline by sampling from the softmax." means at each decoding step, we sample a word from the softmax, and we feed the word into to the next decoding step. We repeat this until we hit EOS token. No beam search is involved/