lmb-freiburg / Multimodal-Future-Prediction

The official repository for the CVPR 2019 paper "Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction"

Question related to make_sampling_loss function

Shaluols opened this issue · comments

Hi,

It may sound stupid to ask this question, but I really need help implementing the sampling training process. Currently, I can train the sampling network successfully with the default loss function (mode="epe", top_n=1). However, according to the paper and the code comments around the make_sampling_loss function, I need to change the mode and top_n values during training. My question is: how do I update the mode and top_n parameters in the loss function? Should I use placeholders for mode and top_n and then pass the real values via feed_dict during the epoch loop? I tried this but got errors that no gradients could be computed for the loss, which I think is caused by the if ... else ... conditions in the loss function. In that case, do I need to split make_sampling_loss into separate functions, one per mode? I would appreciate it a lot if you can give me some guidance!

The simplest way to do it is as follows:

  • Start training with mode='epe-all' (the top_n value is ignored in this mode) for some epochs (e.g., 50) and save a snapshot.
  • Then continue training from that snapshot with mode='epe-top-n' and top_n=10 (assuming you start with 20 hypotheses) for another 50 epochs.
  • Keep repeating this, shrinking top_n, until the last stage, where you train with mode='epe' and top_n=1.

Alternatively, you can do the following:

epoch_loss_types = ['epe-all', 'epe-top-10', 'epe-top-5', 'epe-top-2', 'epe', 'epe']
# assumes total_epochs == 50 * len(epoch_loss_types), i.e. 300
for epoch in range(1, total_epochs + 1):
    train_loss(....., loss_type=epoch_loss_types[(epoch - 1) // 50])

Note that in this case, you need to parse each loss_type string into the two parameters:

'epe-all'    → mode='epe-all',   top_n=1
'epe-top-10' → mode='epe-top-n', top_n=10
'epe-top-5'  → mode='epe-top-n', top_n=5
'epe-top-2'  → mode='epe-top-n', top_n=2
'epe'        → mode='epe',       top_n=1
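This string-to-parameter mapping can be sketched as a small helper (the function name is hypothetical, not part of the repository):

```python
# Hypothetical helper (not from the repository): split a loss_type
# string into the (mode, top_n) pair expected by make_sampling_loss.
def parse_loss_type(loss_type):
    if loss_type.startswith('epe-top-'):
        # e.g. 'epe-top-10' -> ('epe-top-n', 10)
        return 'epe-top-n', int(loss_type.rsplit('-', 1)[1])
    # 'epe-all' and 'epe' take a trivial top_n of 1
    return loss_type, 1
```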

Hope this helps. In case of more questions, please let me know.
Best,

Hi Osama,

Thank you very much for your solutions! May I ask another question?
Is it necessary to have two parameters (mode and top_n) in the loss function? As I understand it, the "epe-all" and "epe" modes are special cases of the "epe-top-n" mode, so can we just use "epe-top-n" and compute the loss based only on the top_n value? In that case, setting top_n to 20 makes "epe-top-n" do the "epe-all" job, and setting top_n to 1 makes it do the "epe" job.

Best,
Sha

Yes, you can do that. It will do the same thing.
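To illustrate the equivalence, here is a minimal NumPy sketch (not the repository's TensorFlow make_sampling_loss; names and shapes are illustrative only): an epe-top-n loss averages the endpoint error of the top_n hypotheses closest to the ground truth, so top_n equal to the number of hypotheses reproduces "epe-all", and top_n=1 reproduces "epe".

```python
import numpy as np

# Illustrative sketch only (not the repo's make_sampling_loss):
# average endpoint error of the top_n hypotheses closest to the ground truth.
def epe_top_n(hypotheses, gt, top_n):
    errors = np.linalg.norm(hypotheses - gt, axis=1)  # one EPE per hypothesis
    return np.sort(errors)[:top_n].mean()             # keep the top_n smallest

hyps = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
gt = np.array([0.0, 0.0])
print(epe_top_n(hyps, gt, top_n=3))  # all 3 hypotheses -> 'epe-all': 5.0
print(epe_top_n(hyps, gt, top_n=1))  # best hypothesis -> 'epe': 0.0
```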

Best,