About LabelSmoothing

Question

tywofxd opened this issue a year ago

Describe the bug
I am not sure if this is a bug. Please tell me if I am wrong.

In deepcase/context_builder/loss.py, you have the following code:
true_dist.fill_(self.smoothing / (self.size - 2))
I am confused about why self.smoothing / (self.size - 2) is the value spread over the events other than the desired one.
If true_dist.sum(dim=1) should be 1, then true_dist.fill_(self.smoothing / (self.size - 1)) would be the correct code, because the smoothing mass is shared among the self.size - 1 non-target events.
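
A quick arithmetic check illustrates the point. This is only a minimal sketch: it assumes the target entry is later overwritten with 1 - smoothing (that assignment is not shown in the line quoted above), and the values size = 5 and smoothing = 0.1 are hypothetical, chosen purely for illustration.

```python
import math

size = 5                      # hypothetical number of events (classes)
smoothing = 0.1               # hypothetical smoothing mass
confidence = 1.0 - smoothing  # assumed probability assigned to the target event


def row_sum(denominator):
    """Sum of one smoothed row: the target gets `confidence`,
    each of the other size - 1 events gets smoothing / denominator."""
    return confidence + (size - 1) * (smoothing / denominator)


sum_minus_1 = row_sum(size - 1)  # proposed denominator
sum_minus_2 = row_sum(size - 2)  # denominator currently in the code

print("size - 1 sums to 1:", math.isclose(sum_minus_1, 1.0))
print("size - 2 excess mass:", sum_minus_2 - 1.0)
```

Under these assumptions the row built with size - 2 exceeds 1 by smoothing / (size - 2), so it is not a proper probability distribution.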

Could you explain the reasoning behind self.size - 2?

Thijs van Ede · Answer 1 · Tue Aug 01 2023 17:00:20 GMT+0800 (China Standard Time)

Yes, you are correct, this line should be true_dist.fill_(self.smoothing / (self.size - 1)). I have updated this in the GitHub repo.

A short explanation of why this was in the code in the first place: the set of possible predictions also includes a NO_EVENT event, which we never want to predict. In some older versions of the code we therefore used self.smoothing / (self.size - 2) as the fill value and set the NO_EVENT probability to 0. Later we decided to make the loss function generic and removed the zero probability for the NO_EVENT element, but forgot to update the fill value. Given the (often) large number of events, it probably won't change the results much, but it should be corrected nevertheless.
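
The two variants described above can be sketched roughly as follows. This is not the code from the repository: it uses plain Python lists instead of tensors, the function name smoothed_row and its no_event parameter are hypothetical, and it assumes the target entry receives 1 - smoothing and that the target is never NO_EVENT itself.

```python
def smoothed_row(size, target, smoothing, no_event=None):
    """Build one smoothed target distribution as a plain list.

    no_event=None  -> generic variant: smoothing spread over size - 1 events.
    no_event=index -> older variant: NO_EVENT is zeroed, so the smoothing
                      is spread over the remaining size - 2 events.
    Assumes target != no_event (hypothetical simplification).
    """
    excluded = 1 if no_event is None else 2  # target, plus NO_EVENT if zeroed
    fill = smoothing / (size - excluded)
    row = [fill] * size
    row[target] = 1.0 - smoothing            # assumed confidence for the target
    if no_event is not None:
        row[no_event] = 0.0                  # never predict NO_EVENT
    return row


# Hypothetical values for illustration only.
generic = smoothed_row(size=5, target=2, smoothing=0.1)
legacy = smoothed_row(size=5, target=2, smoothing=0.1, no_event=0)

print("generic row sum:", sum(generic))
print("legacy row sum:", sum(legacy))
```

In this sketch both variants sum to 1. The mismatch only arises when the size - 2 fill value is kept while NO_EVENT is no longer zeroed, which is the situation described above.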

Thank you for notifying me!