Thijsvanede / DeepCASE

Original implementation and resources of DeepCASE as in the S&P '22 paper

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

About LabelSmoothing

tywofxd opened this issue · comments

Describe the bug
I am not sure if this is a bug. Please tell me if I am wrong.

In deepcase/context_builder/loss.py, you have the following code:
true_dist.fill_(self.smoothing / (self.size - 2))
I am confused that why self.smoothing / (self.size - 2) is scattered over events other than the desired one.
If true_dist.sum(dim=1) should be 1, true_dist.fill_(self.smoothing / (self.size - 1)) should be the correct code.

Please tell me the consideration behind self.size - 2.

Yes, you are correct, this line should be true_dist.fill_(self.smoothing / (self.size - 1)). I have updated this in the github repo.

A small explanation of why this was in the code in the first place: The possible prediction points also include a NO_EVENT event, which we never want to predict. Therefore, in some older versions of the code we used the true_dist.fill_(self.smoothing / (self.size - 2)) as a scattering value and set the NO_EVENT probability to 0. However, we decided to make a generic loss function and removed the 0 probability for the NO_EVENT element but forgot to update the scattering. Given the (often) large number of events it probably won't change much in the results, but it should be corrected nevertheless.

Thank you for notifying me!