Implementation of Asymmetric Clipping

Question

SiyaoNickHu opened this issue 4 years ago

Thanks for such an interesting paper 👍

In equation (4) of the paper, asymmetric probability shifting is defined as p_m = max(p - m, 0). In the implementation, however, it is called asymmetric clipping, and the line xs_neg = (xs_neg + self.clip).clamp(max=1) looks like it computes p_m = min(p + m, 1) instead.

Is there a reason for this difference?

Tal · Answer 1 · Wed Oct 14 2020 13:31:20 GMT+0800 (China Standard Time)

Notice the definition in the code:

        self.xs_pos = torch.sigmoid(logits)
        self.xs_neg = 1.0 - self.xs_pos

So basically:
xs_pos = p
xs_neg = 1 - p

This is done to avoid recomputing (1 - p) again and again throughout the code.
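
To make the equivalence concrete: since xs_neg = 1 - p, adding the margin and clamping at 1 gives min(1 - p + m, 1) = 1 - max(p - m, 0) = 1 - p_m, which is the shift from equation (4) written in terms of 1 - p. Below is a minimal sketch that checks this numerically with plain Python floats rather than tensors. The function names and the margin value 0.05 are illustrative assumptions, not taken from the repository.

```python
import math


def shifted_prob_paper(p: float, m: float) -> float:
    """Equation (4) as quoted in the question: p_m = max(p - m, 0)."""
    return max(p - m, 0.0)


def clipped_neg_code(p: float, m: float) -> float:
    """Scalar analogue of (xs_neg + self.clip).clamp(max=1), with xs_neg = 1 - p."""
    xs_neg = 1.0 - p
    return min(xs_neg + m, 1.0)


m = 0.05  # illustrative margin (assumption), playing the role of self.clip
for p in (0.0, 0.02, 0.05, 0.3, 0.9, 1.0):
    p_m = shifted_prob_paper(p, m)
    xs_neg = clipped_neg_code(p, m)
    # The clipped xs_neg should equal 1 - p_m up to floating-point rounding.
    assert math.isclose(xs_neg, 1.0 - p_m, abs_tol=1e-12)
    print(f"p={p:.2f}  p_m={p_m:.2f}  clipped xs_neg={xs_neg:.2f}")
```

For any p at or below the margin, the shifted probability is 0 and the clipped xs_neg saturates at 1, so both forms agree on which easy negatives are fully discarded.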

Siyao Hu · Answer 2 · Wed Oct 14 2020 22:19:46 GMT+0800 (China Standard Time)

Oh I see. Nice trick!

Thanks for the quick response. I'll close the issue now.