local subgradient = S*torch.sign(weight)
MrLinNing opened this issue
Shouldn't the L1 sparsity term be torch.abs(weight)? Can you explain this in more detail?
local subgradient = S*torch.sign(weight)
The (sub)gradient of the absolute value function (the L1 sparsity loss) is the sign function. Here we compute the subgradient directly and add it to the weight gradient, without explicitly defining the loss.
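As a hedged sketch of that update in NumPy (the original is Torch/Lua; the variable names and values here are illustrative, not from the repo):

```python
import numpy as np

def l1_subgradient(weight, S):
    """Subgradient of the L1 penalty S * sum(|w|): the sign function scaled by S.

    np.sign(0) = 0 is one valid subgradient choice at the non-differentiable point.
    """
    return S * np.sign(weight)

# Hypothetical scale weights and sparsity coefficient (illustrative values)
weight = np.array([0.5, -1.2, 0.0, 3.0])
S = 1e-4

# Stand-in for the gradient backpropagated from the data loss;
# the sparsity subgradient is simply added on top of it.
grad_from_data_loss = np.zeros_like(weight)
total_grad = grad_from_data_loss + l1_subgradient(weight, S)
```

In the actual code this addition happens once per update step, so the L1 penalty never needs to appear as an explicit loss term.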
thank you! @liuzhuang13
Why did you use the subgradient? Did you try directly defining the loss?
Because the absolute value function is not differentiable at x=0, this is a subgradient rather than a gradient. In practice, however, the weight x never becomes exactly 0, so it is equivalent to the gradient.
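That equivalence away from zero can be checked numerically: for nonzero weights, a finite-difference gradient of the explicit L1 loss matches S * sign(w). A small NumPy sketch (names and values are illustrative):

```python
import numpy as np

def l1_loss(w, S):
    """Explicit L1 sparsity loss: S * sum(|w_i|)."""
    return S * np.sum(np.abs(w))

def numeric_grad(w, S, eps=1e-6):
    """Central finite differences of the explicit L1 loss."""
    g = np.zeros_like(w)
    for i in range(w.size):
        wp, wm = w.copy(), w.copy()
        wp[i] += eps
        wm[i] -= eps
        g[i] = (l1_loss(wp, S) - l1_loss(wm, S)) / (2 * eps)
    return g

w = np.array([0.7, -2.1, 1e-3])  # all nonzero, as the weights are in practice
S = 0.5
# Away from 0, the finite-difference gradient agrees with the subgradient.
assert np.allclose(numeric_grad(w, S), S * np.sign(w), atol=1e-5)
```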
Unlike PyTorch, Torch has no automatic differentiation, so I found this to be the most convenient way to do what we wanted, and we just used it.