liuzhuang13 / slimming

Learning Efficient Convolutional Networks through Network Slimming, In ICCV 2017.

local subgradient = S*torch.sign(weight)

MrLinNing opened this issue

The L1 sparsity penalty should be torch.abs(weight); can you explain this line in more detail?
local subgradient = S*torch.sign(weight)

The (sub)gradient of the absolute value function (the L1 sparsity loss) is the sign function. Here we compute the subgradient directly instead of defining a loss.
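
For context, here is a minimal sketch of how such a subgradient update is typically applied in a Torch training loop: after the ordinary backward pass, S*sign(γ) is added to the gradient of every batch-normalization scaling factor γ before the parameter step. The variable names, the value of S, and the module lookup below are illustrative assumptions, not code copied from the repo.

```lua
require 'nn'

-- Sketch (assumed names): accumulate the L1 subgradient S*sign(gamma) into
-- the gradient of every BN scaling factor gamma after the backward pass.
local S = 1e-4  -- sparsity coefficient (lambda in the paper); value assumed

for _, m in ipairs(model:findModules('nn.SpatialBatchNormalization')) do
   local weight = m.weight                     -- the scaling factors gamma
   local subgradient = S * torch.sign(weight)  -- subgradient of S*|gamma|
   m.gradWeight:add(subgradient)               -- add before the SGD update
end
```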

Thank you! @liuzhuang13
Why did you use the subgradient? Did you try directly defining the loss?

Because the absolute value function is not differentiable at x = 0, this is a subgradient rather than a gradient. In practice, however, the weight x never becomes exactly 0, so it is equivalent to the gradient.
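
Concretely, the subdifferential of the absolute value is single-valued (the sign) away from zero and the whole interval [-1, 1] at zero, which matches the S*torch.sign(weight) line above:

```latex
% Subdifferential of |x|: single-valued away from 0, an interval at 0.
\[
\partial\,|x| =
\begin{cases}
  \{\operatorname{sign}(x)\}, & x \neq 0,\\
  [-1,\,1],                   & x = 0,
\end{cases}
\qquad\text{so for } x \neq 0,\quad \frac{d}{dx}\,S\,|x| = S\,\operatorname{sign}(x).
\]
```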

Unlike PyTorch, Torch has no automatic differentiation, so I found this to be the most convenient way to implement what we wanted, and we simply used it.