Unclear connection of LassoNet and SPINN
andreimargeloiu opened this issue
The LassoNet paper mentions that LassoNet generalizes an earlier method (SPINN), though it's unclear how/when this is the case.
In Section 1.2 Related work, the paper says "Recently, Feng and Simon (2017) proposed an input-sparse neural network, where the input weights are penalized using the group Lasso penalty. As will become evident in Section 3, our proposed method extends and generalizes this approach in a natural way."
Feng and Simon (2017) add a sparse group Lasso penalty on the first layer (see figure below), which is a convex combination of a Lasso penalty and a group Lasso penalty.
How/when does LassoNet generalize the method of Feng and Simon (2017)? Looking at Section 3, I see that LassoNet is equivalent to a standard Lasso (when M = 0) and to an unregularized feed-forward neural network (when M → +∞), but the connection to the method of Feng and Simon (2017) isn't mentioned.
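For concreteness, the two limiting cases can be sketched numerically. This is a minimal NumPy sketch, not the official lassonet API; `project_first_layer` is a hypothetical helper that enforces the paper's hierarchy constraint by clipping:

```python
import numpy as np

def project_first_layer(W1, theta, M):
    """Project first-layer weights onto the LassoNet constraint set:
    each entry of row j is clipped to [-M|theta_j|, M|theta_j|]."""
    bound = M * np.abs(theta)[:, None]  # per-feature bound, broadcast over columns
    return np.clip(W1, -bound, bound)

rng = np.random.default_rng(0)
theta = rng.normal(size=3)       # skip-connection weights
W1 = rng.normal(size=(3, 4))     # first-layer weights of the MLP

# M = 0: the constraint forces W1 = 0, leaving only the linear
# skip connection -> the model reduces to a plain Lasso.
W1_small = project_first_layer(W1, theta, M=0.0)

# Very large M: the constraint is inactive -> the MLP weights
# are untouched, i.e. an unregularized feed-forward network.
W1_large = project_first_layer(W1, theta, M=1e9)
```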
For large M our method is pretty close, with just an additional skip connection!
Cc'ing @ilemhadri
How is LassoNet with large M pretty close to Feng and Simon (2017)? Can you please explain?
Feng and Simon (2017) put Lasso and group Lasso constraints on the first layer, but LassoNet with large M doesn't put any Lasso constraint on the MLP. As I understand it, for large M LassoNet:
- puts an L1 constraint on the skip connection
- doesn't put any constraint on the first linear layer of the MLP, because the hierarchy constraint
$\left\|W_j^{(1)}\right\|_{\infty} \leq M\left|\theta_j\right|$
becomes vacuous as M → +∞
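As a numerical check of how this constraint couples the skip connection's L1 penalty to the first layer (a sketch with made-up weights, not LassoNet's actual training code): whenever the constraint holds, $\lambda\|\theta\|_1 \geq \frac{\lambda}{M}\sum_j \|W_j^{(1)}\|_\infty$, so penalizing $\theta$ implicitly imposes a group-wise $\ell_\infty$ penalty on the first layer.

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.normal(size=5)       # skip-connection weights
M, lam = 2.0, 0.1                # hierarchy parameter, L1 strength

# First-layer weights that exactly saturate the constraint
# |W_jk| = M |theta_j| for every feature j:
W1 = np.sign(rng.normal(size=(5, 8))) * (M * np.abs(theta))[:, None]

# sum_j ||W_j^(1)||_inf  -- a group (l_inf) norm over input features,
# analogous in spirit to SPINN's group Lasso penalty.
group_inf = np.abs(W1).max(axis=1).sum()

# At saturation the L1 penalty on theta equals (lam / M) * group_inf.
lhs = lam * np.abs(theta).sum()
rhs = (lam / M) * group_inf
```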
Let's say intermediate M then.
The point is that the first layer does have an L1 penalty and is not too constrained by linear signals.
LassoNet is not more general than the other paper, though.
Are you referring to the L1 penalty on the skip connection? I don't see how the first layer of the MLP has an L1 penalty.
It is the corresponding L1 penalty.
Explanation: the absolute value of any coordinate of $W_j^{(1)}$ is bounded by $M\left|\theta_j\right|$, and $\theta_j$ itself carries an L1 penalty. So penalizing the skip connection implicitly penalizes the first layer of the MLP through the constraint.
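To summarize the comparison in formulas (my own hedged paraphrase, using the notation above; $\alpha$ is SPINN's mixing parameter, not a symbol from this thread): SPINN penalizes the first layer directly with a sparse group Lasso, while LassoNet's L1 penalty on $\theta$, combined with the hierarchy constraint, bounds a group $\ell_\infty$ norm of the first layer:

$$
\underbrace{\lambda \sum_j \Big( \alpha \big\|W_j^{(1)}\big\|_1 + (1-\alpha)\big\|W_j^{(1)}\big\|_2 \Big)}_{\text{SPINN (sparse group Lasso)}}
\qquad \text{vs.} \qquad
\underbrace{\lambda \|\theta\|_1 \;\geq\; \frac{\lambda}{M} \sum_j \big\|W_j^{(1)}\big\|_\infty}_{\text{LassoNet (implied by the constraint)}}
$$

The norms differ ($\ell_2$ groups vs. $\ell_\infty$ groups), which is why the relationship is "pretty close" rather than an exact generalization for finite M.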
Thank you very much for all the back-and-forth explanation!