huawei-noah/AdderNet

Code for the paper "AdderNet: Do We Really Need Multiplications in Deep Learning?"

Equation (5) - partial derivative of the Euclidean norm

andgitchang opened this issue

Hi,
I would like to know why you defined the L2-distance as in Equation (14) of the appendix.
Doesn't the L2-distance need a square root outside the summations?
I would also like to know how the corresponding partial derivative of the L2-distance in Equation (5) is derived.
Thanks.

We defined the L2-distance to further investigate different metrics in neural networks; we still use the L1 distance in AdderNets.

The partial derivative of the L2-distance is simply its original (exact) derivative, as sketched below.
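
For reference, a minimal sketch of the calculus behind that reply, assuming the Eq. (14) definition is a sum of squares with no square root (as confirmed later in this thread), with X an input patch and F a filter:

```latex
\[
  d_2(X, F) = \sum_{i} (X_i - F_i)^2 ,
  \qquad
  \frac{\partial\, d_2(X, F)}{\partial F_i} = -2\,(X_i - F_i) .
\]
```

Up to the constant factor 2 and the sign convention of the output, this is the full-precision gradient X - F used in Eq. (5).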

I know you use the L1 distance in the forward pass and the full-precision L2 derivative in the backward optimization.
But my questions are:

  1. Considering the L2 distance (see the definition), don't we need an extra square root outside the summations of Eq. (14) in your CVPR 2020 supplementary material?
  2. Following the definition of the L2 distance, shouldn't its derivative in Eq. (5) of AdderNets be \partial \|x\|_2 / \partial x = x / \|x\|_2? (Please refer to the p-norm subsection under Examples; see the sketch after this list.)

If I have misunderstood anything, please correct me. Thanks.
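
For item 2, the standard Euclidean-norm derivative being referenced follows from the chain rule (x is a vector and \|x\|_2 the usual 2-norm):

```latex
\[
  \|x\|_2 = \Bigl(\sum_i x_i^2\Bigr)^{1/2}
  \quad\Longrightarrow\quad
  \frac{\partial \|x\|_2}{\partial x_j}
  = \frac{x_j}{\bigl(\sum_i x_i^2\bigr)^{1/2}}
  = \frac{x_j}{\|x\|_2} .
\]
```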

  1. Yes, so we ultimately use the L1-AdderNet in the main paper; the L2-AdderNet is proposed only for investigation.

  2. We in fact use the squared L2 distance (L2^2), as defined in our supplementary material, so there is no square root and no \|x\|_2 denominator in the derivative.
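
To make the distinction concrete, here is a minimal, self-contained sketch of the scheme discussed in this thread (it is not the repository's adder.py; the function name and tensor values are made up for illustration):

```python
import torch

def adder_output_and_filter_grad(x, f):
    """Toy 1-D illustration: the forward pass scores an input patch `x`
    against a filter `f` with the negative L1 distance, while the
    backward pass uses the full-precision gradient x - f, i.e. the
    derivative of the squared-L2 distance (up to a constant factor),
    rather than the true L1 derivative sign(x - f)."""
    y = -(x - f).abs().sum()  # forward: L1-AdderNet output
    grad_f = x - f            # surrogate gradient w.r.t. the filter
    return y, grad_f

# Tiny usage example with made-up values.
x = torch.tensor([0.5, -1.0, 2.0])
f = torch.zeros(3)
y, grad_f = adder_output_and_filter_grad(x, f)
print(y.item())   # -3.5
print(grad_f)     # tensor([ 0.5000, -1.0000,  2.0000])
```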

Thanks for your detailed explanation.