Eval sign flipping

devnkong opened this issue 2 years ago

Hi Vijay,

Thanks for your repo!

Question: I see you're doing sign flipping of the eigenvector pos_enc during training, but it seems you are not doing so at eval time. I understand that we want deterministic predictions, so there is no random flipping during evaluation. Do you have any further comments or justification for this?

Best
Kezhi

Kezhi Kong commented 2 years ago

Thanks!

Kezhi Kong · Answer 1 · Wed Jun 08 2022 14:16:38 GMT+0800 (China Standard Time)

Also, do you have a reason for choosing the eigenvectors with small eigenvalues?

Vijay Prakash Dwivedi · Answer 2 · Wed Jun 08 2022 16:08:27 GMT+0800 (China Standard Time)

Hi @devnkong, thanks for your questions.

Q: Why is sign flipping not used during eval?
A: The sign of each eigenvector is arbitrary, so k eigenvectors give 2^k possible sign combinations. Random sign flipping during training lets the network become invariant to (independent of) which of these 2^k combinations it receives. With that invariance learned, sign flipping is not required during eval.
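
To make the training/eval difference concrete, here is a minimal sketch of the idea, not the repo's actual code. The names `random_sign_flip` and `encode` are hypothetical, and `pos_enc` is assumed to be a plain list of rows with one column per eigenvector (in practice this would be a tensor operation).

```python
import random

def random_sign_flip(pos_enc, rng=random):
    """Multiply each column (one eigenvector) by an independent random +1 or -1.

    pos_enc: list of rows, one row per node, one column per eigenvector.
    Returns a new list; the input is left unchanged.
    """
    k = len(pos_enc[0])
    # One sign per eigenvector, shared by all nodes, so each flipped
    # column is still a valid eigenvector.
    signs = [rng.choice((-1.0, 1.0)) for _ in range(k)]
    return [[s * v for s, v in zip(signs, row)] for row in pos_enc]

def encode(pos_enc, training, rng=random):
    """Apply random flipping only in training; eval stays deterministic."""
    return random_sign_flip(pos_enc, rng) if training else pos_enc

# Toy example: 3 nodes, k = 2 eigenvectors (made-up values).
pe = [[0.5, -0.1], [-0.5, 0.2], [0.0, -0.1]]
train_pe = encode(pe, training=True)   # one of 2^2 = 4 sign combinations
eval_pe = encode(pe, training=False)   # unchanged
```

Over many training steps the network sees all sign combinations of the same encoding, which is what pushes it toward sign invariance; at eval any single fixed combination is then sufficient.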

Q: Why choose the eigenvectors with small eigenvalues?
A: Please refer to Section E.1.2 in https://arxiv.org/pdf/2003.00982.pdf

Best,
Vijay