CKA
zhjy2016 opened this issue · comments
zhjy2016 commented
Hello, I am very interested in your exploration of CKA visualization, but I have not found the corresponding auxiliary code. Is there any way to obtain it?
fenglinglwb commented
We have not released the CKA code yet. But you may refer to https://github.com/yuanli2333/CKA-Centered-Kernel-Alignment.
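For reference, the linked repository implements the standard linear CKA formulation; a minimal NumPy sketch (the function name `linear_cka` is my own, not from that repo) looks like this:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between feature matrices X (n, d1) and Y (n, d2),
    where n is the number of examples and d1/d2 are feature widths."""
    # Center each feature dimension over the examples
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    # HSIC-based similarity: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    num = np.linalg.norm(Y.T @ X, ord='fro') ** 2
    den = np.linalg.norm(X.T @ X, ord='fro') * np.linalg.norm(Y.T @ Y, ord='fro')
    return num / den

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 64))
print(linear_cka(X, X))        # identical features -> 1.0
print(linear_cka(X, 2.0 * X))  # invariant to isotropic scaling -> 1.0
```

By construction this similarity already lies in [0, 1], which may explain the value range in the figure without any extra normalization.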
zhjy2016 commented
Thanks a lot!
Tiankai Hang commented
Hi, @fenglinglwb ,
It seems that in your figure the values range from 0 to 1. Did you apply any normalization?
Also, for the features you use for CKA, do you pool the tokens? For the [6, 6, 6, 6] setting there should be 4 x 6 x 4 = 96 features, but in your figure the max layer id is 80+.
Best.
fenglinglwb commented
- We didn't perform any normalization.
- The model with max layer id 80+ is EDT-B, which contains 6 transformer stages instead of 4. Apart from the convolutional head and tail, we include the outputs of attention and FFN after the residual connections, as well as the global connection in each stage, so the total is 6 * 6 * 2 + 6 + n_convs.
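A quick sanity check of that count (the variable names are illustrative; `n_convs`, the number of convolutional head/tail layers, is left unspecified as in the reply):

```python
n_stages = 6          # EDT-B uses 6 transformer stages
blocks_per_stage = 6  # 6 blocks per stage
per_block = 2         # attention output + FFN output, both after residuals
global_per_stage = 1  # one global connection per stage

transformer_features = n_stages * (blocks_per_stage * per_block + global_per_stage)
print(transformer_features)  # 78; adding the n_convs conv layers yields 80+
```

This matches the 6 * 6 * 2 + 6 + n_convs figure above and explains why the max layer id exceeds 80 once the convolutional head and tail are counted.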