About dot product approximation

Question

ohwi opened this issue 4 years ago

Hi. First of all, thank you for your work!

I have a little question about the dot product approximation.

I have read the earlier issues about the dot product approximation, such as this and this.

In both explanations, in order to use the Taylor approximation, the student model accesses the labeled images both before and after the update.

However, according to Equation 12 in the paper, the student model does not access the labeled images before the update.

I think the two vectors in the dot product should be s_loss_us_old and s_loss_l_new, if I follow the variable names in your code.

I'm wondering how the code dot_product = s_loss_l_new - s_loss_l_old approximates the dot product.

Can you help me figure out what I am missing?

ohwi · Answer 1 · Tue Feb 23 2021 22:35:00 GMT+0800 (China Standard Time)

I've understood the equations. Thank you!
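
For anyone landing here with the same question, here is a small numerical sketch of why a difference of labeled losses can stand in for a gradient dot product. It is not the repository's code and not the paper's derivation; the toy losses, the parameter vector, and the helper names (loss_l, grad_l, grad_u, compare) are all made up for illustration. It assumes the student takes one plain SGD step on the unlabeled loss, theta_new = theta_old - lr * grad_u(theta_old). A first-order Taylor expansion of the labeled loss around theta_new then gives loss_l(theta_new) - loss_l(theta_old) ≈ -lr * grad_l(theta_new) · grad_u(theta_old), which pairs the labeled gradient after the update with the unlabeled gradient before it.

```python
import numpy as np


def loss_l(theta):
    # Hypothetical smooth "labeled" loss (an assumption, not from the repo).
    return 0.5 * np.sum((theta - 1.0) ** 2) + 0.1 * np.sum(theta ** 4)


def grad_l(theta):
    # Analytic gradient of loss_l.
    return (theta - 1.0) + 0.4 * theta ** 3


def grad_u(theta):
    # Gradient of a hypothetical "unlabeled" loss 0.5 * ||theta + 0.5||^2.
    return theta + 0.5


def compare(lr=1e-3):
    # Student parameters before the update (arbitrary toy values).
    theta_old = np.array([0.3, -0.7, 1.2])
    # One plain SGD step on the unlabeled loss.
    theta_new = theta_old - lr * grad_u(theta_old)

    # What the loss difference measures: labeled loss after minus before.
    loss_diff = loss_l(theta_new) - loss_l(theta_old)
    # What it approximates to first order: -lr times the dot product of the
    # new labeled gradient and the old unlabeled gradient.
    scaled_dot = -lr * np.dot(grad_l(theta_new), grad_u(theta_old))
    return loss_diff, scaled_dot


if __name__ == "__main__":
    loss_diff, scaled_dot = compare()
    print("loss difference   :", loss_diff)
    print("-lr * dot product :", scaled_dot)
```

The two printed numbers agree up to a second-order term in lr, so the gap shrinks as the learning rate gets smaller. Note that the loss difference carries a factor of -lr relative to the raw dot product; how that sign and scale are handled in the actual training code is something to check in the repository itself.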