yihong-chen / neural-collaborative-filtering

pytorch version of neural collaborative filtering

Missing layer and training workflow

karkawal opened this issue

Hi,

Thank you for the work you have done. It is extremely useful and the code is very clean. I wish every paper implementation could be like this :) I have noticed a few discrepancies from what is written in the original paper. Could you tell me whether they are intentional and, if so, what the reasoning behind them is?

  1. I think that in the NeuMF architecture you are missing one linear layer between the embeddings and the layer that combines GMF and MLP; according to the schema in the paper there should be one.

  2. I am a bit lost with loading pretrained weights into the MLP. I see that you offer the possibility of loading pretrained GMF embeddings into the MLP model. I believe that, according to the paper, these are separate embeddings and are not mixed in the original work. Does this change provide a noticeable improvement?

Hi, sorry for the late reply. Regarding your questions:

  1. For the standalone GMF model, what you described is correct. But for NeuMF, according to Equation (12) in the paper, there is no such layer: the element-wise GMF product and the MLP output are concatenated and fed directly into the single prediction layer (see the first sketch below).

  2. Good question! I think Section 3.4.1 of the paper is where the authors talk about pre-training NeuMF. In my experiments, I found that pre-training generally helps the model converge earlier, though not necessarily to better final performance (a loading sketch is included below as well).
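
For reference, here is a minimal sketch of the NeuMF fusion in Equation (12): the element-wise GMF product and the MLP tower output are concatenated and passed through a single prediction layer, with no extra linear layer per branch before the concatenation. The class and hyper-parameter names below are illustrative only and do not mirror this repo's exact code.

```python
import torch
import torch.nn as nn

class NeuMF(nn.Module):
    """Sketch of the NeuMF fusion in Equation (12) of the NCF paper.

    All names (num_users, num_items, latent_dim, mlp_layers) are
    illustrative, not the repo's actual config keys.
    """
    def __init__(self, num_users, num_items, latent_dim=8, mlp_layers=(32, 16, 8)):
        super().__init__()
        # Separate embedding tables for the GMF and MLP branches.
        self.user_gmf = nn.Embedding(num_users, latent_dim)
        self.item_gmf = nn.Embedding(num_items, latent_dim)
        self.user_mlp = nn.Embedding(num_users, mlp_layers[0] // 2)
        self.item_mlp = nn.Embedding(num_items, mlp_layers[0] // 2)

        # MLP tower.
        tower = []
        for in_dim, out_dim in zip(mlp_layers[:-1], mlp_layers[1:]):
            tower += [nn.Linear(in_dim, out_dim), nn.ReLU()]
        self.mlp = nn.Sequential(*tower)

        # Single prediction layer h applied to the concatenated vector;
        # note there is no per-branch Linear before the concatenation.
        self.predict = nn.Linear(latent_dim + mlp_layers[-1], 1)

    def forward(self, user, item):
        phi_gmf = self.user_gmf(user) * self.item_gmf(item)            # element-wise product
        phi_mlp = self.mlp(torch.cat([self.user_mlp(user),
                                      self.item_mlp(item)], dim=-1))   # MLP tower output
        logit = self.predict(torch.cat([phi_gmf, phi_mlp], dim=-1))    # h^T [phi_gmf ; phi_mlp]
        return torch.sigmoid(logit)
```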
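
And here is a rough sketch of the pre-training initialisation described in Section 3.4.1, assuming the hypothetical attribute names from the NeuMF sketch above and standalone GMF/MLP models that end in a 1-unit `predict` layer: the embeddings and MLP layers are copied from the separately trained models, and the prediction layer is initialised with a weighted concatenation of their output layers.

```python
import torch

def load_pretrained(neumf, gmf, mlp, alpha=0.5):
    """Initialise NeuMF from pretrained GMF and MLP models (Section 3.4.1).

    Attribute names are assumptions matching the sketch above, not this
    repo's real field names; adjust them to the actual model definitions.
    """
    # GMF branch embeddings.
    neumf.user_gmf.weight.data.copy_(gmf.user_gmf.weight.data)
    neumf.item_gmf.weight.data.copy_(gmf.item_gmf.weight.data)
    # MLP branch embeddings and hidden layers.
    neumf.user_mlp.weight.data.copy_(mlp.user_mlp.weight.data)
    neumf.item_mlp.weight.data.copy_(mlp.item_mlp.weight.data)
    neumf.mlp.load_state_dict(mlp.mlp.state_dict())
    # Prediction layer: weighted concatenation of the two output layers,
    # with alpha trading off the GMF and MLP contributions as in the paper.
    weight = torch.cat([alpha * gmf.predict.weight.data,
                        (1 - alpha) * mlp.predict.weight.data], dim=1)
    bias = alpha * gmf.predict.bias.data + (1 - alpha) * mlp.predict.bias.data
    neumf.predict.weight.data.copy_(weight)
    neumf.predict.bias.data.copy_(bias)
    return neumf
```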

Hi,

  1. Thank you for pointing that out; I have gone through the paper again and finally noticed it.
  2. I will take that into consideration and try to reproduce those experiments.

Thank you for your answers and for shedding some light on these matters!