Hello authors, I would like to ask you about performance.

Question

BlueGhostZ opened this issue 3 years ago · comments

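I used your open-source HGCF code and ran experiments on the MovieLens dataset with the following settings:
resSumGCN, num-layers=3, c=1, margin=0.1, weight-decay=1e-4, lr=0.001, momentum=0.95, scale=0.1, embedding_dim=64, norm_adj=true. However, performance on both the MovieLens and Yelp2018 datasets was very poor.
Specifically, on MovieLens, HGCF reaches a Recall@20 of 0.2538 and an NDCG@20 of 0.3677, which is far below LightGCN's 0.2606 and 0.3793 under the same settings.
We tuned every hyperparameter we could change, but performance only got worse and we ran into NaN problems. We split the MovieLens dataset 8:2, the same as in the paper, and made no other changes to the code.
I don't know what went wrong, and I hope you can help clear this up. My guess is that the dataset might be the cause? However, we also tested on the Yelp2018 dataset (taken from the LightGCN paper) with the same hyperparameters, and HGCF again performed very poorly, far below LightGCN.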
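
For readers comparing the numbers above, here is a minimal sketch of one common binary-relevance definition of Recall@K and NDCG@K for a single user. It is an illustration only: the function names `recall_at_k` and `ndcg_at_k` are hypothetical, and neither the HGCF nor the LightGCN codebase is confirmed to compute its metrics exactly this way. Small differences in such definitions (for example, the recall denominator) can shift reported scores between codebases.

```python
import math


def recall_at_k(ranked_items, relevant_items, k=20):
    """Fraction of a user's held-out items that appear in the top-k ranking."""
    if not relevant_items:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / len(relevant_items)


def ndcg_at_k(ranked_items, relevant_items, k=20):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    if not relevant_items:
        return 0.0
    # Each hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    # Ideal ranking places all relevant items (up to k) at the top.
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg


# Toy example: item IDs ranked by predicted score, and the held-out test items.
ranked = [3, 1, 7, 5]
held_out = {1, 5, 9}
print(recall_at_k(ranked, held_out, k=3))
print(ndcg_at_k(ranked, held_out, k=3))
```

Dataset-level scores are typically the average of these per-user values over all test users, so the train/test split and any filtering of users or items also affect the final numbers.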

BlueGhostZ · Answer 1 · Thu Jul 15 2021 10:33:22 GMT+0800 (China Standard Time)

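In addition, we noticed that the model_config section of config.py has two embedding-related settings, embedding_dim and dim, and both are set to the same value. Is there any difference between them?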

jianing-sun · Answer 2 · Fri Jul 16 2021 03:12:46 GMT+0800 (China Standard Time)

Hi, thanks for your interest!

You are not running any dataset we provided or used in the paper. ml100k isn't a standard benchmark in recsys because it's too small (it's not used or mentioned anywhere in our paper), and you used a Yelp2018 dataset from some other source we don't know (we used Yelp2020, which is a totally different dataset; please refer to the README for more detail). I'm sorry, but we don't know how to answer your questions.

jianing-sun · Answer 3 · Fri Jul 16 2021 03:42:12 GMT+0800 (China Standard Time)

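In addition, we noticed that the model_config section of config.py has two embedding-related settings, embedding_dim and dim, and both are set to the same value. Is there any difference between them?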

Only embedding_dim is used.