Performance sensitive to Pytorch and CUDA environment.

Question

darkbird1 opened this issue 2 years ago · comments

Hi, thanks for such interesting work. I observed that the model performs well with PyTorch 1.8. However, when I train the same model with the latest PyTorch 1.12 and CUDA 11.7, it starts overfitting the training data, and the test MRAE does not go below 0.42. Similarly, in PyTorch 1.2, the training MRAE oscillates around 0.5 and does not change much. Does this mean that the model's performance is highly sensitive to the PyTorch or CUDA version?

Yuanhao Cai · Answer 1 · Sun Aug 28 2022 21:56:00 GMT+0800 (China Standard Time)

Thanks for your appreciation. This is because our MST++ is implemented with torch 1.8. Some operations and functions change between PyTorch versions. For instance, different versions use different methods to handle singular values. Thus, it is better to train and test our models in our suggested environment.
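As a rough illustration of checking the environment before training, here is a minimal sketch that compares the installed torch version against the 1.8 series mentioned above. The helper names (`major_minor`, `matches_expected`, `installed_torch_version`) are hypothetical and not part of the MST++ code; only the standard library is used, so it runs even when torch is not installed.

```python
import re
from importlib import metadata


def major_minor(version):
    """Return (major, minor) parsed from a version string such as '1.8.1+cu111'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def matches_expected(installed, expected="1.8"):
    """True when the installed version is in the same major.minor series as expected."""
    return major_minor(installed) == major_minor(expected)


def installed_torch_version():
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_torch_version()
    if found is None:
        print("torch is not installed in this environment")
    elif matches_expected(found):
        print(f"torch {found} is in the 1.8 series")
    else:
        print(f"warning: torch {found} differs from the 1.8 series")
```

Running this at the start of a training script would at least make a version mismatch visible in the logs. It does not check the CUDA version.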

darkbird1 · Answer 2 · Mon Aug 29 2022 00:20:29 GMT+0800 (China Standard Time)

Thanks for the response. From my understanding, after checking the changes between versions, it seems that there are some implementation differences between versions, and their effect on performance is significant. It would be really interesting to theoretically investigate the sensitivity of the different components, in order to address reproducibility across platforms. Also, thanks for providing the weights of the different baseline models for comparison. I will do a similar analysis for the other baselines in different environments.
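A minimal sketch of how such a comparison could start, assuming the same input and the same weights are run in each environment and the flattened outputs are dumped to JSON files (for example via `tensor.flatten().tolist()`). The function names and the file format are assumptions for illustration only.

```python
import json


def load_flat(path):
    """Load a flat list of floats that was dumped to JSON in one environment."""
    with open(path) as f:
        return [float(x) for x in json.load(f)]


def _check_lengths(a, b):
    if len(a) != len(b) or not a:
        raise ValueError("outputs must be non-empty and of equal length")


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two output dumps."""
    _check_lengths(a, b)
    return max(abs(x - y) for x, y in zip(a, b))


def mean_relative_diff(a, ref, eps=1e-12):
    """Mean of |a - ref| / (|ref| + eps), with ref as the reference environment."""
    _check_lengths(a, ref)
    return sum(abs(x - r) / (abs(r) + eps) for x, r in zip(a, ref)) / len(ref)
```

If the forward-pass outputs already differ noticeably between versions, the discrepancy lies in inference. If they agree closely, the difference more likely arises during training.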

Yuanhao Cai · Answer 3 · Mon Aug 29 2022 08:42:32 GMT+0800 (China Standard Time)

OK. Hope you find something interesting. Good luck.