xinge008 / Cylinder3D

Rank 1st in the leaderboard of SemanticKITTI semantic segmentation (both single-scan and multi-scan) (Nov. 2020) (CVPR2021 Oral)

Is it possible to train Cylinder3D with mixed precision?

YJYJLee opened this issue

Hello,

I am trying to train Cylinder3D with mixed precision, so I added torch.cuda.amp code to the source.
However, I get NaN values due to overflow as soon as training starts. The NaNs first appear in the forward pass and then propagate through the backward pass, so the loss also becomes NaN.

Is it possible to train Cylinder3D with fp16? Is there any solution for this?
Thanks!
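
To narrow down where the NaN or Inf values described above first appear, forward hooks can be used to check every module's output. The following is a minimal sketch, not code from Cylinder3D: `find_nonfinite_modules` and the `Overflowing` toy layer are hypothetical names, and only plain tensor outputs are checked (sparse tensor outputs from spconv layers would need their feature tensor inspected instead).

```python
import torch
import torch.nn as nn


def find_nonfinite_modules(model: nn.Module, *inputs):
    """Run one forward pass and return the names of modules whose output has NaN/Inf.

    Names are listed in the order the modules finish their forward call,
    so the first entry is the earliest offender.
    """
    bad = []
    handles = []

    def make_hook(name):
        def hook(module, args, output):
            # Only plain tensor outputs are checked in this sketch.
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                bad.append(name)
        return hook

    for name, module in model.named_modules():
        handles.append(module.register_forward_hook(make_hook(name)))
    try:
        with torch.no_grad():
            model(*inputs)
    finally:
        # Always remove the hooks so the model is left unchanged.
        for handle in handles:
            handle.remove()
    return bad


class Overflowing(nn.Module):
    """Toy layer that overflows in float16: 300 * 300 exceeds the fp16 maximum (65504)."""

    def forward(self, x):
        return x * x


toy = nn.Sequential(nn.Identity(), Overflowing(), nn.Identity())
x = torch.full((4,), 300.0, dtype=torch.float16)
bad_modules = find_nonfinite_modules(toy, x)
print(bad_modules)
```

In the toy model the first reported name is the `Overflowing` layer (index `1`); the later entries are modules that merely pass the Inf values along.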

I have not tried AMP. If you want to save GPU memory, it is better to try torch.utils.checkpoint.
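
For reference, activation checkpointing with torch.utils.checkpoint could look roughly like the sketch below. The `CheckpointedBlock` wrapper and the small `stage` network are hypothetical stand-ins, not part of Cylinder3D, and whether this works unchanged with spconv's sparse tensors is not verified here.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class CheckpointedBlock(nn.Module):
    """Wraps a sub-module so its activations are recomputed during backward
    instead of being stored, trading extra compute for lower memory use."""

    def __init__(self, block: nn.Module):
        super().__init__()
        self.block = block

    def forward(self, x):
        # use_reentrant=False selects the non-reentrant implementation.
        return checkpoint(self.block, x, use_reentrant=False)


# Hypothetical stand-in for one stage of the network.
stage = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 16))
model = CheckpointedBlock(stage)

x = torch.randn(8, 16, requires_grad=True)
loss = model(x).sum()
loss.backward()  # the wrapped block's forward is re-run here to rebuild activations
print(x.grad.shape)
```

Wrapping only the largest stages is usually enough, since each checkpointed block costs one extra forward pass during backward.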

commented

Hello,

I had the same error and had to adjust the eps parameter of Adam. See the referenced issue. I am using spconv v2.1.x. The problem is likely caused by spconv being somewhat independent of PyTorch.
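
One possible reason eps matters in half precision is that Adam's default eps in PyTorch (1e-8) is below the smallest positive float16 value, so it rounds to zero if the update is ever computed in fp16. The sketch below shows this with the standard library only. The value 1e-4 is purely illustrative and is not taken from this thread, and whether this is the actual cause here is an assumption (with torch.cuda.amp the optimizer state normally stays in fp32).

```python
import struct


def to_fp16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


default_eps = to_fp16(1e-8)  # Adam's default eps in PyTorch
larger_eps = to_fp16(1e-4)   # illustrative value, not taken from the thread

print(default_eps)       # 0.0
print(larger_eps > 0.0)  # True
```

With eps rounded to zero, the denominator of the Adam update is zero whenever the second-moment estimate is zero, which yields Inf or NaN. A larger value would be passed as, for example, `torch.optim.Adam(params, eps=1e-4)`.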

If it still does not work, try a newer spconv version (I have forked a modified implementation).