zhangboshen / A2J

Code for paper "A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image". ICCV2019

Problem while retraining A2J on NYU

Tekicho opened this issue

Hello,
Thank you for your great work! While trying to reproduce A2J by retraining on NYU, I am getting "inf" for the regression loss. My environment is:
Windows 10, CUDA 10.1, cuDNN 7.6.1, PyTorch 1.5.1.
I am using the same hyperparameters as in nyu.py. It seems that you are using a pretrained ResNet-50 in fine-tuning mode with none of its weights frozen, am I right? Your help is greatly appreciated!

Hi, @Tekicho, yes, we use an ImageNet-pretrained ResNet-50, as in our training code. I am also confused about why you get an inf error, because this error did not happen when we trained the model.

There seems to be a logical error at line 151 in src_train/anchor.py:
regression_loss += regression_diff_depth.mean()
I think it should be:

regression_loss += regression_loss_depth.mean()

@zhangboshen, can you please share the latest update of the source code for anchor.py and nyu.py for training?
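To make the suggested change concrete, here is a minimal sketch of the difference between accumulating the raw absolute difference and accumulating a loss computed from it. It assumes that regression_loss_depth is a smooth L1 transform of regression_diff_depth; that is an assumption for illustration and not confirmed from anchor.py. The helper smooth_l1 and the sample tensors are hypothetical.

```python
import torch


def smooth_l1(diff: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    """Smooth L1 penalty on an absolute difference (assumed form, not taken from anchor.py)."""
    return torch.where(diff <= beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)


# Hypothetical depth predictions and targets, for illustration only.
pred = torch.tensor([0.2, 3.0, -1.5])
target = torch.zeros(3)

regression_diff_depth = torch.abs(target - pred)          # raw absolute error
regression_loss_depth = smooth_l1(regression_diff_depth)  # penalised error

raw_term = regression_diff_depth.mean()    # what the quoted line 151 accumulates
loss_term = regression_loss_depth.mean()   # what the suggested fix accumulates

print(raw_term.item(), loss_term.item())
```

On finite inputs like these, both terms are finite, so under this assumption the swap changes the value of the loss but would probably not explain an inf on its own.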

Finally solved the problem by setting:
torch.backends.cudnn.enabled = False
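For anyone hitting the same problem, a minimal sketch of how the workaround could be applied, together with a small guard that stops training as soon as a loss becomes non-finite. The helper assert_finite and the sample values are hypothetical and not part of the A2J code; only the torch.backends.cudnn.enabled flag comes from the thread.

```python
import torch

# Workaround reported above: disable cuDNN before building the model and
# running any forward pass. This can slow down training on GPU.
torch.backends.cudnn.enabled = False


def assert_finite(name: str, value: torch.Tensor) -> torch.Tensor:
    """Raise as soon as a loss contains inf or NaN (hypothetical helper)."""
    if not torch.isfinite(value).all():
        raise FloatingPointError(f"{name} is not finite: {value}")
    return value


# A finite loss passes through unchanged.
ok = assert_finite("regression_loss", torch.tensor(0.37))

# An infinite loss is caught immediately instead of propagating into backward().
try:
    assert_finite("regression_loss", torch.tensor(float("inf")))
    caught = False
except FloatingPointError:
    caught = True
```

In a training loop, the guard would be called on the regression loss right before loss.backward(), which helps narrow down the iteration where the inf first appears.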