FID score
guoliangq opened this issue · comments
Hi @gl0513,
Do you calculate FID by launching a separate python test.py process?
No, I calculated FID with cifar_train.py.
I would suggest using test.py instead. That said, train.py should not produce NaN. Could you try test.py again? Just pass the checkpoint path via --load_path.
@yueruchen Hi! Any updates on this issue? I cloned the repo and ran the cifar_train.py script with the vanilla version, and I still get the NaN error. May I ask what environment you use for your runs? We run it on an A100 GPU with all the Python packages from your requirements.txt.
Regarding your suggestion to use test.py: I think it is necessary to include validation of the model so that we can track its performance. Could you kindly take a look and see whether you can reproduce and fix this issue? Thanks a lot!
Hi @yzhwang,
I'm unable to reproduce this NaN on my side, so I would encourage you to run test.py as a separate process. You can still use it to track performance during training, since train.py saves a checkpoint every epoch and test.py loads the checkpoint automatically.
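For a code base whose evaluation script does not pick up checkpoints on its own, the two-job setup described above could be approximated with a small watcher that runs alongside training. This is only a sketch under assumptions: the `*.pth` file pattern, the `checkpoints` directory name, and the helper names `new_checkpoints`, `eval_command`, and `watch` are hypothetical and not taken from this repo. Only the `--load_path` flag comes from the thread.

```python
import subprocess
import sys
import time
from pathlib import Path


def new_checkpoints(ckpt_dir, seen, pattern="*.pth"):
    """Return checkpoints in ckpt_dir that are not in `seen`, oldest first."""
    # The "*.pth" pattern is an assumption; adjust it to the real file names.
    found = sorted(Path(ckpt_dir).glob(pattern), key=lambda p: p.stat().st_mtime)
    return [p for p in found if p not in seen]


def eval_command(ckpt_path, script="test.py"):
    """Build the command line for one evaluation run on a single checkpoint."""
    return [sys.executable, script, "--load_path", str(ckpt_path)]


def watch(ckpt_dir, poll_seconds=60):
    """Poll ckpt_dir forever and evaluate each new checkpoint exactly once."""
    seen = set()
    while True:
        for ckpt in new_checkpoints(ckpt_dir, seen):
            # check=False: a failed evaluation should not stop the watcher.
            subprocess.run(eval_command(ckpt), check=False)
            seen.add(ckpt)
        time.sleep(poll_seconds)
```

Calling `watch("checkpoints")` in a second terminal would then evaluate each checkpoint as it appears. One caveat: a file that is still being written could be picked up too early, so saving to a temporary name and renaming afterwards may be safer.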
Thanks Yueru, that is exactly what I'm doing right now.
Have you solved this problem? I face the same problem when I run another code base written in PyTorch.
Hi @Jamie-Cheung,
Sorry, it is not solved. The best way is still to run two separate jobs.