NVIDIA / vid2vid

PyTorch implementation of our method for high-resolution (e.g. 2048x1024) photorealistic video-to-video translation.

How to resume training on Colab after a session timeout?

kartikJ-9 opened this issue

Decent results require 3k-5k frames. My GPU session on Colab gets disconnected because of usage limits while training. I am saving the checkpoints to Google Drive. Is there any way to resume training from a particular epoch? My data is a sequence of images extracted from a video. I am new to PyTorch; somebody suggested saving the weights at each epoch and continuing from that checkpoint.

I have the same issue. Please help!

I need help with this too. I use the flags --continue_train and --which_epoch, but no matter what number I pass, training starts from epoch 1.
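
For anyone new to the idea of "save the weights each epoch and continue from that checkpoint", here is a minimal sketch of the bookkeeping involved. It is not vid2vid's own code: the names save_progress, load_progress, train and the file progress.json are hypothetical, and the sketch only records which epoch finished and which weights file belongs to it. In a PyTorch script the weights themselves would typically be written with torch.save(model.state_dict(), path) and restored with model.load_state_dict(torch.load(path)); that part is left as comments so the example runs with the standard library alone.

```python
import json
import os
import tempfile


def save_progress(ckpt_dir, epoch, weights_file):
    """Record the last finished epoch and its weights file (hypothetical helper)."""
    os.makedirs(ckpt_dir, exist_ok=True)
    state = {"epoch": epoch, "weights_file": weights_file}
    # Write to a temp file first, then rename, so a disconnect in the middle
    # of the write cannot leave a half-written progress record behind.
    fd, tmp_path = tempfile.mkstemp(dir=ckpt_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, os.path.join(ckpt_dir, "progress.json"))


def load_progress(ckpt_dir):
    """Return (start_epoch, weights_file); (1, None) when nothing was saved yet."""
    path = os.path.join(ckpt_dir, "progress.json")
    if not os.path.exists(path):
        return 1, None
    with open(path) as f:
        state = json.load(f)
    # Resume at the epoch after the last one that finished.
    return state["epoch"] + 1, state["weights_file"]


def train(ckpt_dir, total_epochs, stop_after=None):
    """Hypothetical training loop; returns the list of epochs it actually ran."""
    start_epoch, weights_file = load_progress(ckpt_dir)
    # A real script would restore the model from weights_file here
    # whenever it is not None.
    ran = []
    for epoch in range(start_epoch, total_epochs + 1):
        # ... one epoch of training would go here ...
        ran.append(epoch)
        # A real script would write the weights to this file first,
        # and only then record the epoch as finished.
        save_progress(ckpt_dir, epoch, f"epoch_{epoch}.pth")
        if stop_after is not None and len(ran) >= stop_after:
            break  # simulate a Colab disconnect
    return ran
```

If ckpt_dir points at a folder on the mounted Drive, the record survives the session: a first call such as train(ckpt_dir, 5, stop_after=2) runs epochs 1 and 2, and a later train(ckpt_dir, 5) in a fresh session picks up at epoch 3. If a training script always restarts at epoch 1, one thing worth checking is whether it keeps a progress record like this somewhere and whether that record is actually being saved to Drive along with the weights.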