MaybeShewill-CV / attentive-gan-derainnet

Unofficial TensorFlow implementation of the "Attentive Generative Adversarial Network for Raindrop Removal from A Single Image" (CVPR 2018) model. https://maybeshewill-cv.github.io/attentive-gan-derainnet/

Can't load checkpoint

nguyennampfiev opened this issue

Hello,
I want to continue training on new data from your checkpoint, following the command in README.md. Training from scratch and testing both work fine. But when I try to use your checkpoint to train on new data, the checkpoint fails to load. Is your checkpoint different from the current model? Below is the log from my run. I have not changed any parameter or the architecture in the config file.
"tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [1280,1024] rhs shape= [768,1024]"
Nam

@nguyennampfiev That's caused by a mismatch between the current model and the uploaded ckpt weights :)
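A mismatch like this can be narrowed down by comparing the variable names and shapes the current graph expects against what the checkpoint actually stores. The following is a minimal sketch, not part of the repository: `diff_variables` and `checkpoint_shapes` are hypothetical helper names, and the variable names in the usage lines are made up for illustration. `tf.train.list_variables` is the TensorFlow call that lists `(name, shape)` pairs from a checkpoint.

```python
def diff_variables(graph_shapes, ckpt_shapes):
    """Compare two {variable_name: shape} mappings.

    Returns (missing, mismatched):
      missing    - names the graph expects but the checkpoint lacks
      mismatched - {name: (graph_shape, ckpt_shape)} for differing shapes
    """
    missing = sorted(name for name in graph_shapes if name not in ckpt_shapes)
    mismatched = {
        name: (tuple(graph_shapes[name]), tuple(ckpt_shapes[name]))
        for name in graph_shapes
        if name in ckpt_shapes
        and tuple(graph_shapes[name]) != tuple(ckpt_shapes[name])
    }
    return missing, mismatched


def checkpoint_shapes(ckpt_path):
    """Read {name: shape} from a checkpoint. Requires TensorFlow."""
    # Imported lazily so diff_variables can be used without TensorFlow.
    import tensorflow as tf
    return {name: tuple(shape)
            for name, shape in tf.train.list_variables(ckpt_path)}


# Usage with made-up variable names (hypothetical, for illustration only).
graph = {"fc_1/w": (1280, 1024), "conv/b": (32,)}
ckpt = {"fc_1/w": (768, 1024)}
missing, mismatched = diff_variables(graph, ckpt)
print(missing)      # names that would trigger a "Key ... not found" error
print(mismatched)   # names that would trigger an "Assign requires shapes" error
```

In a TF 1.x session script, the graph side could be collected with something like `{v.op.name: v.shape.as_list() for v in tf.global_variables()}` after the model has been built; that step is an assumption about how the training script is structured.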

@MaybeShewill-CV So your ckpt weights don't fit the current architecture, and I can't reuse them to speed up training on new data.

@nguyennampfiev The new model has been uploaded to the model folder. You may check it there :)

@MaybeShewill-CV Sorry, I don't understand your answer. Do you mean that the model you used differs from the saved ckpt weights, or that you have updated the model in the source code?
I cloned the latest source code, but I get the same issue when loading the checkpoint, and I don't see any new update in your source code.

@MaybeShewill-CV I have successfully loaded the weights by changing the input image shape to 256x384 instead of 320x480.
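The thread does not say which layer owns the `[1280,1024]` weight, but both numbers in the error are consistent with a fully connected layer fed by a flattened feature map whose size depends on the input resolution. The sketch below assumes three stride-4 convolutions with SAME padding ending in 32 channels; these layer settings are assumptions chosen because they reproduce both figures, not something confirmed here. `flattened_dim` is a hypothetical helper.

```python
import math


def flattened_dim(height, width, strides=(4, 4, 4), channels=32):
    """Size of a flattened feature map after strided convs with SAME padding.

    With SAME padding, each stride-s convolution maps a spatial size n
    to ceil(n / s). The strides and channel count are assumed values.
    """
    for s in strides:
        height = math.ceil(height / s)
        width = math.ceil(width / s)
    return height * width * channels


print(flattened_dim(320, 480))  # 1280: 5 * 8 * 32, the shape the 320x480 graph expects
print(flattened_dim(256, 384))  # 768: 4 * 6 * 32, the shape stored in the checkpoint
```

Under these assumptions, a graph built for 320x480 inputs asks for a 1280-row weight while the checkpoint holds a 768-row one, which would explain why switching the input shape to 256x384 lets the restore succeed.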

I seem to have the same issue:

tensorflow.python.framework.errors_impl.NotFoundError: Key derain_net/attentive_rnn_loss/attentive_inference/residual_block_1/block_0_conv_1/b not found in checkpoint

I also tried to train the model from scratch with:

`python data_provider/data_feed_pipline.py --dataset_dir data/training_data_example/ --tfrecords_dir data/training_data_example/tfrecords`

`python tools/train_model.py --dataset_dir data/training_data_example/`

but if I do so I get:

I0128 10:24:42.362184 14212 train_model.py:262] Training from scratch
Traceback (most recent call last):
  File "tools/train_model.py", line 339, in <module>
    train_model(args.dataset_dir, weights_path=args.weights_path)
  File "tools/train_model.py", line 272, in train_model
    encoding='latin1').item()
  File "/u/home/systems/dauria/.conda/envs/tensorflow_gpu_py3.6/lib/python3.6/site-packages/numpy/lib/npyio.py", line 384, in load
    fid = open(file, "rb")
FileNotFoundError: [Errno 2] No such file or directory: './data/vgg16.npy'

Where can I get the file vgg16.npy?

@nguyennampfiev Can you please clarify how you were able to load the weights by changing the shape of the image? Do you mean that by changing the size of the input image (presumably data/test_data/test_1.png) you were able to run the following command?

python tools/test_model.py --weights_path model/derain_gan/derain_gan.ckpt-100000 --image_path data/test_data/test_1.png

@rdauria You can find "vgg16.npy" by searching for the file name online; see https://github.com/MarvinTeichmann/tensorflow-fcn/issues/36 for example. After that, put it in the "data" folder, following the structure in the guideline.
About training the model:

  • To continue training from the pretrained model of @MaybeShewill-CV, you need to resize your images to 256*384. You can trace the graph to infer the input image size.

  • For training from scratch or on another dataset, you can set an arbitrary image size once you have downloaded "vgg16.npy".
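Before launching a run from scratch, it can help to check that "vgg16.npy" is where the training script looks for it. The sketch below mirrors the `np.load(..., encoding='latin1').item()` call visible in the traceback; `load_vgg16_weights` is a hypothetical helper, and the default path is taken from the error message. The `.item()` call suggests the file holds a pickled Python object, so `allow_pickle=True` is added here because NumPy 1.16.3 and later refuse to load pickled data without it. Only load such files from a source you trust.

```python
import os

import numpy as np


def load_vgg16_weights(path="./data/vgg16.npy"):
    """Load the pickled object stored in vgg16.npy, with a clearer error."""
    if not os.path.isfile(path):
        raise FileNotFoundError(
            "{} not found; download vgg16.npy and place it in the data "
            "folder before training from scratch".format(path)
        )
    # encoding='latin1' matches the call in the traceback; allow_pickle=True
    # is needed on NumPy >= 1.16.3 to read pickled object arrays.
    return np.load(path, encoding="latin1", allow_pickle=True).item()
```

Calling `load_vgg16_weights()` from the repository root should either return the stored object or fail early with a message that names the missing file.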