Can't load checkpoint

Question

nguyennampfiev opened this issue 5 years ago

Hello,
I want to continue training on new data from your checkpoint, following the command in README.md. Training and testing from scratch work fine, but when I try to use your checkpoint to train on new data, the checkpoint fails to load. Is your checkpoint different from the current model? Below is the log from my run. I have not changed any parameter or the architecture in the config file.
"tensorflow.python.framework.errors_impl.InvalidArgumentError: Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Assign requires shapes of both tensors to match. lhs shape= [1280,1024] rhs shape= [768,1024]"
Nam
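
The error above names one tensor whose shape in the current graph ([1280,1024]) differs from the shape stored in the checkpoint ([768,1024]). A minimal sketch for listing every such mismatch at once, assuming both sides are available as name-to-shape dictionaries. The helper `find_shape_mismatches` and the variable names `fc/w` and `fc/b` are hypothetical; in TensorFlow the checkpoint side could be built from `tf.train.list_variables(checkpoint_path)`, which returns (name, shape) pairs.

```python
def find_shape_mismatches(graph_shapes, ckpt_shapes):
    """Return {name: (graph_shape, ckpt_shape)} for variables found in both
    dictionaries whose shapes differ."""
    mismatches = {}
    for name, graph_shape in graph_shapes.items():
        ckpt_shape = ckpt_shapes.get(name)
        if ckpt_shape is not None and list(ckpt_shape) != list(graph_shape):
            mismatches[name] = (list(graph_shape), list(ckpt_shape))
    return mismatches


# Hypothetical variable names; the shapes are the ones from the error message.
graph_shapes = {"fc/w": [1280, 1024], "fc/b": [1024]}
ckpt_shapes = {"fc/w": [768, 1024], "fc/b": [1024]}

for name, (in_graph, in_ckpt) in find_shape_mismatches(graph_shapes, ckpt_shapes).items():
    print(f"{name}: graph {in_graph} vs checkpoint {in_ckpt}")
```

A mismatch confined to a fully connected layer usually points at a different input size rather than a different architecture.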

MaybeShewill-CV · Answer 1 · Wed Nov 27 2019 10:59:39 GMT+0800 (China Standard Time)

@nguyennampfiev That's caused by a mismatch between the current model and the uploaded ckpt weights :)

nguyennampfiev · Answer 2 · Wed Nov 27 2019 11:29:15 GMT+0800 (China Standard Time)

@MaybeShewill-CV So your ckpt weights do not fit the current architecture, and I can't reuse them to speed up training on new data.

MaybeShewill-CV · Answer 3 · Wed Nov 27 2019 13:43:13 GMT+0800 (China Standard Time)

@nguyennampfiev The new model has been uploaded to the model folder. You can check it there :)

nguyennampfiev · Answer 4 · Wed Nov 27 2019 16:42:11 GMT+0800 (China Standard Time)

@MaybeShewill-CV Sorry, I don't understand your answer.

"The new model has been uploaded to the model folder": does that mean the model you used differs from the saved ckpt weights, or that you have updated the model in the source code?
I cloned the latest source code, but I get the same error when loading the checkpoint. I don't see any new update in your source code.

MaybeShewill-CV · Answer 5 · Wed Nov 27 2019 18:26:37 GMT+0800 (China Standard Time)

@nguyennampfiev The model weights file can be found here: https://github.com/MaybeShewill-CV/attentive-gan-derainnet/tree/master/model/derain_gan

nguyennampfiev · Answer 6 · Wed Nov 27 2019 23:26:56 GMT+0800 (China Standard Time)

@MaybeShewill-CV I successfully loaded the weights by changing the input image shape to 256x384 instead of 320x480.
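
The two sizes line up with the shapes in the original error if the layer in question flattens a feature map that has been downsampled by a total stride of 64 (rounding up) and has 32 channels. Both numbers are assumptions inferred from the error message, not something confirmed in this thread. A small sketch of that arithmetic, with `flattened_size` as a hypothetical helper:

```python
import math


def flattened_size(height, width, total_stride=64, channels=32):
    """Number of features after downsampling by total_stride (rounding up)
    and flattening. total_stride and channels are assumed values."""
    return math.ceil(height / total_stride) * math.ceil(width / total_stride) * channels


print(flattened_size(320, 480))  # 5 * 8 * 32 = 1280, the shape in the current graph
print(flattened_size(256, 384))  # 4 * 6 * 32 = 768, the shape in the checkpoint
```

Under those assumptions, 320x480 input produces the 1280 rows the graph expects, while the checkpoint was saved with the 768 rows that 256x384 input produces.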

rd · Answer 7 · Wed Jan 29 2020 02:29:58 GMT+0800 (China Standard Time)

I seem to have the same issue:

tensorflow.python.framework.errors_impl.NotFoundError: Key derain_net/attentive_rnn_loss/attentive_inference/residual_block_1/block_0_conv_1/b not found in checkpoint

I also tried to train the model from scratch with:

`python data_provider/data_feed_pipline.py --dataset_dir data/training_data_example/ --tfrecords_dir data/training_data_example/tfrecords`

`python tools/train_model.py --dataset_dir data/training_data_example/`

but if I do so I get:

I0128 10:24:42.362184 14212 train_model.py:262] Training from scratch
Traceback (most recent call last):
  File "tools/train_model.py", line 339, in <module>
    train_model(args.dataset_dir, weights_path=args.weights_path)
  File "tools/train_model.py", line 272, in train_model
    encoding='latin1').item()
  File "/u/home/systems/dauria/.conda/envs/tensorflow_gpu_py3.6/lib/python3.6/site-packages/numpy/lib/npyio.py", line 384, in load
    fid = open(file, "rb")
FileNotFoundError: [Errno 2] No such file or directory: './data/vgg16.npy'

Where can I get the file vgg16.npy?

@nguyennampfiev can you please clarify how you were able to successfully load the weights by changing the shape of the image? Do you mean that by changing the size of the input image (presumably: data/test_data/test_1.png) you were able to run:

python tools/test_model.py --weights_path model/derain_gan/derain_gan.ckpt-100000 --image_path data/test_data/test_1.png

?
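
A NotFoundError like the one quoted at the top of this comment means the graph asks for a variable name that the checkpoint does not contain. A minimal sketch for comparing the two sets of names, assuming they have already been collected as plain lists. `diff_variable_names` is a hypothetical helper and the checkpoint-side name below is invented for illustration; with TensorFlow, the checkpoint names could come from `tf.train.list_variables(checkpoint_path)`.

```python
def diff_variable_names(graph_names, ckpt_names):
    """Return (missing_in_ckpt, unused_in_ckpt) as sorted lists of names."""
    graph_set, ckpt_set = set(graph_names), set(ckpt_names)
    return sorted(graph_set - ckpt_set), sorted(ckpt_set - graph_set)


# The graph-side name is the key from the error; the checkpoint-side name is made up.
graph_names = [
    "derain_net/attentive_rnn_loss/attentive_inference/residual_block_1/block_0_conv_1/b",
]
ckpt_names = [
    "derain_net/hypothetical_scope/residual_block_1/block_0_conv_1/b",
]

missing_in_ckpt, unused_in_ckpt = diff_variable_names(graph_names, ckpt_names)
print("Expected by graph but not in checkpoint:", missing_in_ckpt)
print("In checkpoint but not used by graph:", unused_in_ckpt)
```

If the two lists differ only by a scope prefix, the graph was probably built through a different code path than the one that saved the checkpoint.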

nguyennampfiev · Answer 8 · Wed Jan 29 2020 16:56:23 GMT+0800 (China Standard Time)

@rdauria You can find "vgg16.npy" by searching for the file name online; see https://github.com/MarvinTeichmann/tensorflow-fcn/issues/36 for example. Then put it in the "data" folder, following the structure in the guideline.
About training the model:

To continue training from @MaybeShewill-CV's pretrained model, you need to resize your images to 256x384. You can trace the graph to infer the expected input image size.
To train from scratch or on another dataset, you can use an arbitrary image size once you have downloaded "vgg16.npy".
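
Since training from scratch fails late with FileNotFoundError when the VGG weights are absent, it can help to check for them before launching. A minimal sketch using only the standard library; the path `./data/vgg16.npy` is taken from the traceback earlier in this thread, and `missing_files` is a hypothetical helper, not part of the repository.

```python
import os


def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [path for path in paths if not os.path.isfile(path)]


# Path taken from the traceback above; run this from the repository root.
required = ["./data/vgg16.npy"]

missing = missing_files(required)
if missing:
    print("Missing before training:", ", ".join(missing))
else:
    print("All required files found.")
```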