junyongyou / triq

TRIQ implementation

Shape error when running image_quality_prediction.py

zhx opened this issue · comments

Thanks for your work, it is very cool. I tested a JPG image with size 1919 × 1440, and it shows:

tensorflow.python.framework.errors_impl.InvalidArgumentError:  Incompatible shapes: [1,661,32] vs. [1,193,32]
	 [[node model/tri_q_image_quality_transformer/add_1 (defined at /data2/zhx3/triq/src/models/transformer_iqa.py:198) ]] [Op:__inference_predict_function_9663]

Errors may have originated from an input operation.
Input Source operations connected to node model/tri_q_image_quality_transformer/add_1:
 model/tri_q_image_quality_transformer/concat (defined at /data2/zhx3/triq/src/models/transformer_iqa.py:194)

Function call stack:
predict_function

This is because the input image has a very large resolution, much larger than the image set the model was trained on. If you look at the method create_triq_model in triq_model.py, maximum_position_encoding=193, while the position encoding of your image is 661. A quick and dirty way is to increase this argument to 661, for example. But I am not sure it will work, as the model has not been trained with such big images. Give it a try and please let me know what you get.
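For reference, a minimal sketch of that quick-and-dirty change. The import path, n_quality_levels, and input_shape values are assumptions, not verified against the repo; check the actual signature of create_triq_model in triq_model.py. (The value 193 presumably corresponds to the 16 × 12 spatial positions of a 1024 × 768 input after 64× downsampling, plus one extra quality token, which is why a 1919 × 1440 input overflows it.)

```python
# Hypothetical sketch: rebuild TRIQ with a larger positional-encoding
# capacity so the 661 positions of a 1919 x 1440 input fit. All arguments
# except maximum_position_encoding are assumed defaults.
from models.triq_model import create_triq_model  # import path may differ

model = create_triq_model(n_quality_levels=5,           # assumption
                          input_shape=(None, None, 3),  # assumption
                          maximum_position_encoding=661)
model.load_weights(model_weights_path)
```

Note that loading the pretrained weights into the resized positional embedding may still fail with a shape mismatch, which is what the reply below runs into.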

I don't think it can work:

Traceback (most recent call last):
  File "examples/image_quality_prediction.py", line 24, in <module>
    predict_mos = predict_image_quality(model_weights_path, image_path)
  File "examples/image_quality_prediction.py", line 13, in predict_image_quality
    model.load_weights(model_weights_path)
  File "/data2/zhx3/env_python3.7_pytorch1.5/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 250, in load_weights
    return super(Model, self).load_weights(filepath, by_name, skip_mismatch)
  File "/data2/zhx3/env_python3.7_pytorch1.5/lib/python3.7/site-packages/tensorflow/python/keras/engine/network.py", line 1266, in load_weights
    hdf5_format.load_weights_from_hdf5_group(f, self.layers)
  File "/data2/zhx3/env_python3.7_pytorch1.5/lib/python3.7/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 707, in load_weights_from_hdf5_group
    K.batch_set_value(weight_value_tuples)
  File "/data2/zhx3/env_python3.7_pytorch1.5/lib/python3.7/site-packages/tensorflow/python/keras/backend.py", line 3384, in batch_set_value
    x.assign(np.asarray(value, dtype=dtype(x)))
  File "/data2/zhx3/env_python3.7_pytorch1.5/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 846, in assign
    self._shape.assert_is_compatible_with(value_tensor.shape)
  File "/data2/zhx3/env_python3.7_pytorch1.5/lib/python3.7/site-packages/tensorflow/python/framework/tensor_shape.py", line 1117, in assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (1, 661, 32) and (1, 193, 32) are incompatible

The max size may be 1024 × 768. OK, I will randomly crop the image to 1024 × 768 and take the mean MOS value.
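A sketch of that crop-and-average workaround; the crop count, the normalization, and the assumption that the loaded model returns a single MOS per image are all mine, not from the repo:

```python
# Hypothetical crop-and-average workaround: predict MOS on random
# 1024 x 768 crops of a large image and average the scores.
import numpy as np
from PIL import Image

def predict_mean_mos(model, image_path, n_crops=10, crop_hw=(768, 1024)):
    """Average MOS over random crops.

    Assumes the model returns one scalar MOS per image; if it outputs a
    5-level quality distribution instead, convert it to MOS first.
    """
    image = np.asarray(Image.open(image_path).convert('RGB'), dtype=np.float32)
    h, w = image.shape[:2]
    ch, cw = crop_hw
    scores = []
    for _ in range(n_crops):
        top = np.random.randint(0, h - ch + 1)
        left = np.random.randint(0, w - cw + 1)
        crop = image[top:top + ch, left:left + cw, :]
        batch = crop[np.newaxis, ...] / 127.5 - 1.0  # normalization is an assumption
        scores.append(float(np.squeeze(model.predict(batch))))
    return float(np.mean(scores))
```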

Then it might be the only way for now, even though patching the image and averaging is the approach I wanted to avoid. The model can be trained with maximum_position_encoding set to a larger value while still using small-resolution images. However, I am not sure that will work, because the positional weights beyond 193 will probably not be trained well.

There is another approach to solve the problem: you can change line 146 in transformer_iqa.py to increase the pooling size. For example, you can change it to self.pooling_small = MaxPool2D(pool_size=(4, 4)) or even larger, as sketched below.
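As a patch to the existing file, the change would look roughly like this; only the (4, 4) value comes from the suggestion above, and the presumed original (2, 2) is an assumption:

```python
# In transformer_iqa.py, around line 146 (inside the model class):
# a larger pool collapses the backbone feature map to fewer spatial
# positions, keeping the sequence length within maximum_position_encoding.
# self.pooling_small = MaxPool2D(pool_size=(2, 2))  # presumed original
self.pooling_small = MaxPool2D(pool_size=(4, 4))    # suggested change
```

With (4, 4) instead of (2, 2) pooling, each spatial dimension shrinks by a further factor of 2, so the 661 positions of a 1919 × 1440 input would drop to roughly 166, back under the 193 limit.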

OK, I will try. Thanks for your reply.

If possible, please also let me know your results. I am very curious. Thanks.