bourdakos1 / Custom-Object-Detection

Custom Object Detection with TensorFlow

Home Page:https://medium.freecodecamp.org/tracking-the-millenium-falcon-with-tensorflow-c8c86419225e

InvalidArgumentError: Assign requires shapes of both tensors to match.

Kongsea opened this issue

The only changes I made were creating the tfrecord files from my own dataset and updating num_classes in faster_rcnn_resnet101.config to match.

When I then ran the training code, it raised the following error:

Caused by op u'save_1/Assign_815', defined at:
File "object_detection/train.py", line 198, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "object_detection/train.py", line 194, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/Custom-Object-Detection/object_detection/trainer.py", line 281, in train
keep_checkpoint_every_n_hours=keep_checkpoint_every_n_hours)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1218, in init
self.build()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1227, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1263, in _build
build_save=build_save, build_restore=build_restore)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 751, in _build_internal
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 439, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 160, in restore
self.op.get_shape().is_fully_defined())
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 276, in assign
validate_shape=validate_shape)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 57, in assign
use_locking=use_locking, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1470, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [584] rhs shape= [8]
[[Node: save_1/Assign_815 = Assign[T=DT_FLOAT, _class=["loc:@SecondStageBoxPredictor/BoxEncodingPredictor/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](SecondStageBoxPredictor/BoxEncodingPredictor/biases/Momentum, save_1/RestoreV2_815)]]

It seems restoring the model failed. My own dataset has 146 classes, so the new shape 584 = 146 * 4 does not match the original 2 classes * 4 = 8.
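The arithmetic behind the two shapes in the error message can be checked with a short sketch. It assumes, as inferred from the error above rather than from the library source, that the second-stage box predictor holds 4 box coordinates per class; `box_encoding_size` is a hypothetical helper name, not part of the Object Detection API.

```python
# Assumption (inferred from the error message in this thread): the box
# predictor's bias vector has one entry per box coordinate per class.
BOX_COORDS = 4


def box_encoding_size(num_classes):
    """Return the expected length of the box-encoding bias vector."""
    return num_classes * BOX_COORDS


graph_size = box_encoding_size(146)      # shape built from the new config
checkpoint_size = box_encoding_size(2)   # shape stored for the original 2 classes

print("graph:", graph_size, "checkpoint:", checkpoint_size)
print("shapes match:", graph_size == checkpoint_size)
```

If the two numbers differ, the restore step cannot assign the saved tensor to the new variable, which is what the "lhs shape" and "rhs shape" values in the error report.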

Hmm, did you create new tf records first? Maybe it’s still pointing at the old one?

I deleted all the old tf records and created new ones for my dataset.
In the end, I found the cause: I had missed changing the class number in one place. It works now that I have changed it.
Thank you.

Awesome :)

commented

@Kongsea Hi Kongsea, I have exactly the same problem as you. Would you mind letting me know where else we need to change the number of classes besides the .config file?
Thank you

Search for the original class number, 2, and the corresponding number of bbox coordinates, 8 (2 * 4), and replace both with the values for your dataset.

commented

Hi, thank you for the quick response. What are the specific variable names? I didn't find original_class_number or bbox_coordinates_number. Thank you.

I mean search for the numbers 2 and 8 and replace each of them. I am sorry, I cannot remember the specific parameters, so you will need to search for them yourself.
Be careful to replace only the numbers related to the class number and the bbox coordinates number.
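Since the exact parameter names were not recalled here, one way to do that search is a small script that lists every line containing a standalone 2 or 8. This is only a sketch for Python 3: the function name `find_number_mentions` and the default file extensions are assumptions for illustration, and each hit still has to be reviewed by hand, because most of them will be unrelated to the class count.

```python
import os
import re


def find_number_mentions(root, numbers=(2, 8),
                         extensions=(".config", ".pbtxt", ".py")):
    """List (path, line number, line) for lines with a standalone number.

    The lookarounds skip digits that are part of a longer token,
    such as 0.2, 18, or conv2d.
    """
    alternatives = "|".join(str(n) for n in numbers)
    pattern = re.compile(r"(?<![\w.])(?:%s)(?![\w.])" % alternatives)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "r", errors="ignore") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if pattern.search(line):
                        hits.append((path, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # "." is a placeholder; point it at your own checkout.
    for path, lineno, text in find_number_mentions("."):
        print("%s:%d: %s" % (path, lineno, text))
```

The script only reports candidates and changes nothing, so it is safe to run before deciding which values to edit.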

TF creates a "checkpoint" file. There might be one provided with the frozen inference graph; try deleting it, which should fix the problem.
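To see whether such files are present before deleting anything, a sketch like the following lists them. It assumes the training directory is the one passed as train_dir and that checkpoint data files use the common `model.ckpt` prefix; `list_checkpoint_files` is a hypothetical helper, and the directory name in the usage section is a placeholder.

```python
import glob
import os


def list_checkpoint_files(train_dir):
    """Return the sorted checkpoint-related files found in train_dir.

    Looks for the "checkpoint" index file and for files whose names
    start with "model.ckpt" (assumed prefix).
    """
    patterns = ("checkpoint", "model.ckpt*")
    found = []
    for pattern in patterns:
        found.extend(glob.glob(os.path.join(train_dir, pattern)))
    return sorted(found)


if __name__ == "__main__":
    # "train" is a placeholder directory name.
    for path in list_checkpoint_files("train"):
        print(path)
```

Anything listed was saved by an earlier run, possibly with a different number of classes, so it is worth reviewing (and backing up if needed) before removing it.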

@gingerhead22 @Kongsea were you able to find the specific parameters that you had to update? If so, can you list them?