Error initializing and restoring PolyRNN

Question

ennauata opened this issue 6 years ago

Hello, I have tried running the PolyRNN++ demos following the instructions in the README. However, I keep getting an error when trying to restore the PolyRNN metagraph. I would appreciate any help on this. More details on the error follow:

Code:
# Initializing and restoring PolyRNN++
model = PolygonModel(PolyRNN_metagraph, polyGraph)
model.register_eval_fn(lambda input_: evaluator.do_test(evalSess, input_))
polySess = tf.Session(config=tf.ConfigProto(
    allow_soft_placement=True
), graph=polyGraph)
model.saver.restore(polySess, PolyRNN_checkpoint)

Error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
1 #Initializing and restoring PolyRNN++
----> 2 model = PolygonModel(PolyRNN_metagraph, polyGraph)
3 model.register_eval_fn(lambda input_: evaluator.do_test(evalSess, input_))
4 polySess = tf.Session(config=tf.ConfigProto(
5 allow_soft_placement=True

/media/nelson/Workspace1/Projects/building_reconstruction/polyrnn/src/PolygonModel.py in __init__(self, meta_graph_path, graph)
30 self.saver = None
31 self.eval_pred_fn = None
---> 32 self._restore_graph(meta_graph_path)
33
34 def _restore_graph(self, meta_graph_path):

/media/nelson/Workspace1/Projects/building_reconstruction/polyrnn/src/PolygonModel.py in _restore_graph(self, meta_graph_path)
34 def _restore_graph(self, meta_graph_path):
35 with self.graph.as_default():
---> 36 self.saver = tf.train.import_meta_graph(meta_graph_path, clear_devices=False)
37
38 def _prediction(self):

/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.pyc in import_meta_graph(meta_graph_or_file, clear_devices, import_scope, **kwargs)
1925 clear_devices=clear_devices,
1926 import_scope=import_scope,
-> 1927 **kwargs)
1928 if meta_graph_def.HasField("saver_def"):
1929 return Saver(saver_def=meta_graph_def.saver_def, name=import_scope)

/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/meta_graph.pyc in import_scoped_meta_graph(meta_graph_or_file, clear_devices, graph, import_scope, input_map, unbound_inputs_col_name, restore_collections_predicate)
739 importer.import_graph_def(
740 input_graph_def, name=(import_scope or ""), input_map=input_map,
--> 741 producer_op_list=producer_op_list)
742
743 # Restores all the other collections.

/usr/local/lib/python2.7/dist-packages/tensorflow/python/util/deprecation.pyc in new_func(*args, **kwargs)
430 'in a future version' if date is None else ('after %s' % date),
431 instructions)
--> 432 return func(*args, **kwargs)
433 return tf_decorator.make_decorator(func, new_func, 'deprecated',
434 _add_deprecated_arg_notice_to_docstring(

/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/importer.pyc in import_graph_def(graph_def, input_map, return_elements, name, op_dict, producer_op_list)
678 'Input types mismatch (expected %r but got %r)'
679 % (', '.join(dtypes.as_dtype(x).name for x in input_types),
--> 680 ', '.join(x.name for x in op._input_types))))
681 # pylint: enable=protected-access
682

ValueError: graph_def is invalid at node u'GatherTree': Input types mismatch (expected 'int32, int32, int32, int32' but got 'int32, int32, int32').

Amlan Kar · Answer 1 · Thu Apr 05 2018 01:32:14 GMT+0800 (China Standard Time)

Hi, could you mention your OS/Tensorflow Version? We could then try and reproduce this error.

ennauata · Answer 2 · Thu Apr 05 2018 01:49:17 GMT+0800 (China Standard Time)

Thanks for the quick response. I believe the problem is the TensorFlow version I am using (tf 1.7); my OS is 16.04.1-Ubuntu. I noticed that one of the requirements is tf 1.3, which does not support my current CUDA 9.0 (I guess).

Amlan Kar · Answer 3 · Thu Apr 05 2018 02:29:27 GMT+0800 (China Standard Time)

Yes, that is most probably the issue. Closing this issue now. Feel free to re-open if it doesn't work with tf1.3 and cuda-8. Meanwhile, as soon as one of us finds time, we'll try to fix the errors on the newer versions of tensorflow (we already have a rough idea of what these are).

Chris Rapson · Answer 4 · Wed Jun 27 2018 06:57:40 GMT+0800 (China Standard Time)

You now have 5 out of 11 issues related to supporting tf1.3. May I suggest keeping one of them open to show that you intend to work on it? Or open a new one with a more specific title e.g. "support new tensorflow versions". Maybe in the meantime do a quick patch to PolygonModel.__init__() to check the tf version and print a helpful error message? A nice error message would've saved me quite a bit of debugging time. (I'd be happy to do a PR for the error message if you like.)

Amlan Kar · Answer 5 · Wed Jun 27 2018 07:42:37 GMT+0800 (China Standard Time)

Thanks for the nice suggestion! Yes, a PR would be much appreciated :)

Amlan Kar · Answer 6 · Wed Jun 27 2018 07:45:01 GMT+0800 (China Standard Time)

The main problem isn't solvable through a PR, since it requires us to compile new model files that use the new tf.gather API, which we will have to do internally. But the error message would be great! Ideally it should work for <=1.3 and >= some version that I am not sure of.
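
A minimal sketch of the version check discussed above, not the code from the actual PR. The helper names (`parse_major_minor`, `check_tf_version`) and the message wording are hypothetical. Only the upper bound (1.3) is enforced, because the thread leaves the lower bound of working newer versions unknown.

```python
import re

# Assumption: 1.3 is the newest TensorFlow release the shipped model files
# are known to load on; newer working versions are not known in the thread.
MAX_SUPPORTED = (1, 3)


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '1.7.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("Unrecognised version string: %r" % (version,))
    return int(match.group(1)), int(match.group(2))


def check_tf_version(version, max_supported=MAX_SUPPORTED):
    """Raise a descriptive error if `version` is newer than `max_supported`."""
    if parse_major_minor(version) > max_supported:
        raise RuntimeError(
            "Found TensorFlow %s, but the released PolyRNN++ model files "
            "are only known to load on TensorFlow <= %d.%d. Newer versions "
            "may fail with 'Input types mismatch' at node GatherTree. "
            "Try installing tensorflow==%d.%d.0 or tensorflow-gpu==%d.%d.0."
            % ((version,) + tuple(max_supported) * 3)
        )
```

Inside `PolygonModel.__init__`, this could be called as `check_tf_version(tf.__version__)` before `self._restore_graph(meta_graph_path)`, so the user sees the hint instead of the `import_graph_def` traceback.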

Chris Rapson · Answer 7 · Wed Jun 27 2018 12:52:31 GMT+0800 (China Standard Time)

Something else I realised while making the PR: in my case, I'm running on a PC that doesn't have a GPU (terrible, I know), so the tensorflow-gpu==1.3.0 in requirements.txt didn't do anything. When I first tried ./src/demo_inference.sh I got an error message that tensorflow was missing. So I pip installed it, and it defaulted to the most recent version. Otherwise it probably would've just worked out of the box. Do you want me to add tensorflow==1.3.0 to the requirements.txt, or would that be a problem for the majority of users, who probably have a GPU?

Amlan Kar · Answer 8 · Wed Jun 27 2018 12:57:45 GMT+0800 (China Standard Time)

Yes, you are right. The clean way to solve this would be an install.py that calls pip internally and automatically decides whether to install tensorflow or tensorflow-gpu. Since the expectation is that the current version will be run on a GPU, we did not change much. Maybe the README line for CPU could be extended to say: change the line in requirements.txt to tensorflow==1.3.0.
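
A rough sketch of what such an install.py might look like. No such script exists in the thread, so every function name here is hypothetical. It assumes Python 3 (for `shutil.which`) and treats a visible `nvidia-smi` binary as a stand-in for "GPU present", which is only a heuristic. By default it prints the pip command; it only runs pip when given the made-up `--install` flag.

```python
import shutil
import subprocess
import sys

TF_VERSION = "1.3.0"  # version pinned in requirements.txt per the thread


def has_nvidia_gpu():
    """Heuristic (assumption): an `nvidia-smi` binary on PATH means a GPU."""
    return shutil.which("nvidia-smi") is not None


def choose_tf_package(gpu_available):
    """Pick the pip requirement string for the GPU or CPU build."""
    name = "tensorflow-gpu" if gpu_available else "tensorflow"
    return "%s==%s" % (name, TF_VERSION)


def pip_command(package):
    """Build a pip invocation tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


def main(argv):
    command = pip_command(choose_tf_package(has_nvidia_gpu()))
    if "--install" in argv:
        # Hypothetical flag: actually run pip only when asked explicitly.
        subprocess.check_call(command)
    else:
        print("Would run: " + " ".join(command))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Running `python install.py` shows which package would be chosen, and `python install.py --install` performs the installation. Using `sys.executable -m pip` keeps the install in the same environment as the interpreter that runs the script.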