FangGet / tf-monodepth2

Tensorflow implementation(unofficial) of "Digging into Self-Supervised Monocular Depth Prediction"

Not able to train

indhu26 opened this issue

Hi all,

Conda env details -

cudatoolkit 9.0
cudnn 7.1.2
tensorflow-gpu 1.6.0

Data Preprocessing -

I followed the same command that is given in the repo.
0000000001

During training I hit an error and cannot work out what causes it.
Or am I missing something else, @FangGet?

The error is attached below:
Traceback (most recent call last):
  File "monodepth2.py", line 63, in <module>
    args.func(config, output_dir, args)
  File "monodepth2.py", line 17, in _cli_train
    monodepth2_learner.train(output_dir)
  File "/home/DEPTH_MODEL/tf-monodepth2/model/monodepth2_learner.py", line 405, in train
    self.save(sess, ckpt_dir, 'latest')
  File "/home/anaconda3/envs/tf_mono/lib/python3.6/contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/anaconda3/envs/tf_mono/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 1000, in managed_session
    self.stop(close_summary_writer=close_summary_writer)
  File "/home/anaconda3/envs/tf_mono/lib/python3.6/site-packages/tensorflow/python/training/supervisor.py", line 828, in stop
    ignore_live_threads=ignore_live_threads)
  File "/home/anaconda3/envs/tf_mono/lib/python3.6/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/home/anaconda3/envs/tf_mono/lib/python3.6/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/anaconda3/envs/tf_mono/lib/python3.6/site-packages/tensorflow/python/training/queue_runner_impl.py", line 252, in _run
    enqueue_callable()
  File "/home/anaconda3/envs/tf_mono/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1259, in _single_operation_run
    None)
  File "/home/anaconda3/envs/tf_mono/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Expected size[1] in [0, 608], but got 640
     [[Node: data_loading/Slice = Slice[Index=DT_INT32, T=DT_UINT8, _device="/job:localhost/replica:0/task:0/device:CPU:0"](data_loading/DecodeJpeg, data_loading/Slice_4/begin, data_loading/Slice_3/size)]]

I believe this is a resolution mismatch. As a first step, check which resolution you used when preprocessing the data and which resolution the training config specifies; the two should match.
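A quick way to make that comparison concrete is sketched below. It assumes the preprocessing step writes each training sample as several frames concatenated side by side (three frames here, which is a guess and not confirmed by the repo), so the stored width should equal `image_width` times the number of frames. The function names and the `num_frames` parameter are hypothetical helpers, not part of tf-monodepth2.

```python
def expected_sequence_size(image_width, image_height, num_frames=3):
    """Size (width, height) a horizontally concatenated sample should have.

    ASSUMPTION: samples are stored as `num_frames` frames side by side.
    """
    return image_width * num_frames, image_height


def check_resolution(stored_size, image_width, image_height, num_frames=3):
    """Return None when the sizes agree, otherwise a short mismatch message."""
    expected = expected_sequence_size(image_width, image_height, num_frames)
    if tuple(stored_size) == expected:
        return None
    return "stored %dx%d, but config implies %dx%d" % (
        stored_size[0], stored_size[1], expected[0], expected[1])


# A stored sample that matches a 640 x 192 config passes the check.
print(check_resolution((1920, 192), image_width=640, image_height=192))
# A sample preprocessed at a smaller resolution does not.
print(check_resolution((1248, 128), image_width=640, image_height=192))
```

The stored size can be read from any preprocessed image with whatever image library is at hand. If the check reports a mismatch, either rerun preprocessing at the training resolution or change `image_width` and `image_height` in the config to the preprocessed values.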

I got the same problem. Could you give me some advice?
My preprocessed images are 1248 x 128. Is that correct? I did not change the size in the yaml:
image_height: 192
image_width: 640
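These numbers appear to account for the exact figures in the error message. A slice along one axis can take at most the elements that remain after its start offset. The sketch below assumes the data loader cuts a frame out of the stored image starting one frame width (640) from the left edge; that offset is an inference from the error text, not something confirmed in the repo, and `max_slice_size` is a hypothetical helper.

```python
def max_slice_size(dim_size, begin):
    """Largest slice size allowed along one axis: what is left after `begin`."""
    return dim_size - begin


stored_width = 1248   # width of the preprocessed image reported above
image_width = 640     # image_width from the yaml
begin = image_width   # ASSUMPTION: the slice starts one frame width in

limit = max_slice_size(stored_width, begin)
print(limit)                 # 608
print(image_width <= limit)  # False
```

The requested slice width of 640 exceeds the 608 columns that remain, which matches "Expected size[1] in [0, 608], but got 640". Under the same assumption, a 1248 x 128 image corresponds to frames of 416 x 128, so setting image_height: 128 and image_width: 416 in the yaml, or preprocessing again at 640 x 192, should remove the mismatch.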