KexianHust / ViTA

ViTA: Video Transformer Adaptor for Robust Video Depth Estimation

Modifying demo to use image sequences instead of video

vitacon opened this issue

Hello,
I use MiDaS in my current project and it takes a sequence of PNGs as input and the result is also saved as grayscale PNGs.
I expected ViTA to save both input and output frames to a temporary folder from where I could copy them but it does not seem to happen and all frames are kept in RAM...?

I suppose changing the code to use PNGs instead of MP4 should be rather simple but unfortunately I am no Python expert. =]
Can you give me some hints?
I suppose the main loop that goes through all frames (img_inputs) starts at line 133, right?

for i in range(0, img_num, seq_len - overlap * 2):
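To see what that loop header does, here is a small standalone sketch that lists the frames each iteration would start from. The helper name `window_spans` and the example values are hypothetical, and it assumes each window covers `seq_len` frames starting at `i`; how ViTA trims or merges the overlapping frames is not shown here.

```python
def window_spans(img_num, seq_len, overlap):
    """Return (start, end) index pairs for each window, end exclusive."""
    step = seq_len - overlap * 2  # same stride as the loop header above
    if step <= 0:
        raise ValueError("seq_len must be larger than 2 * overlap")
    spans = []
    for i in range(0, img_num, step):
        # Assumption: a window holds up to seq_len frames, clipped at the end.
        spans.append((i, min(i + seq_len, img_num)))
    return spans


if __name__ == "__main__":
    # Hypothetical values, chosen only to make the overlap visible.
    for start, end in window_spans(img_num=10, seq_len=4, overlap=1):
        print(f"frames {start}..{end - 1}")
```

With these example values the stride is 2, so consecutive windows share two frames.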

I have added a function that lets the model take an image sequence as input. Give it a try!

Thanks for the modification! =)

However, it is not quite there for me yet:

  1. It seems ViTA does not support the MiDaS switch "--grayscale"; it always uses the "Inferno" colormap instead.
  2. Even with "--format imgs", the result is always saved as MP4 and not as PNGs (it would be nice to keep the original names of the images too).
  3. Right now the output video made from PNGs does not get a proper name; the output file is called just ".mp4".

For the first question, you can add the following after line 308:

cv2.imwrite(os.path.join(output_path, 'frame_%04d.png' % (i + 1)), cv2.resize(predictions[i], dsize=(img.shape[1], img.shape[0]), interpolation=cv2.INTER_LINEAR))

For the second and third questions, you just need to put your folder under 'input_imgs', e.g., input_imgs/test/; the result will then be saved as 'test.mp4'.

Thanks! 👍

Actually, I don't need the output video, so I commented all videoWriter stuff out and I kept just this:

    for i in range(predictions.shape[0]):
        print("  exporting ", img_names[i])
        cv2.imwrite(os.path.join(output_path, img_names[i]), cv2.resize(predictions[i], dsize=(img.shape[1], img.shape[0]), interpolation=cv2.INTER_LINEAR))
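If `predictions` holds floating-point depth values (an assumption; the thread does not state their dtype or range), they may need scaling before they make useful grayscale PNGs. A minimal sketch of one way to do that with NumPy follows. The helper name `depth_to_gray16` and its `lo`/`hi` parameters are hypothetical and not part of ViTA.

```python
import numpy as np


def depth_to_gray16(depth, lo=None, hi=None):
    """Scale a 2-D depth map to uint16 (0..65535).

    lo/hi set the value range to map from; by default the frame's own
    minimum and maximum are used.
    """
    depth = np.asarray(depth, dtype=np.float64)
    lo = depth.min() if lo is None else float(lo)
    hi = depth.max() if hi is None else float(hi)
    if hi - lo < 1e-12:
        # Flat range: avoid dividing by zero and return all zeros.
        return np.zeros(depth.shape, dtype=np.uint16)
    scaled = np.clip((depth - lo) / (hi - lo), 0.0, 1.0)
    return np.round(scaled * 65535).astype(np.uint16)


if __name__ == "__main__":
    demo = np.array([[0.0, 0.5], [1.0, 2.0]])  # made-up values
    print(depth_to_gray16(demo))
```

The resulting array can be passed to `cv2.imwrite` with a `.png` name, since PNG supports 16-bit single-channel images. Scaling each frame by its own minimum and maximum can reintroduce flicker between frames, so passing one `lo`/`hi` pair computed over the whole sequence is probably the safer choice for video.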

However, I think an additional argument might be useful for other people...

And I'm glad it was worth it - MiDaS versus ViTA. =)

328-depth-side-by-side.mp4

Glad to hear that!

One more detail that might be useful to someone else: I sometimes use Czech letters in the names of my folders, and ViTA could not handle that.
Apparently it is a known problem with Unicode paths in cv2.imread and cv2.imwrite, so I had to replace these lines:

READ

        # image = cv2.imread(img_namef)
        image = cv2.imdecode(np.fromfile(img_namef, dtype=np.uint8), cv2.IMREAD_UNCHANGED)

WRITE

        # cv2.imwrite(os.path.join(output_path, img_names[i]), cv2.resize(predictions[i], dsize=(img.shape[1], img.shape[0]), interpolation=cv2.INTER_LINEAR))
        is_success, im_buf_arr = cv2.imencode(".png", cv2.resize(predictions[i], dsize=(img.shape[1], img.shape[0]), interpolation=cv2.INTER_LINEAR))
        im_buf_arr.tofile(os.path.join(output_path, img_names[i]))
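One thing to watch in the WRITE snippet: the data is always encoded as PNG, but the file keeps the name in `img_names[i]`. If the input frames were, say, JPEGs, the extension would no longer match the contents. A minimal sketch of deriving a matching output name follows; the helper name `png_name` is hypothetical.

```python
import os


def png_name(img_name):
    """Return the base name of img_name with its extension replaced by .png."""
    stem, _ext = os.path.splitext(os.path.basename(img_name))
    return stem + ".png"


if __name__ == "__main__":
    # Non-ASCII names pass through unchanged; only the extension is swapped.
    print(png_name("kůň_0001.jpg"))
```

The result could then be used in place of `img_names[i]` inside `os.path.join(output_path, ...)`.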