cleinc / bts

From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation

Asking for some advice on KITTI 3D det usage.

Cc-Hy opened this issue

commented

Hello, good job!
I'm trying to pick one of your trained models for the KITTI 3D detection task: I take in an image, use your model to generate a depth prediction map, downsample it by a factor of 4, and then do 3D detection based on it. To avoid overfitting, I'd like to freeze your depth prediction model and train only the network that comes after it. But I'm not very familiar with your network, so I'm asking for some advice. I have a few questions:
1. The images in the 3D detection task are also part of the KITTI raw data but may be a little different. The size is around 1242 * 375, for example, the depth range I care about is within 40 m, and I want a depth map of size 311 * 91. Can I use your network directly on these images?
2. I plan to use your network by creating the model, loading the pre-trained parameters, and running the forward pass, so I think I need only a few lines of your code:
model = BtsModel(params=args)
lpg8x8, lpg4x4, lpg2x2, reduc1x1, depth_est = model(image, focal)
subsequent_results = my_model(depth_est, image)
However, I found that there is a 'focal' argument in the model's forward call, and I have no idea what it is; I only use an image as input. Can I avoid using it?

commented

When I directly apply the model to an image of shape 1242*375, an error occurs.

It seems the size has to be some specific value. Otherwise the two branches may get different channel sizes in the downsample module.

@Cc-Hy Thanks for your interest in our work. Let me briefly answer your questions.

  1. Image height and width should be multiples of 32 because of downsampling in our encoder.
  2. "Focal" is the focal length of your camera, which determines the imaging geometry. If you train and test on images that all come from the same camera, you don't have to specify a correct value for it. Otherwise, you need to provide the correct value.
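
To make the two answers above concrete, here is a minimal sketch in plain Python. It is not part of the bts code base: the helper names `pad_to_multiple`, `crop_to_multiple` and `focal_from_kitti_calib` are hypothetical, and the calibration line in `SAMPLE_CALIB` is only an illustrative example of the `P2:` entry found in KITTI object calibration files. The sketch assumes that the first entry of the 3x4 projection matrix is the focal length in pixels, which is a reasonable choice for the "focal" input.

```python
import math


def pad_to_multiple(height, width, multiple=32):
    """Round each dimension up to the next multiple.

    Returns (new_height, new_width, pad_bottom, pad_right).
    """
    new_h = math.ceil(height / multiple) * multiple
    new_w = math.ceil(width / multiple) * multiple
    return new_h, new_w, new_h - height, new_w - width


def crop_to_multiple(height, width, multiple=32):
    """Round each dimension down to the previous multiple."""
    return (height // multiple) * multiple, (width // multiple) * multiple


def focal_from_kitti_calib(calib_text, camera="P2"):
    """Return fx, the first entry of the camera's 3x4 projection matrix.

    Assumes lines of the form 'P2: v0 v1 ... v11'.
    """
    for line in calib_text.splitlines():
        key, sep, values = line.partition(":")
        if sep and key.strip() == camera:
            return float(values.split()[0])
    raise KeyError(f"no '{camera}' entry found in calibration text")


# Illustrative calibration line (assumption, not taken from the thread).
SAMPLE_CALIB = (
    "P2: 7.215377e+02 0.0 6.095593e+02 4.485728e+01 "
    "0.0 7.215377e+02 1.728540e+02 2.163791e-01 "
    "0.0 0.0 1.0 2.745884e-03"
)

padded = pad_to_multiple(375, 1242)
cropped = crop_to_multiple(375, 1242)
focal = focal_from_kitti_calib(SAMPLE_CALIB)

print(padded)   # (384, 1248, 9, 6)
print(cropped)  # (352, 1216)
print(focal)    # 721.5377
```

Padding or cropping leaves the focal length in pixels unchanged, whereas resizing the image scales it by the same factor as the image width. If the input is padded, the predicted depth map would need to be cropped back to the original 375 x 1242 region before downsampling.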