liuyuan-pal / Gen6D

[ECCV2022] Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images

Replacement of the default detector

EvdoTheo opened this issue · comments

Hello @liuyuan-pal ,
I have a trained YOLOv8 model that detects the needed object perfectly and I would like to replace the default detector with my trained one. In short, I know that your model crops the area where the object is located and then continues with calculations, so, how can I exploit the bounding box that the yolov8 gives me in order to increase the efficiency of the detector? Can I replace the default detector completely or can I use both detectors to get better results?

I noticed that the detector function is located inside the 'estimator.py' script. Could you please clarify what the variables "position" and "scale_r2q" mean for each query image?

[Screenshot from 2023-08-28 12-32-16]

Thank you in advance!

Hi, the detection position means the center of the bounding box. The scale is the ratio between the detected bounding box size and the reference image size (which is 128 in Gen6D). You can simply convert your detection results into these two values. Meanwhile, I suggest enlarging the bounding box slightly to leave some margin, because we assume the bounding box encloses the projection of the unit sphere in which the object is inscribed.
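For anyone following along, the conversion described above could be sketched roughly as below. The function name, the margin value, and the box format are illustrative assumptions, not part of the Gen6D API:

```python
import numpy as np

def yolo_bbox_to_gen6d(bbox_xyxy, margin=0.1, ref_size=128):
    """Convert a YOLO (x0, y0, x1, y1) box to Gen6D-style detection values.

    `margin` enlarges the box to leave room for the projected unit sphere,
    as suggested above. Names here are hypothetical, not Gen6D's exact API.
    """
    x0, y0, x1, y1 = bbox_xyxy
    position = np.array([(x0 + x1) / 2.0, (y0 + y1) / 2.0])  # bbox centre
    size = max(x1 - x0, y1 - y0) * (1.0 + margin)            # enlarged side length
    scale_r2q = size / ref_size                              # ratio to 128 px reference
    return position, scale_r2q

pos, scale = yolo_bbox_to_gen6d((100, 80, 228, 208))
```

Here a 128-pixel-wide box enlarged by 10% gives a scale_r2q of about 1.1 relative to the 128-pixel reference.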

Thank you for the insights, I comprehended the purpose of those variables! I'll close the issue and open it again if I face any problems. I appreciate it!

Greetings @liuyuan-pal,
I replaced the detection position and scale_r2q with the bounding box centre from YOLO and the ratio between the detected bbox and the reference image size, but I'm still getting inaccurate results. I checked that the cropped image is correct:
[Cropped screenshot, 07.09.2023]

but the bounding boxes are incorrect (there is a minor problem with the colour palette because of the different libraries used to render the image).
[Image: 18-bbox]

Aside from the above-mentioned variables, do I need to tweak any other parameters to get better results? I'm confused about the validity of the scaling calculation. Since the input image resolution is 640x480, I set the parameter --resolution default=640; should I change it to 960 or not? Do you think I should train the selector further? Thank you for your contribution!

Hi, maybe you can show me your intermediate results so I can figure out the problem here.

Sorry, there is an error in the visualize_intermediate_results function. As soon as I solve it, I'll get back with the intermediate results.