liuyuan-pal / Gen6D

Hello,
I'm trying to evaluate the accuracy on a custom object, and I noticed that the detector produces completely inaccurate results. Specifically, I need to extract the ADD-0.1d and Prj-5 evaluation metrics for my custom object. The eval.py script requires an align.pkl file, which I generated with the compute_align_poses.py script. Having completed all the necessary dataset setup steps, I ran the eval.py script and got these results.

Moreover, during prediction the initial estimates are inaccurate, but after a while the refiner outputs better results.

Note: the reference images are synthetic.

What should I do, in your opinion, to solve this problem with the detector? Thank you very much for your time.

Hi, thanks for your interest in our work!

Your gt labels look fine!
The eval.py results are incorrect because the scale difference between the reference images and the query images is too large. You can downsample the query image to a smaller size (and change the intrinsics accordingly), which should produce more accurate results. The reason is explained in #29 (comment).

Can I tweak the scale difference to check if I'll get better results? And if yes, will this modification decrease the algorithm's performance? Also, where are the intrinsic parameters located in the code?
Moreover, if I need to train the detector, how should I structure my dataset and what kind of information is required (folders, files, etc)?

Thank you in advance, I do appreciate it!!

Yes, you can change the query image size. Trying different sizes to find a suitable one could improve the performance.
If you resize the image here, you should change K accordingly:

Gen6D/eval.py

Line 124 in 50aa71b

K = que_database.get_K(que_id)
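
For readers following along, here is a minimal sketch of what "change K accordingly" could look like. It is not code from the Gen6D repository: the helper name `scale_intrinsics` and the example matrix values are hypothetical, and it assumes a standard 3x3 pinhole intrinsic matrix and the simple convention in which pixel coordinates scale directly with the image size (no half-pixel center correction).

```python
import numpy as np


def scale_intrinsics(K, scale_x, scale_y=None):
    """Return intrinsics for an image resized by (scale_x, scale_y).

    Hypothetical helper, not part of Gen6D. Assumes K is a 3x3 pinhole
    matrix and that pixel coordinates scale directly with image size.
    """
    if scale_y is None:
        scale_y = scale_x  # uniform resize
    # Left-multiplying by diag(sx, sy, 1) scales fx and cx by sx,
    # and fy and cy by sy; the last row stays [0, 0, 1].
    S = np.diag([scale_x, scale_y, 1.0])
    return S @ np.asarray(K, dtype=np.float64)


# Example values only (assumed, not taken from any dataset).
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])

scale = 0.5  # downsample the query image to half its size
K_half = scale_intrinsics(K, scale)

# The query image itself would be resized with the same factor, e.g.:
#   img_small = cv2.resize(img, None, fx=scale, fy=scale,
#                          interpolation=cv2.INTER_AREA)
```

With this convention, a 3D point projected with the scaled matrix lands at the scaled pixel location, so the ground-truth poses do not need to change; only the image and K do.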

The training code is already included. You may need to read it to find out how to train the detector. For the given object, I think the pretrained detector is OK and you just need to take care of the scale difference in the first frame, because all subsequent frames are processed by the refiner only.

Great, thank you for your insights!

Inaccurate detector's results