alievk / npbg

Neural Point-Based Graphics

Why can't I directly use the .ply file provided by the ScanNet dataset?

ousuixin opened this issue

Hello, thank you for your excellent work and the open source code!
Following the guide in the README, I can successfully fit a new scene by building the reconstruction with Agisoft Metashape Pro and then fitting descriptors.
However, when I directly use the reconstructions (e.g. http://kaldir.vc.in.tum.de/scannet/v2/scans/scene0001_01/scene0000_00_vh_clean.ply) provided in the ScanNet dataset, I cannot fit descriptors correctly.
Are there differences between the point cloud built by Agisoft Metashape Pro and the point cloud provided by the ScanNet dataset? And how can I fit a ScanNet scene such as scene0000_00 using the point cloud provided by the dataset?
Thank you for your reply!

I finally found that the above problem was caused by differences between the camera matrices provided by Agisoft Metashape Pro and those provided by the ScanNet dataset, not by the point cloud.

I solved the problem above by editing the pose files provided by ScanNet: just change the sign of the second and third columns in all 'pose/*.txt' files.

For example, '0.txt' in scene0000_00 has the following content:

        -0.955421 **0.119616 -0.269932** 2.655830
        0.295248 **0.388339 -0.872939** 2.981598
        0.000408 **-0.913720 -0.406343** 1.368648
        0.000000 **0.000000 0.000000** 1.000000

I changed the sign of the second and third columns, so I get:

        -0.955421 **-0.119616 0.269932** 2.655830
        0.295248 **-0.388339 0.872939** 2.981598
        0.000408 **0.913720 0.406343** 1.368648
        0.000000 **0.000000 0.000000** 1.000000

After that, I can fit descriptors correctly for scene0000_00 in the ScanNet dataset.
However, I still want to know why the steps above solve my problem. What does reversing the sign of the second and third columns of a camera matrix mean (actually, reversing the sign of the second and third columns of the rotation matrix)?
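For anyone who wants to apply the same edit to a whole scene, here is a minimal sketch of the column flip described above. It assumes each pose file holds a 4x4 matrix as four whitespace-separated rows, like the '0.txt' example. The function names `flip_yz_columns` and `convert_pose_dir` are made up for illustration and are not part of npbg or ScanNet.

```python
from pathlib import Path


def flip_yz_columns(pose_text: str) -> str:
    """Negate the second and third columns of a 4x4 pose matrix given as text."""
    out_rows = []
    for line in pose_text.strip().splitlines():
        values = [float(v) for v in line.split()]
        if len(values) != 4:
            raise ValueError(f"expected 4 values per row, got {len(values)}")
        # Columns 1 and 2 (0-based) are the ones whose sign is reversed.
        # Writing 0.0 - v instead of -v avoids printing "-0.000000" for zeros.
        values[1] = 0.0 - values[1]
        values[2] = 0.0 - values[2]
        out_rows.append(" ".join(f"{v:.6f}" for v in values))
    return "\n".join(out_rows) + "\n"


def convert_pose_dir(src_dir: str, dst_dir: str) -> int:
    """Write flipped copies of every *.txt pose in src_dir into dst_dir.

    Returns the number of files written. The originals are left untouched.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for pose_file in sorted(Path(src_dir).glob("*.txt")):
        flipped = flip_yz_columns(pose_file.read_text())
        (dst / pose_file.name).write_text(flipped)
        count += 1
    return count
```

Writing to a separate output directory (rather than editing in place) keeps the original ScanNet poses intact, so the flip is never applied twice by accident.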

@ousuixin, you're right that this correction might be needed; we also reverse the sign of the second and third columns automatically when view matrices are loaded from an XML file produced by Agisoft Metashape (see L206). I believe this is done to transform the coordinate system to the conventional OpenGL format, but most likely @alievk could give a more detailed explanation.

Correct. The ScanNet dataset provides extrinsics in the OpenCV coordinate frame: +X right, +Y down, +Z forward.

However, we use the OpenGL coordinate frame, which is the OpenCV frame rotated by 180 degrees around the X axis; this is effectively an inversion of the Y and Z axes.
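To make the link to the column flip concrete, here is a small sketch (plain Python, no dependencies) that checks the claim numerically. It assumes the pose is a camera-to-world matrix, so that the rotation columns are the camera axes expressed in world coordinates. Under that assumption, right-multiplying by a 180 degree rotation about X negates exactly the second and third columns. The helpers `rot_x` and `matmul` are written here for illustration only.

```python
import math


def rot_x(theta):
    """4x4 homogeneous rotation by theta radians about the X axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


# Pose from '0.txt' in scene0000_00, as quoted earlier in this thread.
pose_cv = [[-0.955421, 0.119616, -0.269932, 2.655830],
           [0.295248, 0.388339, -0.872939, 2.981598],
           [0.000408, -0.913720, -0.406343, 1.368648],
           [0.000000, 0.000000, 0.000000, 1.000000]]

# Rotating the camera frame by 180 degrees about X is, up to floating-point
# error, multiplication by diag(1, -1, -1, 1) on the right.
pose_gl = matmul(pose_cv, rot_x(math.pi))

for row in pose_gl:
    print(" ".join(f"{v: .6f}" for v in row))
```

The first and fourth columns (the camera X axis and the camera position) come out unchanged, which matches the manual edit: only the camera's Y and Z axes are flipped.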

I appreciate your reply!