fraunhoferhhi / neural-deferred-shading

Multi-View Mesh Reconstruction with Neural Deferred Shading (CVPR 2022)

How to determine the input bounding box size?

nepfaff opened this issue

I want to do some testing on my data, but I'm struggling to find a working input bounding box size. Does someone have hints on how to find the right size?

Whenever I use a bounding box instead of an initial mesh, I get the following error:

Traceback (most recent call last):
  File "reconstruct.py", line 84, in <module>
    mesh_initial = generate_mesh(args.initial_mesh, views, AABB.load(args.input_bbox), device=device)
  File "/home/nep/robot_locomotion/neural-deferred-shading/nds/utils/geometry.py", line 355, in generate_mesh
    v, f = mesh_generators[generator_name]()
  File "/home/nep/robot_locomotion/neural-deferred-shading/nds/utils/geometry.py", line 350, in <lambda>
    'vh32': (lambda: compute_visual_hull(views, aabb, grid_size=32, device=device)),
  File "/home/nep/robot_locomotion/neural-deferred-shading/nds/utils/geometry.py", line 249, in compute_visual_hull
    return marching_cubes(voxels, voxels_occupancy, gradient_direction='ascent')
  File "/home/nep/robot_locomotion/neural-deferred-shading/nds/utils/geometry.py", line 203, in marching_cubes
    vertices, faces, normals, values = measure.marching_cubes_lewiner(voxel_occupancy.cpu().numpy(), level=0.5, spacing=spacing, **kwargs)
  File "/home/nep/.local/lib/python3.8/site-packages/skimage/measure/_marching_cubes_lewiner.py", line 276, in marching_cubes_lewiner
    return _marching_cubes_lewiner(volume, level, spacing, gradient_direction,
  File "/home/nep/.local/lib/python3.8/site-packages/skimage/measure/_marching_cubes_lewiner.py", line 302, in _marching_cubes_lewiner
    raise ValueError("Surface level must be within volume data range.")

A OneDrive download link to one of my datasets: https://1drv.ms/u/s!AjFJcUGSEjrpgcdYVbeeM28llLLshg?e=R1auVR

Hi @nepfaff,

sorry for the late reply. Generally, as you noted, the bounding box is specific to your camera setup. You can think of it as an object that lives in the same space as your cameras. So, a cheap way to get a bounding box (if you know your cameras are facing inwards) is to use the bounding box spanned by your camera centers. If I remember correctly, you can find a more sophisticated way of computing the bounding box here (code from IDR). For some fixed setups, the bounding box is known because you can measure it in the real world in metric units.
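As a minimal sketch of the camera-center approach (assuming 4x4 camera-to-world matrices as numpy arrays; the padding value is just a guess you will probably have to tune):

```python
import numpy as np

def bbox_from_camera_centers(c2w_matrices, padding=0.1):
    """Axis-aligned bounding box spanned by the camera centers.

    c2w_matrices: list of 4x4 camera-to-world matrices (numpy arrays).
    padding: fraction of the box diagonal added on each side.
    """
    centers = np.stack([c2w[:3, 3] for c2w in c2w_matrices])  # (N, 3) camera positions
    bbox_min = centers.min(axis=0)
    bbox_max = centers.max(axis=0)
    pad = padding * np.linalg.norm(bbox_max - bbox_min)
    return bbox_min - pad, bbox_max + pad
```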

Regarding your data: the general format seems correct, but I've visualized your cameras and noticed that your matrices are off. IIRC your image coordinate system was a right-handed system where the z-axis points away from the scene and the y-axis points up. This does not adhere to the OpenCV convention, where the z-axis points into the scene and the y-axis points down. It can be fixed with a 180° rotation around the x-axis.
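Something like this should do it (a sketch; the flip is applied on the right for camera-to-world matrices and on the left for world-to-camera matrices):

```python
import numpy as np

# 180° rotation around the camera x-axis: flips the y- and z-axes of the camera frame
FLIP_YZ = np.diag([1.0, -1.0, -1.0, 1.0])

def opengl_to_opencv_c2w(c2w):
    """Camera-to-world matrix from a y-up / z-backwards camera to the OpenCV
    convention (y down, z pointing into the scene)."""
    return c2w @ FLIP_YZ

def opengl_to_opencv_w2c(w2c):
    """The same conversion for a world-to-camera (extrinsics) matrix."""
    return FLIP_YZ @ w2c
```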

But: even with the orientation fixed and with a reasonable bounding box, our visual hull step fails (the marching cubes error means the occupancy grid has no 0.5 level crossing, i.e., the carved volume is entirely empty or entirely full). Just from my visualization, your camera poses looked off to me and the cameras didn't seem to focus on one object (the Lego brick). Have you verified the camera poses? Does the reconstruction work with other frameworks?
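One quick sanity check (just a sketch, assuming OpenCV-style 4x4 camera-to-world matrices): for an inward-facing capture, the optical axes should roughly intersect near the object.

```python
import numpy as np

def check_cameras_converge(c2w_matrices):
    """Estimate the point closest to all optical axes and report how far
    each axis passes from it; large distances hint at bad poses."""
    origins = np.stack([c2w[:3, 3] for c2w in c2w_matrices])
    # OpenCV convention: the viewing direction is the camera z-axis (third rotation column)
    dirs = np.stack([c2w[:3, 2] / np.linalg.norm(c2w[:3, 2]) for c2w in c2w_matrices])

    # Least-squares intersection: solve sum_i (I - d_i d_i^T) (p - o_i) = 0 for p
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, dirs):
        P = np.eye(3) - np.outer(d, d)
        A += P
        b += P @ o
    p = np.linalg.solve(A, b)

    # Distance of each optical axis to the estimated focus point
    distances = np.linalg.norm(np.cross(p - origins, dirs), axis=1)
    return p, distances
```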

@mworchel Thank you for your detailed reply!

This clarified the meaning of the bounding box.

I have now fixed the cameras. Nevertheless, the visual hull step still fails, as per your observations. The camera matrices come from ARKit and might not be great, so I ran COLMAP to get better camera poses. Yet the visual hull step still fails with the COLMAP poses. I tried bounding boxes of varying sizes, but even large ones don't work. Do you have any insights into what could cause this?

I plotted the bounding box (rendering the voxels from compute_visual_hull as a point cloud), the cameras, and the world frame.
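Roughly how I produced the plot (a sketch; I scatter a uniform grid inside the bounding box instead of the exact voxels from compute_visual_hull):

```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (needed for 3D plots on older matplotlib)

def plot_bbox_and_cameras(bbox_min, bbox_max, c2w_matrices, grid_size=16):
    """Scatter a coarse voxel grid spanning the bounding box together with the
    camera centers, to check that the cameras actually surround the box."""
    axes = [np.linspace(lo, hi, grid_size) for lo, hi in zip(bbox_min, bbox_max)]
    grid = np.stack(np.meshgrid(*axes, indexing='ij'), axis=-1).reshape(-1, 3)
    centers = np.stack([c2w[:3, 3] for c2w in c2w_matrices])

    fig = plt.figure()
    ax = fig.add_subplot(projection='3d')
    ax.scatter(grid[:, 0], grid[:, 1], grid[:, 2], s=1, alpha=0.2, label='bbox voxels')
    ax.scatter(centers[:, 0], centers[:, 1], centers[:, 2], c='r', label='cameras')
    ax.legend()
    plt.show()
```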

A similar dataset to the first one but with correct OpenCV cameras from COLMAP: https://1drv.ms/u/s!AjFJcUGSEjrpgcgbbOryvfkDHN8mtA?e=A7qZKE

The same data works for me with InstantNGP (requires a data format conversion).
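The conversion I used was roughly the following (a sketch; fx is the focal length in pixels, and InstantNGP expects NeRF-style camera-to-world matrices with y up and z pointing backwards, so the y- and z-columns of the OpenCV matrices are flipped):

```python
import json
import math
import numpy as np

def opencv_to_instant_ngp(c2w_opencv, image_paths, fx, width, out_path='transforms.json'):
    """Write a transforms.json from OpenCV-style camera-to-world matrices."""
    frames = []
    for c2w, path in zip(c2w_opencv, image_paths):
        m = np.array(c2w, dtype=float)
        m[0:3, 1] *= -1.0  # flip y-axis
        m[0:3, 2] *= -1.0  # flip z-axis
        frames.append({'file_path': path, 'transform_matrix': m.tolist()})

    data = {
        'camera_angle_x': 2.0 * math.atan(width / (2.0 * fx)),  # horizontal field of view
        'frames': frames,
    }
    with open(out_path, 'w') as f:
        json.dump(data, f, indent=2)
```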

> The same data works for me with InstantNGP (requires a data format conversion).

That's a good hint; from there we can start to investigate what's going on.

For the new dataset, the visual hull step produces something, but it looks cluttered. I've had a look at the images and it seems that multiple datasets are mixed together: I see the Jell-O object, but there are also images of "Neem Oil", a keyboard, and some kind of plastic basket sprinkled here and there :) For example, see images 46, 71, 79.

Fair. These are artefacts of bad masking. I did not bother removing these as I thought the initial mesh creation shouldn't fail because of them, and I wanted to test for robustness against such artefacts.

Nevertheless, I did now manually remove them, and it worked! 🎉

Looks like I did not fully understand the mesh initialization procedure. I will look into it more.

Thanks for the help!

The method (as a whole) is super sensitive to the masks. When constructing the initial mesh, we really have no choice but to rely on consistent masks. So unfortunately, it's not super robust against these artifacts :/
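To illustrate why (just a minimal space-carving sketch, not our actual implementation; `project` is a hypothetical helper that maps world points to pixel coordinates for a given view):

```python
import numpy as np

def carve_visual_hull(voxels, masks, project):
    """A voxel stays occupied only if it falls inside the foreground mask in
    *every* view, so a single view with a bad mask can carve away the object."""
    occupancy = np.ones(len(voxels), dtype=bool)
    for view_idx, mask in enumerate(masks):
        u, v, valid = project(voxels, view_idx)   # pixel coordinates + in-image flag
        inside = np.zeros(len(voxels), dtype=bool)
        inside[valid] = mask[v[valid], u[valid]]  # foreground test per voxel
        occupancy &= inside                       # AND across views -> carving
    return occupancy
```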