fuxiao0719 / GeoWizard

[arXiv'24] GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image

Home Page: https://fuxiao0719.github.io/projects/geowizard/


Numeric bug, and question regarding the normal ensemble logic

hubert0527 opened this issue

The numeric bug
torch.acos is numerically unstable for inputs at or near +1 and -1 and can produce nan values there, as mentioned here: pytorch/pytorch#8069

As a result, ensemble_normals() always selects the first tensor that contains any nan values after the torch.acos call:

angle_error = torch.acos(torch.cosine_similarity(normal_pred[None], normal_preds, dim=1))

A simple solution is to clamp the values away from +1 and -1 by a small epsilon before feeding the tensor into torch.acos.
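The clamp described above could look roughly like the following. This is a minimal sketch, not the fix that was actually committed to the repository: the helper names safe_acos and angle_error and the default eps=1e-6 are assumptions made for illustration, and the tensor shapes are inferred from the line quoted above.

```python
import torch

def safe_acos(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Hypothetical helper: keep inputs strictly inside (-1, 1) so that
    # torch.acos never sees a value pushed out of range by rounding.
    return torch.acos(x.clamp(min=-1.0 + eps, max=1.0 - eps))

def angle_error(normal_pred: torch.Tensor, normal_preds: torch.Tensor,
                eps: float = 1e-6) -> torch.Tensor:
    # Assumed shapes: normal_pred is (3, H, W), normal_preds is (N, 3, H, W).
    # The result has shape (N, H, W) and holds per-pixel angles in radians.
    cos = torch.cosine_similarity(normal_pred[None], normal_preds, dim=1)
    return safe_acos(cos, eps)
```

One consequence of this choice is that identical vectors no longer give an angle of exactly zero: with eps=1e-6 the smallest angle returned is about 1.4e-3 radians. That is usually harmless when the angles are only used to rank predictions.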


Question regarding the ensemble_normals implementation
Meanwhile, I am curious about the design choice of this ensemble_normals() function.
ensemble_depths() takes a per-pixel mean or median, so it truly ensembles the predictions by fusing the multiple depth maps.
In contrast, ensemble_normals() calculates an error score for each normal map prediction, selects one normal map in its entirety, and disregards the remaining normal maps.
Have you tried a mean/median reduction similar to ensemble_depths(), and do you have any insights that led to the current design?
Thank you very much!

  1. Thanks for mentioning it! We have fixed the boundary condition error now.
  2. Our intuition is that, compared to depth, variation in normals is more visually pronounced. Although mean reduction can further improve accuracy, it also makes blocky artifacts more obvious. We therefore select the single prediction that is closest to the average ensemble, which we see as a trade-off between accuracy and visual quality.
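The selection strategy described in point 2 can be sketched as follows. This is an illustration under stated assumptions, not the repository's ensemble_normals(): the function name select_closest_to_mean, the mean-angle score, and the (N, 3, H, W) input layout are all hypothetical choices made here.

```python
import torch
import torch.nn.functional as F

def select_closest_to_mean(normal_preds: torch.Tensor, eps: float = 1e-6):
    # Assumed input: N unit-normal maps stacked as (N, 3, H, W).
    # Per-pixel mean normal, re-normalized to unit length (the "average ensemble").
    mean_normal = F.normalize(normal_preds.mean(dim=0), dim=0)

    # Per-pixel angle between each prediction and the mean; the clamp keeps
    # torch.acos away from the +1 / -1 boundaries.
    cos = torch.cosine_similarity(mean_normal[None], normal_preds, dim=1)
    angle = torch.acos(cos.clamp(min=-1.0 + eps, max=1.0 - eps))  # (N, H, W)

    # One scalar score per prediction: its mean angular error over all pixels.
    score = angle.reshape(angle.shape[0], -1).mean(dim=1)

    # Keep the whole map with the lowest score and discard the others.
    best = int(torch.argmin(score))
    return normal_preds[best], best
```

Returning one whole map keeps the output spatially coherent, at the cost of ignoring the other predictions. Returning mean_normal instead would be the per-pixel fusion the question asks about.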

Thank you for the insights!