D-Mad / ComfyUI-DSINE

(CVPR 2024) Rethinking Inductive Biases for Surface Normal Estimation

ComfyUI node for DSINE Surface Normal Estimation

Original repo:

Rethinking Inductive Biases for Surface Normal Estimation

Official implementation of the paper

Rethinking Inductive Biases for Surface Normal Estimation

CVPR 2024 (to appear)

Gwangbin Bae and Andrew J. Davison

[paper.pdf] [arXiv (coming soon)] [project page]

Abstract

Despite the growing demand for accurate surface normal estimation models, existing methods use general-purpose dense prediction models, adopting the same inductive biases as other tasks. In this paper, we discuss the inductive biases needed for surface normal estimation and propose to (1) utilize the per-pixel ray direction and (2) encode the relationship between neighboring surface normals by learning their relative rotation. The proposed method can generate crisp — yet, piecewise smooth — predictions for challenging in-the-wild images of arbitrary resolution and aspect ratio. Compared to a recent ViT-based state-of-the-art model, our method shows a stronger generalization ability, despite being trained on an orders of magnitude smaller dataset.

Getting Started

Start by installing the dependencies.

conda create --name DSINE python=3.10
conda activate DSINE

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
python -m pip install geffnet
python -m pip install glob2
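
Before going further, it can help to confirm that the packages installed above are importable from the active environment. The snippet below is a minimal sketch, not part of the original repository: the helper name `missing_packages` and the `REQUIRED` tuple are hypothetical, and the import names are assumed to match the package names used in the install commands.

```python
import importlib.util

# Import names assumed to correspond to the packages installed above.
REQUIRED = ("torch", "torchvision", "geffnet", "glob2")


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
print("Missing packages:", ", ".join(missing) if missing else "none")
```

If anything is listed as missing, re-run the matching install command inside the activated DSINE environment.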

Then, download the model weights from this link and save them under ./checkpoints/.

Test on images

  • Run python test.py to generate predictions for the images under ./samples/img/. The result will be saved under ./samples/output/.
  • Our model assumes known camera intrinsics, but providing approximate intrinsics still gives good results. For some images in ./samples/img/, the corresponding camera intrinsics (fx, fy, cx, cy, assuming a perspective camera with no distortion) are provided as a .txt file. If no such file exists, the intrinsics are approximated by assuming a $60^\circ$ field of view.
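
To illustrate the fallback described above, here is a minimal sketch of how pinhole intrinsics could be approximated from a field of view, and how they map a pixel to a ray direction. It is not the repository's code: the function names `intrinsics_from_fov` and `pixel_ray` are hypothetical, and the sketch assumes that the field of view spans the longer image side, that pixels are square, and that the principal point lies at the image center. The actual implementation may use different conventions.

```python
import math


def intrinsics_from_fov(width, height, fov_deg=60.0):
    """Approximate (fx, fy, cx, cy) for a distortion-free pinhole camera.

    Assumptions (not confirmed by the source): the field of view spans the
    longer image side, pixels are square, and the principal point is the
    image center.
    """
    longer_side = max(width, height)
    focal = (longer_side / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    return focal, focal, width / 2.0, height / 2.0


def pixel_ray(u, v, fx, fy, cx, cy):
    """Unit ray direction through pixel (u, v) in camera coordinates."""
    x = (u - cx) / fx
    y = (v - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)


fx, fy, cx, cy = intrinsics_from_fov(640, 480)
print(f"fx={fx:.2f} fy={fy:.2f} cx={cx:.1f} cy={cy:.1f}")
print("ray through the image center:", pixel_ray(cx, cy, fx, fy, cx, cy))
```

Under these assumptions, a ray through the left edge of the image at the principal-point row makes a 30° angle with the optical axis, which is half of the assumed 60° field of view.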
