AIVFI / Monocular-Depth-Estimation-Rankings-and-2D-to-3D-Video-Conversion-Rankings

Rankings include: BetterDepth, ChronoDepth, Depth Any Video, Depth Anything, Depth Pro, DepthCrafter, DPT, FutureDepth, GBDMF, GenPercept, GeoWizard, LeReS, LightedDepth, Marigold, Metric3D, MiDaS, MonST3R, NeWCRFs, NVDS, NVDS+, PatchFusion, StereoCrafter, UniDepth, ZoeDepth | Waiting list includes: Align3R, Buffer Anytime, FiffDepth, MegaSaM, MoGe, RollingDepth

Monocular Depth Estimation Rankings
and 2D to 3D Video Conversion Rankings

List of Rankings

Each ranking lists only the best-performing model of each method.
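
The selection rule above (one best model per method, kept only if it passes the ranking's threshold) can be sketched as follows. This is an illustration only, not the maintainer's tooling: `Entry`, `build_ranking`, and the method and model names in the usage example are hypothetical, and tied ranks are not handled.

```python
from typing import Iterable, List, NamedTuple


class Entry(NamedTuple):
    method: str   # hypothetical field: the method family
    model: str    # hypothetical field: a specific variant of that method
    score: float  # metric value reported for this variant


def build_ranking(entries: Iterable[Entry], threshold: float,
                  lower_is_better: bool = True) -> List[Entry]:
    """Keep the best entry per method, drop those failing the threshold, sort."""
    best = {}
    for e in entries:
        cur = best.get(e.method)
        if cur is None:
            best[e.method] = e
        elif lower_is_better and e.score < cur.score:
            best[e.method] = e
        elif not lower_is_better and e.score > cur.score:
            best[e.method] = e

    if lower_is_better:
        kept = [e for e in best.values() if e.score <= threshold]
    else:
        kept = [e for e in best.values() if e.score >= threshold]
    # Best score first: ascending for error metrics, descending otherwise.
    return sorted(kept, key=lambda e: e.score, reverse=not lower_is_better)


if __name__ == "__main__":
    # Made-up numbers, used only to exercise the function.
    demo = [
        Entry("MethodA", "MethodA-small", 0.070),
        Entry("MethodA", "MethodA-large", 0.050),
        Entry("MethodB", "MethodB-base", 0.055),
        Entry("MethodC", "MethodC-base", 0.090),
    ]
    for rank, e in enumerate(build_ranking(demo, threshold=0.058), start=1):
        print(rank, e.model, e.score)
```

For metrics where higher is better (Acc, PSNR), pass `lower_is_better=False` so the threshold acts as a minimum.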

Monocular Depth Estimation Rankings

  1. DA-2K (mostly 1500×2000): Acc (%)>=86
  2. UnrealStereo4K (3840×2160): AbsRel<=0.04
  3. MVS-Synth (1920×1080): AbsRel<=0.06
  4. HRSD (1920×1080): AbsRel<=0.08
  5. Middlebury2021 (1920×1080): SqRel<=0.5
  6. NYU-Depth V2 (640×480): OPW<=0.31
  7. NYU-Depth V2 (640×480): AbsRel<=0.058
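
For readers unfamiliar with the error metrics behind the thresholds above, here is a minimal sketch of AbsRel and SqRel under their commonly used definitions: the mean of |d − d*| / d* and of (d − d*)² / d* over pixels with valid ground truth. It is an assumption, not the evaluation code of any listed paper. The function names are illustrative, depth maps are flattened to plain lists, and the scale or shift alignment that relative-depth models usually need before evaluation is left out.

```python
from typing import List, Sequence, Tuple


def _valid_pairs(pred: Sequence[float], gt: Sequence[float]) -> List[Tuple[float, float]]:
    """Pair predictions with ground truth, skipping pixels without valid depth."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of pixels")
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no pixels with valid (positive) ground-truth depth")
    return pairs


def abs_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Absolute relative error: mean(|pred - gt| / gt)."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def sq_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Squared relative error: mean((pred - gt)^2 / gt)."""
    pairs = _valid_pairs(pred, gt)
    return sum((p - g) ** 2 / g for p, g in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy depth values in metres, not taken from any dataset.
    pred = [1.1, 2.0, 4.5]
    gt = [1.0, 2.0, 5.0]
    print("AbsRel:", abs_rel(pred, gt))
    print("SqRel:", sq_rel(pred, gt))
```

Both metrics are lower-is-better, which is why the rankings above use `<=` thresholds for them.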

2D to 3D Video Conversion Rankings

I. Video Inpainting Rankings

  • (to do)

II. Light Field Video Reconstruction from Monocular Video Rankings

  1. 👑 4DLFVD with up to 10×10 real light field views✔️: LPIPS😍 (no data)
    This will be the King of all rankings. We look forward to ambitious researchers.
  2. 4DLFVD with up to 10×10 real light field views✔️: PSNR😞 (no data)
  3. Hybrid with 7×7 synthetic light field views✖️: LPIPS😍 (no data)
  4. Hybrid with 7×7 synthetic light field views✖️: PSNR😞>=32dB
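
As a reference for the PSNR threshold above, here is a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE). It assumes 8-bit images flattened to plain lists, and the function name `psnr` is illustrative rather than taken from any listed paper. LPIPS is not sketched because it relies on a pretrained network.

```python
import math
from typing import Sequence


def psnr(pred: Sequence[float], target: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("images must be non-empty and have the same size")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Toy example: every pixel is off by 5 grey levels.
    target = [100.0, 120.0, 140.0, 160.0]
    pred = [t + 5.0 for t in target]
    value = psnr(pred, target)
    print("PSNR (dB):", value, "passes 32 dB:", value >= 32.0)
```

PSNR is higher-is-better, so the ranking uses a `>=` threshold.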

Appendices


DA-2K (mostly 1500×2000): Acc (%)>=86

| RK | Model | Acc (%) ↑ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | Depth Anything V2 Giant (CVPR; ENH: arXiv), Backbone: DINOv2 (ViT-G/14) | 97.4 {1} (arXiv) | Pretraining: BlendedMVS & Hypersim & IRS & TartanAir & VKITTI 2; Training: BDD100K & Google Landmarks & ImageNet-21K & LSUN & Objects365 & Open Images V7 & Places365 & SA-1B | GitHub; ENH: GitHub | - | - |
| 2 | GeoWizard (arXiv), Backbone: Stable Diffusion v2 | 88.1 {1} (arXiv) | Hypersim & Replica & 3D Ken Burns & Objaverse & proprietary | GitHub | - | - |
| 3 | Marigold (CVPR), Backbone: Stable Diffusion v2 | 86.8 {1} (arXiv) | Hypersim & Virtual KITTI | GitHub | - | - |

UnrealStereo4K (3840×2160): AbsRel<=0.04

| RK | Model | AbsRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | ZoeDepth +PFR=128 (arXiv; ENH: CVPR) | 0.0388 {1} (CVPR) | ENH: UnrealStereo4K | GitHub; ENH: GitHub | - | - |

MVS-Synth (1920×1080): AbsRel<=0.06

| RK | Model | AbsRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | ZoeDepth +PFR=128 (arXiv; ENH: CVPR) | 0.0589 {1} (CVPR) | ENH: MVS-Synth | GitHub; ENH: GitHub | - | - |

HRSD (1920×1080): AbsRel<=0.08

| RK | Model | AbsRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | DPT-B + R + AL (ICCV; ENH: CVPRW) | 0.074 {1} (CVPRW) | ENH: HRSD | GitHub; ENH: - | - | - |

Middlebury2021 (1920×1080): SqRel<=0.5

| RK | Model | SqRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | LeReS-GBDMF (CVPR; ENH: AAAI) | 0.444 {1} (AAAI) | ENH: HR-WSI | GitHub; ENH: GitHub | - | - |

NYU-Depth V2 (640×480): OPW<=0.31

| RK | Model | OPW ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | FutureDepth (arXiv), Backbone: Swin-L | 0.303 {4} (arXiv) | NYU-Depth V2 | - | - | - |

NYU-Depth V2 (640×480): AbsRel<=0.058

| RK | Model | AbsRel ↓ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1-2 | BetterDepth (arXiv), Backbone: Depth Anything & Marigold | 0.042 {1} (arXiv) | Hypersim & Virtual KITTI | - | - | - |
| 1-2 | Metric3D v2 CSTM_label (ICCV; ENH: arXiv), Backbone: DINOv2 with registers (ViT-L/14) | 0.042 {1} (arXiv) | DDAD & Lyft & Driving Stereo & DIML & Argoverse2 & Cityscapes & DSEC & Mapillary PSD & Pandaset & UASOL & Virtual KITTI & Waymo & Matterport3d & Taskonomy & Replica & ScanNet & HM3d & Hypersim | GitHub | - | - |
| 3 | Depth Anything Large (CVPR), Backbone: DINOv2 (ViT-L/14) | 0.043 {1} (CVPR) | Pretraining: BlendedMVS & DIML & HR-WSI & IRS & MegaDepth & TartanAir; Training: BDD100K & Google Landmarks & ImageNet-21K & LSUN & Objects365 & Open Images V7 & Places365 & SA-1B | GitHub | - | - |
| 4 | MiDaS v3.1 BEiTL-512 (TPAMI; ENH: arXiv), Backbone: BEiT512-L (ViT-L/16) | 0.048 {1} (CVPR) | Pretraining: ReDWeb & HR-WSI & BlendedMVS & NYU-Depth V2 & KITTI; Training: ReDWeb & DIML & 3D Movies & MegaDepth & WSVD & TartanAir & HR-WSI & ApolloScape & BlendedMVS & IRS & NYU-Depth V2 & KITTI | GitHub | - | PyTorch (GitHub) |
| 5 | GeoWizard (arXiv), Backbone: Stable Diffusion v2 | 0.052 {1} (arXiv) | Hypersim & Replica & 3D Ken Burns & Objaverse & proprietary | GitHub | - | - |
| 6 | Marigold (CVPR), Backbone: Stable Diffusion v2 | 0.055 {1} (CVPR) | Hypersim & Virtual KITTI | GitHub | - | - |
| 7 | GenPercept (arXiv), Backbone: Stable Diffusion v2.1 | 0.056 {1} (arXiv) | Hypersim & Virtual KITTI | GitHub | - | - |
| 8 | NeWCRFs + LightedDepth (CVPR; ENH: CVPR) | 0.057 {2} (CVPR) | ENH: NYU-Depth V2 | GitHub; ENH: GitHub | - | - |
| 9 | UniDepth-V (CVPR), Backbone: DINOv2 (ViT-L/14) | 0.0578 {1} (CVPR) | A2D2 & Argoverse2 & BDD100k & CityScapes & DrivingStereo & Mapillary PSD & ScanNet & Taskonomy & Waymo | GitHub | - | - |

Hybrid with 7×7 synthetic light field views✖️: PSNR😞>=32dB

| RK | Model | PSNR ↑ {Input fr.} | Training dataset | Official repository | Practical model | VapourSynth |
|---|---|---|---|---|---|---|
| 1 | LFVRT (ECCV), MDE: DPT (ICCV), Backbone: ViT | 32.66 {3+1D} (ECCV) | GoPro & TAMULF | GitHub; MDE: GitHub | - | - |

📝 Note: The above ranking includes only one model because the other methods are image-based and carry no temporal information, which makes them unsuitable for light field video reconstruction from monocular video.

Appendix 3: List of all research papers from the above rankings

| Method | Paper | Venue |
|---|---|---|
| BetterDepth | BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation | arXiv |
| Depth Anything | Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | CVPR |
| Depth Anything V2 | Depth Anything V2 | arXiv |
| DPT | Vision Transformers for Dense Prediction | ICCV |
| FutureDepth | FutureDepth: Learning to Predict the Future Improves Video Depth Estimation | arXiv |
| GBDMF | Multi-Resolution Monocular Depth Map Fusion by Self-Supervised Gradient-Based Composition | AAAI |
| GenPercept | Diffusion Models Trained with Large Data Are Transferable Visual Models | arXiv |
| GeoWizard | GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image | arXiv |
| LeReS | Learning to Recover 3D Scene Shape from a Single Image | CVPR |
| LightedDepth | LightedDepth: Video Depth Estimation in light of Limited Inference View Angles | CVPR |
| LFVRT | Synthesizing Light Field Video from Monocular Video | ECCV |
| Marigold | Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | CVPR |
| Metric3D | Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image | ICCV |
| Metric3D v2 | Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation | arXiv |
| MiDaS | Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer | TPAMI |
| MiDaS v3.1 | MiDaS v3.1 – A Model Zoo for Robust Monocular Relative Depth Estimation | arXiv |
| NeWCRFs | Neural Window Fully-connected CRFs for Monocular Depth Estimation | CVPR |
| PatchFusion | PatchFusion: An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation | CVPR |
| R + AL | High-Resolution Synthetic RGB-D Datasets for Monocular Depth Estimation | CVPRW |
| UniDepth | UniDepth: Universal Monocular Metric Depth Estimation | CVPR |
| ZoeDepth | ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth | arXiv |
