hritam-98 / image-to-3D

Latest advancements in image (single or multiple) to 3D object generation

Isotropic3D arXiv


  • We propose a novel image-to-3D pipeline called Isotropic3D that takes only an image CLIP embedding as input. Isotropic3D aims to give full play to 2D diffusion model priors without requiring the target view to be utterly consistent with the input view.
  • We introduce a view-conditioned multi-view diffusion model that integrates Explicit Multi-view Attention (EMA), aimed at enhancing view generation through fine-tuning. EMA combines noisy multi-view images with the noise-free reference image as an explicit condition. Such a design allows the reference image to be discarded from the whole network during the SDS-based 3D generation process.
  • Experiments demonstrate that with a single CLIP embedding, Isotropic3D can generate promising 3D assets while still showing similarity to the reference image.
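
For readers unfamiliar with the "SDS-based 3D generation process" mentioned above, the score distillation sampling gradient is usually written as follows (the form popularized by DreamFusion). Here θ denotes the 3D representation's parameters, x = g(θ) a rendered view, ε̂_φ the diffusion model's noise prediction, and w(t) a timestep weighting. Reading the condition y as the CLIP embedding of the reference image is an assumption based on the summary above; the exact conditioning and weighting used by Isotropic3D may differ.

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right],
\qquad
x_t = \alpha_t\, x + \sigma_t\, \epsilon,
\quad \epsilon \sim \mathcal{N}(0, I)
```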

Method overview

Comparison with other works

Key Takeaways

  • A single CLIP embedding drives the entire 3D generation
  • Multi-view diffusion model that takes the reference image and the noisy rendered 2D images
  • A single loop to actually make the generation better
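
The idea of attending jointly over noisy multi-view images and a noise-free reference image can be sketched roughly as below. This is a minimal NumPy illustration, not the Isotropic3D implementation: the function name `joint_multiview_attention`, the single attention head, and the projection matrices `w_q`, `w_k`, `w_v` are all hypothetical, and the real model operates inside a diffusion UNet.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def joint_multiview_attention(view_tokens, ref_tokens, w_q, w_k, w_v):
    """Single-head self-attention over all views plus the reference (illustrative).

    view_tokens: (V, N, D) tokens of V noisy views, N tokens each.
    ref_tokens:  (N, D) tokens of the noise-free reference image.
    Returns the attended view tokens (V, N, D_out) and the attention weights.
    """
    V, N, D = view_tokens.shape
    # Flatten the views into one sequence and append the reference tokens,
    # so every view token can attend to every other view and to the reference.
    seq = np.concatenate([view_tokens.reshape(V * N, D), ref_tokens], axis=0)
    q, k, v = seq @ w_q, seq @ w_k, seq @ w_v
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    out = weights @ v
    # Only the outputs for the noisy views are kept; the reference acts as a condition.
    return out[: V * N].reshape(V, N, -1), weights
```

Because the reference enters only as extra tokens in the sequence, it is easy to see how such a design could later run without it, which is the property the summary above highlights.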

GaussianObject arXiv


  • We propose to optimize 3D Gaussians from highly sparse views with explicit structure priors, where several techniques are designed, including the visual hull for initialization and floater elimination for training.
  • A Gaussian repair model based on diffusion models is proposed to remove artifacts caused by omitted or highly compressed object information, where the rendering quality can be further improved.
  • The overall framework GaussianObject shows strong performance on several challenging real-world datasets, consistently outperforming previous state-of-the-art methods for both qualitative and quantitative evaluation.
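
To make "floater elimination" more concrete, here is a generic statistical outlier filter based on nearest-neighbor distances. It is a sketch of one common way to drop stray points from a point cloud; the function name `remove_floaters`, the neighbor count `k`, and the threshold factor `n_std` are assumptions for illustration, and the criterion GaussianObject actually uses may differ.

```python
import numpy as np

def remove_floaters(points, k=8, n_std=2.0):
    """Drop points whose mean distance to their k nearest neighbors is unusually large.

    points: (P, 3) array of 3D positions (for example, Gaussian centers).
    Returns the filtered points and the boolean keep mask.
    """
    # Dense pairwise distances; O(P^2) memory, fine for small clouds.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # After sorting each row, column 0 is the zero distance to the point itself.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    mean_knn = knn.mean(axis=1)
    # Points far above the cloud-wide average spacing are treated as floaters.
    threshold = mean_knn.mean() + n_std * mean_knn.std()
    keep = mean_knn <= threshold
    return points[keep], keep
```

For large clouds the dense distance matrix would need to be replaced with a spatial index such as a k-d tree.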

Method overview

Comparison with other works

Key Takeaways

  • Uses a visual hull to build a coarse point cloud from 4 reference images
  • Gaussian repair module and distance-aware sampling
  • A 2D diffusion model and an SDS loss refine the initialized Gaussians, with Gaussian rasterization providing the 2D renderings
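
The visual hull step in the first takeaway can be illustrated with classic silhouette carving: a candidate 3D point survives only if it projects inside the object mask in every view. The sketch below assumes binary masks and 3x4 projection matrices are already available; the function name `carve_visual_hull` and its inputs are hypothetical and not taken from the GaussianObject code.

```python
import numpy as np

def carve_visual_hull(masks, projections, grid_points):
    """Keep the grid points that fall inside every silhouette.

    masks:       list of (H, W) boolean arrays, True where the object is visible.
    projections: list of (3, 4) camera projection matrices, one per mask.
    grid_points: (P, 3) candidate 3D points, e.g. voxel centers.
    """
    homog = np.hstack([grid_points, np.ones((len(grid_points), 1))])
    keep = np.ones(len(grid_points), dtype=bool)
    for mask, proj in zip(masks, projections):
        uvw = homog @ proj.T
        in_front = uvw[:, 2] > 1e-8
        w = np.where(in_front, uvw[:, 2], 1.0)  # avoid dividing by zero
        u = np.round(uvw[:, 0] / w).astype(int)  # pixel column
        v = np.round(uvw[:, 1] / w).astype(int)  # pixel row
        height, width = mask.shape
        inside = in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
        hit = np.zeros(len(grid_points), dtype=bool)
        hit[inside] = mask[v[inside], u[inside]]
        # A point outside any single silhouette cannot belong to the object.
        keep &= hit
    return grid_points[keep]
```

With only 4 views the carved hull is a loose over-estimate of the true shape, which is consistent with it being described as a coarse initialization.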

DreamGaussian arXiv

InstantMesh arXiv

LRM (Large Reconstruction Model) arXiv


  • EXTREMELY FAST (5 seconds for single image to 3D generation)

TripoSR (Large Reconstruction Model) arXiv


  • EXTREMELY FAST (<1 second for single image to 3D generation)

LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation arXiv

GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation arXiv

MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images arXiv

MVControl: Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting arXiv

DMV3D: Denoising Multi-view Diffusion with 3D LRM arXiv

Wonder3D: Single Image to 3D using Cross-Domain Diffusion arXiv

EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion arXiv

M-LRM: Multi-view Large Reconstruction Model arXiv

Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers arXiv

CRM: Convolutional Reconstruction Model

IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation arXiv

MVDream: Multi-view Diffusion for 3D Generation arXiv

ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation arXiv

Envision3D: One Image to 3D with Anchor Views Interpolation arXiv

Direct2.5: Diverse Text-to-3D Generation via Multi-view 2.5D Diffusion arXiv

Fourier123: One Image to High-Quality 3D Object Generation with Hybrid Fourier Score Distillation arXiv

