yinyunie / 3D-Shape-Analysis-Paper-List

A list of recent papers, libraries and datasets about 3D shape/scene analysis (by topics, updating).

3D-Shape-Analysis-Paper-List

A collection of papers, libraries and datasets I have recently read, for anyone interested in 3D shape and scene analysis.



Statistics: 🔥 code is available & stars >= 100  |  ⭐ citation >= 50

3D Detection & Segmentation

  • [Arxiv] Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding [Project]
  • [Arxiv] RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding [Project]
  • [CVPR2023] EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision [Project]
  • [CVPR2023] PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection [Project]
  • [Arxiv] Mask3D for 3D Semantic Instance Segmentation [github]
  • [ECCV2022] ObjectBox: From Centers to Boxes for Anchor-Free Object Detection [github]
  • [Arxiv] Masked Autoencoders for Self-Supervised Learning on Automotive Point Clouds
  • [CVPR2022] HyperDet3D: Learning a Scene-conditioned 3D Object Detector
  • [Arxiv] AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection

Before 2022

  • [AAAI2022] AFDetV2: Rethinking the Necessity of the Second Stage for Object Detection from Point Clouds
  • [AAAI2022] Static-Dynamic Co-Teaching for Class-Incremental 3D Object Detection
  • [NeurIPS2021] Revisiting 3D Object Detection From an Egocentric Perspective
  • [Arxiv] Embracing Single Stride 3D Object Detector with Sparse Transformer [github]
  • [AAAI2022] Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection
  • [Arxiv] 3D-VField: Learning to Adversarially Deform Point Clouds for Robust 3D Object Detection
  • [Arxiv] Fast Point Transformer
  • [3DV2021] Open-set 3D Object Detection
  • [Arxiv] FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection [Project]
  • [TPAMI2021] Point Cloud Instance Segmentation with Semi-supervised Bounding-Box Mining
  • [Arxiv] Online Adaptation for Implicit Object Tracking and Shape Reconstruction in the Wild
  • [Arxiv] RAANet: Range-Aware Attention Network for LiDAR-based 3D Object Detection with Auxiliary Density Level Estimation [github]
  • [Arxiv] SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking [github]
  • [NeurIPS2021] Multimodal Virtual Point 3D Detection [Project]
  • [BMVC2021] 3D Object Tracking with Transformer [github]
  • [3DV2021] Learning 3D Semantic Segmentation with only 2D Image Supervision
  • [3DV2021] NeuralDiff: Segmenting 3D objects that move in egocentric videos [Project]
  • [BMVC2021] FAST3D: Flow-Aware Self-Training for 3D Object Detectors
  • [ICCV2021] Guided Point Contrastive Learning for Semi-supervised Point Cloud Semantic Segmentation
  • [CORL2021] DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries [github]
  • [NeurIPS2021] Object DGCNN: 3D Object Detection using Dynamic Graphs [github]
  • [Arxiv] Improved Pillar with Fine-grained Feature for 3D Object Detection
  • [Arxiv] 3D-FCT: Simultaneous 3D Object Detection and Tracking Using Feature Correlation
  • [ICCVW2021] MonoCInIS: Camera Independent Monocular 3D Object Detection using Instance Segmentation
  • [Arxiv] GSIP: Green Semantic Segmentation of Large-Scale Indoor Point Clouds
  • [Arxiv] Pix2seq: A Language Modeling Framework for Object Detection
  • [Arxiv] MVM3Det: A Novel Method for Multi-view Monocular 3D Detection
  • [ICCV2021] NEAT: Neural Attention Fields for End-to-End Autonomous Driving [github]
  • [ICCV2021] Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection
  • [ICCV2021] 4D-Net for Learned Multi-Modal Alignment
  • [ICCV2021] Active Learning for Deep Object Detection via Probabilistic Modeling [github]
  • [ICCV2021] An End-to-End Transformer Model for 3D Object Detection [Project]
  • [ICCV2021] Improving 3D Object Detection with Channel-wise Transformer
  • [ICCV2021] Voxel Transformer for 3D Object Detection
  • [CVPR2021] To the Point: Efficient 3D Object Detection in the Range Image With Graph Convolution Kernels
  • [Arxiv] M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers
  • [ICCV2021] Exploring Simple 3D Multi-Object Tracking for Autonomous Driving
  • [ICCV2021] LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector
  • [ICCV2021] Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks [github]
  • [ICCV2021] RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection
  • [ICCV2021] Is Pseudo-Lidar needed for Monocular 3D Object detection?
  • [IROS2021] PTT: Point-Track-Transformer Module for 3D Single Object Tracking in Point Clouds [github]
  • [ICCV2021] Oriented R-CNN for Object Detection [github]
  • [ICCV2021] Box-Aware Feature Enhancement for Single Object Tracking on Point Clouds [github]
  • [IROS2021] Joint Multi-Object Detection and Tracking with Camera-LiDAR Fusion for Autonomous Driving
  • [ACMMM2021] From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with Voxel-to-Point Decoder [github]
  • [ICCV2021] DRINet: A Dual-Representation Iterative Learning Network for Point Cloud Segmentation
  • [ICCV2021] Hierarchical Aggregation for 3D Instance Segmentation [github]
  • [Arxiv] Investigating Attention Mechanism in 3D Point Cloud Object Detection [pytorch]
  • [ICCV2021] VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation [pytorch]
  • [ICCV2021] Geometry Uncertainty Projection Network for Monocular 3D Object Detection
  • [Arxiv] Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth
  • [Arxiv] DV-Det: Efficient 3D Point Cloud Object Detection with Dynamic Voxelization
  • [ICCV2021] ReDAL: Region-based and Diversity-aware Active Learning for Point Cloud Semantic Segmentation
  • [ICCV2021] Rank & Sort Loss for Object Detection and Instance Segmentation [pytorch]
  • [Arxiv] Multi-Modality Task Cascade for 3D Object Detection [github]
  • [ACMMM2021] Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting
  • [Arxiv] Monocular 3D Object Detection: An Extrinsic Parameter Free Approach
  • [Arxiv] Real-time 3D Object Detection using Feature Map Flow [pytorch]
  • [Arxiv] To the Point: Efficient 3D Object Detection in the Range Image with Graph Convolution Kernels
  • [CVPR2021] RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection
  • [Arxiv] Sparse PointPillars: Exploiting Sparsity in Birds-Eye-View Object Detection
  • [Arxiv] ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection [Project]
  • [CVPR2021] 3D Spatial Recognition without Spatially Labeled 3D [Project]
  • [Arxiv] Lite-FPN for Keypoint-based Monocular 3D Object Detection
  • [TPAMI] MonoGRNet: A General Framework for Monocular 3D Object Detection
  • [Arxiv] Lidar Point Cloud Guided Monocular 3D Object Detection
  • [Arxiv] Geometry-aware data augmentation for monocular 3D object detection
  • [Arxiv] OCM3D: Object-Centric Monocular 3D Object Detection
  • [CVPR2021] Objects are Different: Flexible Monocular 3D Object Detection [github]
  • [CVPR2021] HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object Detection
  • [Arxiv] Group-Free 3D Object Detection via Transformers [pytorch]
  • [CVPR2021] GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection [pytorch]
  • [CVPR2021] Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds [pytorch]
  • [CVPR2021] Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection [github]
  • [CVPR2021] Delving into Localization Errors for Monocular 3D Object Detection [github]
  • [CVPR2021] 3D-MAN: 3D Multi-frame Attention Network for Object Detection
  • [CVPR2021] LiDAR R-CNN: An Efficient and Universal 3D Object Detector [github]
  • [CVPR2021] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection [pytorch]
  • [CVPR2021] M3DSSD: Monocular 3D Single Stage Object Detector
  • [CVPR2021] MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation
  • [Arxiv] SparsePoint: Fully End-to-End Sparse 3D Object Detector
  • [Arxiv] RangeDet: In Defense of Range View for LiDAR-based 3D Object Detection
  • [ICRA2021] YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection [github]
  • [CVPR2021] ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [github]
  • [Arxiv] Offboard 3D Object Detection from Point Cloud Sequences
  • [CVPR2021] DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution [github]
  • [Arxiv] Pseudo-labeling for Scalable 3D Object Detection
  • [Arxiv] DPointNet: A Density-Oriented PointNet for 3D Object Detection in Point Clouds
  • [Arxiv] PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection [pytorch]
  • [Arxiv] Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss
  • [Arxiv] CubifAE-3D: Monocular Camera Space Cubification for Auto-Encoder based 3D Object Detection
  • [Arxiv] Self-Attention Based Context-Aware 3D Object Detection [pytorch]
  • [Arxiv] Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection

Before 2021

  • [Arxiv] It’s All Around You: Range-Guided Cylindrical Network for 3D Object Detection
  • [Arxiv] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection [Project]
  • [Arxiv] Demystifying Pseudo-LiDAR for Monocular 3D Object Detection
  • [3DV2020] PanoNet3D: Combining Semantic and Geometric Understanding for LiDAR Point Cloud Detection
  • [AAAI2021] PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection
  • [Arxiv] SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation
  • [Arxiv] 3D Object Detection with Pointformer
  • [WACV2021] CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection [pytorch]
  • [Arxiv] Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation [pytorch]
  • [Arxiv] Learning to Predict the 3D Layout of a Scene
  • [Arxiv] Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes [Project]
  • [Arxiv] DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution
  • [Arxiv] Temporal-Channel Transformer for 3D Lidar-Based Video Object Detection in Autonomous Driving
  • [NeurIPS2020] Every View Counts: Cross-View Consistency in 3D Object Detection with Hybrid-Cylindrical-Spherical Voxelization
  • [NeurIPS2020] Group Contextual Encoding for 3D Point Clouds [pytorch]
  • [Arxiv] 3D Object Recognition By Corresponding and Quantizing Neural 3D Scene Representations [Project]
  • [Arxiv] A Density-Aware PointRCNN for 3D Object Detection in Point Clouds
  • [Arxiv] Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training
  • [ECCV2020] Reinforced Axial Refinement Network for Monocular 3D Object Detection
  • [Arxiv] RUHSNet: 3D Object Detection Using Lidar Data in Real Time [pytorch]
  • [IROS2020] 3D Multi-Object Tracking: A Baseline and New Evaluation Metrics [Project][Code]
  • [ECCV2020] Virtual Multi-view Fusion for 3D Semantic Segmentation
  • [ACMMM2020] Weakly Supervised 3D Object Detection from Point Clouds
  • [ECCV2020] Weakly Supervised 3D Object Detection from Lidar Point Cloud [pytorch]
  • [ECCV2020] Kinematic 3D Object Detection in Monocular Video
  • [IROS2020] Object-Aware Centroid Voting for Monocular 3D Object Detection
  • [ECCV2020] Pillar-based Object Detection for Autonomous Driving
  • [Arxiv] Local Grid Rendering Networks for 3D Object Detection in Point Clouds
  • [Arxiv] Learning to Detect 3D Objects from Point Clouds in Real Time
  • [Arxiv] SVGA-Net: Sparse Voxel-Graph Attention Network for 3D Object Detection from Point Clouds
  • [CVPR2020] PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
  • [CVPR2020] FroDO: From Detections to 3D Objects
  • [CVPR2020] Physically Realizable Adversarial Examples for LiDAR Object Detection
  • [CVPR2020] Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection
  • [CVPR2020] End-to-end 3D Point Cloud Instance Segmentation without Detection
  • [CVPR2020] MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships
  • [CVPR2020] Structure Aware Single-stage 3D Object Detection from Point Cloud
  • [CVPR2020] Learning Depth-Guided Convolutions for Monocular 3D Object Detection [pytorch] 🔥
  • [CVPR2020] What You See is What You Get: Exploiting Visibility for 3D Object Detection
  • [CVPR2020] Density Based Clustering for 3D Object Detection in Point Clouds
  • [CVPR2020] Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation
  • [CVPR2020] End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection
  • [CVPR2020] PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
  • [CVPR2020] MLCVNet: Multi-Level Context VoteNet for 3D Object Detection
  • [CVPR2020] PointPainting: Sequential Fusion for 3D Object Detection
  • [CVPR2020] Joint 3D Instance Segmentation and Object Detection for Autonomous Driving
  • [CVPR2020] Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud [tensorflow]
  • [CVPR2020] HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection
  • [CVPR2020] A Hierarchical Graph Network for 3D Object Detection on Point Clouds
  • [Arxiv] H3DNet: 3D Object Detection Using Hybrid Geometric Primitives [pytorch]
  • [CVPR2020] P2B: Point-to-Box Network for 3D Object Tracking in Point Clouds
  • [Arxiv] 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection
  • [CVPR2020] Joint Spatial-Temporal Optimization for Stereo 3D Object Tracking
  • [CVPR2020] Learning to Evaluate Perception Models Using Planner-Centric Metrics
  • [CVPR2020] Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation [pytorch]
  • [Arxiv] SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds [github]
  • [CVPR2020] End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection [github]
  • [Arxiv] Finding Your (3D) Center: 3D Object Detection Using a Learned Loss
  • [CVPR2020] 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation
  • [CVPR2020] Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation
  • [CVPR2020] OccuSeg: Occupancy-aware 3D Instance Segmentation
  • [CVPR2020] Learning to Segment 3D Point Clouds in 2D Image Space
  • [AAAI2020] ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection
  • [Arxiv] MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships
  • [Arxiv] HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection
  • [Arxiv] SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation
  • [Arxiv] 3DSSD: Point-based 3D Single Stage Object Detector
  • [Arxiv] Monocular 3D Object Detection with Decoupled Structured Polygon Estimation and Height-Guided Depth Estimation
  • [CVPR2020] ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes
  • [Arxiv] A Review on Object Pose Recovery: from 3D Bounding Box Detectors to Full 6D Pose Estimators
  • [Arxiv] ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
  • [Arxiv] Objects as Points [github] ⭐🔥
  • [Arxiv] RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving [github]
  • [CVPR2020] DSGN: Deep Stereo Geometry Network for 3D Object Detection [github]
  • [Arxiv] Learning and Memorizing Representative Prototypes for 3D Point Cloud Semantic and Instance Segmentation
  • [Arxiv] PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
  • [Arxiv] Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots
  • [CVPR2020] SESS: Self-Ensembling Semi-Supervised 3D Object Detection
  • [NeurIPS2019] PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points
  • [NeurIPS2019] Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds
  • [ICCV2019] Deep Hough Voting for 3D Object Detection in Point Clouds
  • [AAAI2020] JSNet: Joint Instance and Semantic Segmentation of 3D Point Clouds
  • [ICCV2019] M3D-RPN: Monocular 3D Region Proposal Network for Object Detection [pytorch]
  • [ICCV2019] 3D Instance Segmentation via Multi-Task Metric Learning
  • [Arxiv] Single-Stage Monocular 3D Object Detection with Virtual Cameras
  • [Arxiv] Depth Completion via Deep Basis Fitting
  • [Arxiv] Relation Graph Network for 3D Object Detection in Point Clouds
  • [CVPR2019] 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans [pytorch] 🔥
  • [ICCV2019] Rescan: Inductive Instance Segmentation for Indoor RGBD Scans [C++]
  • [ICCV2019] Transferable Semi-Supervised 3D Object Detection From RGB-D Data
  • [ICCV2019] STD: Sparse-to-Dense 3D Object Detector for Point Cloud
  • [CVPR2019] PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud [pytorch]
  • [Arxiv] Fast Point R-CNN
  • [Arxiv] Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection [pytorch] 🔥
  • [ECCV2018] 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation [pytorch] 🔥

Shape Representation

  • [CVPR2023] Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning [github]
  • [Arxiv] Neural Vector Fields: Implicit Representation by Explicit Learning
  • [ECCV2022] NeuMesh: Learning Disentangled Neural Mesh-based Implicit Field for Geometry and Texture Editing [Project]
  • [Arxiv] Masked Autoencoders in 3D Point Cloud Representation Learning
  • [Arxiv] NeuralODF: Learning Omnidirectional Distance Fields for 3D Shape Representation
  • [Siggraph2022] Learning Smooth Neural Functions via Lipschitz Regularization [Project]
  • [Siggraph2022] Dual Octree Graph Networks for Learning Adaptive Volumetric Shape Representations [Project]
  • [Arxiv] A Level Set Theory for Neural Implicit Evolution under Explicit Flows
  • [CVPR2022] GIFS: Neural Implicit Function for General Shape Representation [Project]
  • [Arxiv] PINs: Progressive Implicit Networks for Multi-Scale Neural Representations
  • [Arxiv] Distillation with Contrast is All You Need for Self-Supervised Point Cloud Representation Learning
  • [Arxiv] Spelunking the Deep: Guaranteed Queries for General Neural Implicit Surfaces
  • [Arxiv] MINER: Multiscale Implicit Neural Representations
  • [Arxiv] De-rendering 3D Objects in the Wild
  • [Arxiv] Implicit Autoencoder for Point Cloud Self-supervised Representation Learning

Before 2022

  • [Arxiv] End-to-End Learning of Multi-category 3D Pose and Shape Estimation
  • [Arxiv] Point2Cyl: Reverse Engineering 3D Objects from Point Clouds to Extrusion Cylinders
  • [Arxiv] Representing 3D Shapes with Probabilistic Directed Distance Fields
  • [Arxiv] Text2Mesh: Text-Driven Neural Stylization for Meshes [Project]
  • [Arxiv] PointCLIP: Point Cloud Understanding by CLIP [github]
  • [Arxiv] Voint Cloud: Multi-View Point Cloud Representation for 3D Understanding
  • [Arxiv] Gradient-SDF: A Semi-Implicit Surface Representation for 3D Reconstruction
  • [Arxiv] Intuitive Shape Editing in Latent Space
  • [NeurIPS2021] Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views [github]
  • [Arxiv] Neural Fields as Learnable Kernels for 3D Reconstruction
  • [NeurIPS2021] OctField: Hierarchical Implicit Functions for 3D Modeling [github]
  • [3DV2021] RefRec: Pseudo-labels Refinement via Shape Reconstruction for Unsupervised 3D Domain Adaptation [github]
  • [3DV2021] PolyNet: Polynomial Neural Network for 3D Shape Recognition with PolyShape Representation [Project]
  • [Arxiv] BACON: Band-limited Coordinate Networks for Multiscale Scene Representation [Project]
  • [Arxiv] UNIST: Unpaired Neural Implicit Shape Translation Network [Project]
  • [Arxiv] Representing Shape Collections with Alignment-Aware Linear Models [Project]
  • [ICCV2021] Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds
  • [Arxiv] DeepCurrents: Learning Implicit Representations of Shapes with Boundaries
  • [3DV] AIR-Nets: An Attention-Based Framework for Locally Conditioned Implicit Representations [github]
  • [Arxiv] HyperCube: Implicit Field Representations of Voxelized 3D Models
  • [Arxiv] ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators
  • [ICCV2021] Multiresolution Deep Implicit Functions for 3D Shape Representation
  • [ICCV2021] Learning Canonical 3D Object Representation for Fine-Grained Recognition
  • [Arxiv] Point Discriminative Learning for Unsupervised Representation Learning on 3D Point Clouds
  • [Arxiv] A Deep Signed Directional Distance Function for Object Shape Representation
  • [Arxiv] 3D Neural Scene Representations for Visuomotor Control [Project]
  • [Arxiv] A-SDF: Learning Disentangled Signed Distance Functions for Articulated Shape Representation [Project]
  • [Arxiv] ShapeMOD: Macro Operation Discovery for 3D Shape Programs [Project]
  • [Arxiv] CoCoNets: Continuous Contrastive 3D Scene Representations [Project]
  • [Arxiv] DeepMetaHandles: Learning Deformation Meta-Handles of 3D Meshes with Biharmonic Coordinates [Project]

Before 2021

  • [CVPR2021] clDice - A Novel Topology-Preserving Loss Function for Tubular Structure Segmentation [github]
  • [CVPR2021] Point2Skeleton: Learning Skeletal Representations from Point Clouds [pytorch]
  • [Arxiv] ParaNet: Deep Regular Representation for 3D Point Clouds
  • [Arxiv] Geometric Adversarial Attacks and Defenses on 3D Point Clouds [tensorflow]
  • [Arxiv] Learning Category-level Shape Saliency via Deep Implicit Surface Networks
  • [Arxiv] pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis
  • [Arxiv] Deep Implicit Templates for 3D Shape Representation
  • [NeurIPS2020] MetaSDF: Meta-learning Signed Distance Functions [Project]
  • [Arxiv] RISA-Net: Rotation-Invariant Structure-Aware Network for Fine-Grained 3D Shape Retrieval [tensorflow]
  • [Arxiv] Overfit Neural Networks as a Compact Shape Representation
  • [Arxiv] DSM-Net: Disentangled Structured Mesh Net for Controllable Generation of Fine Geometry [Project]
  • [Arxiv] PatchNets: Patch-Based Generalizable Deep Implicit 3D Shape Representations
  • [Arxiv] CaSPR: Learning Canonical Spatiotemporal Point Cloud Representations
  • [Arxiv] RocNet: Recursive Octree Network for Efficient 3D Deep Representation
  • [ECCV2020] GeLaTO: Generative Latent Textured Objects [Project]
  • [ECCV2020] Ladybird: Quasi-Monte Carlo Sampling for Deep Implicit Field Based 3D Reconstruction with Symmetry
  • [Arxiv] Neural Sparse Voxel Fields
  • [CVPR2020] StructEdit: Learning Structural Shape Variations [github]
  • [Arxiv] PAI-GCN: Permutable Anisotropic Graph Convolutional Networks for 3D Shape Representation Learning [github]
  • [CVPR2020] Learning Generative Models of Shape Handles [Project page]
  • [CVPR2020] DualSDF: Semantic Shape Manipulation using a Two-Level Representation [github]
  • [CVPR2020] Learning Unsupervised Hierarchical Part Decomposition of 3D Objects from a Single RGB Image [pytorch]
  • [NeurIPS2019] Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations [pytorch]
  • [Arxiv] Label-Efficient Learning on Point Clouds using Approximate Convex Decompositions
  • [Arxiv] Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
  • [Arxiv] Deep Local Shapes: Learning Local SDF Priors for Detailed 3D Reconstruction
  • [Arxiv] SeqXY2SeqZ: Structure Learning for 3D Shapes by Sequentially Predicting 1D Occupancy Segments From 2D Coordinates
  • [CVPR2020] D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features
  • [Arxiv] Implicit Geometric Regularization for Learning Shapes
  • [Arxiv] Analytic Marching: An Analytic Meshing Solution from Deep Implicit Surface Networks
  • [Arxiv] Adversarial Generation of Continuous Implicit Shape Representations [pytorch]
  • [Arxiv] A Novel Tree-structured Point Cloud Dataset For Skeletonization Algorithm Evaluation [dataset]
  • [CVPRW2019] SkelNetOn 2019: Dataset and Challenge on Deep Learning for Geometric Shape Understanding [project]
  • [Arxiv] Skeleton Extraction from 3D Point Clouds by Decomposing the Object into Parts
  • [Arxiv] InSphereNet: a Concise Representation and Classification Method for 3D Object
  • [Arxiv] Deep Structured Implicit Functions
  • [CVIU] 3D articulated skeleton extraction using a single consumer-grade depth camera
  • [ICLR2019] Point Cloud GAN [tensorflow]
  • [ICCV2019] Learning Shape Templates with Structured Implicit Functions
  • [ICCV2019] 3D Point Cloud Generative Adversarial Network Based on Tree Structured Graph Convolutions [pytorch]
  • [ICCV2019] Implicit Surface Representations as Layers in Neural Networks
  • [CVPR2019] DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation [pytorch] 🔥 ⭐
  • [SIGGRAPH2019] StructureNet: Hierarchical Graph Networks for 3D Shape Generation [pytorch]
  • [SIGGRAPH Asia2019] LOGAN: Unpaired Shape Transform in Latent Overcomplete Space [tensorflow]
  • [TOG] Voxel Cores: Efficient, robust, and provably good approximation of 3D medial axes
  • [SIGGRAPH2018] P2P-NET: Bidirectional Point Displacement Net for Shape Transform [tensorflow]
  • [ICML2018] Learning Representations and Generative Models for 3D Point Clouds [tensorflow] 🔥⭐
  • [NeurIPS2018] Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning [tensorflow][project page] ⭐🔥
  • [AAAI2018] Unsupervised Articulated Skeleton Extraction from Point Set Sequences Captured by a Single Depth Camera
  • [3DV2018] Parsing Geometry Using Structure-Aware Shape Templates
  • [SIGGRAPH2017] GRASS: Generative Recursive Autoencoders for Shape Structures [pytorch] 🔥
  • [TOG] Erosion Thickness on Medial Axes of 3D Shapes
  • [Vis Comput] Distance field guided L1-median skeleton extraction
  • [CGF] Contracting Medial Surfaces Isotropically for Fast Extraction of Centred Curve Skeletons
  • [CGF] Improved Use of LOP for Curve Skeleton Extraction
  • [SIGGRAPH Asia2015] Deep Points Consolidation [C++ & Qt]
  • [SIGGRAPH2015] Burning The Medial Axis
  • [SIGGRAPH2009] Curve Skeleton Extraction from Incomplete Point Cloud [matlab] ⭐
  • [TOG] SDM-NET: deep generative network for structured deformable mesh
  • [TOG] Robust and Accurate Skeletal Rigging from Mesh Sequences 🔥
  • [TOG] L1-medial skeleton of point cloud [C++] 🔥
  • [EUROGRAPHICS2016] 3D Skeletons: A State-of-the-Art Report 🔥
  • [SGP2012] Mean Curvature Skeletons [C++] 🔥
  • [SMIC2010] Point Cloud Skeletons via Laplacian-Based Contraction [Matlab] 🔥

Shape & Scene Completion

  • [ECCV2022] CompNVS: Novel View Synthesis with Scene Completion
  • [ECCV2022] PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation [Project]
  • [Arxiv] SRPCN: Structure Retrieval based Point Completion Network
  • [ICRA2022] Temporal Point Cloud Completion with Pose Disturbance
  • [Arxiv] Towards realistic symmetry-based completion of previously unseen point clouds [github]

Before 2022

  • [AAAI2022] Not All Voxels Are Equal: Semantic Scene Completion from the Point-Voxel Perspective
  • [AAAI2022] Attention-based Transformation from Latent Features to Point Clouds
  • [Arxiv] MonoScene: Monocular 3D Semantic Scene Completion [Project]
  • [Arxiv] Semi-supervised Implicit Scene Completion from Sparse LiDAR [github]
  • [NeurIPS2021] Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion [github]
  • [Arxiv] PU-Transformer: Point Cloud Upsampling Transformer
  • [BMVC2021] Self-Supervised Point Cloud Completion via Inpainting
  • [IROS2021] Graph-Guided Deformation for Point Cloud Completion
  • [IROS2021] Semantic Segmentation-assisted Scene Completion for LiDAR Point Clouds [github]
  • [Arxiv] 3D Point Cloud Completion with Geometric-Aware Adversarial Augmentation
  • [Arxiv] PC2-PU: Patch Correlation and Position Correction for Effective Point Cloud Upsampling
  • [ICCV2021] Voxel-based Network for Shape Completion by Leveraging Edge Generation [github]
  • [ICCV2021] PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers [github]
  • [ICCV2021] SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer [github]
  • [Arxiv] CarveNet: Carving Point-Block for Complex 3D Shape Completion
  • [IJCAI2021] IMENet: Joint 3D Semantic Scene Completion and 2D Semantic Segmentation through Iterative Mutual Enhancement
  • [CVPR2021] Point Cloud Upsampling via Disentangled Refinement [github]
  • [TVCG2021] Consistent Two-Flow Network for Tele-Registration of Point Clouds [Project]
  • [Arxiv] 4DComplete: Non-Rigid Motion Estimation Beyond the Observable Surface [Project]
  • [CVPR2021] Unsupervised 3D Shape Completion through GAN Inversion [Project]
  • [Arxiv] ASFM-Net: Asymmetrical Siamese Feature Matching Network for Point Completion
  • [CVPR2021] Variational Relational Point Completion Network [Project]
  • [CVPR2021] View-Guided Point Cloud Completion
  • [CVPR2021] Semantic Scene Completion via Integrating Instances and Scene in-the-Loop [pytorch]
  • [CVPR2021] Denoise and Contrast for Category Agnostic Shape Completion
  • [CVPR2021] Cycle4Completion: Unpaired Point Cloud Completion using Cycle Transformation with Missing Region Coding
  • [CVPR2021] PMP-Net: Point Cloud Completion by Learning Multi-step Point Moving Paths
  • [CVPR2021] Style-based Point Generator with Adversarial Rendering for Point Cloud Completion
  • [Arxiv] VPC-Net: Completion of 3D Vehicles from MLS Point Clouds

Before 2021

  • [Arxiv] PMP-Net: Point Cloud Completion by Learning Multi-step Point Moving Paths
  • [Arxiv] S3CNet: A Sparse Semantic Scene Completion Network for LiDAR Point Clouds
  • [Arxiv] Semantic Scene Completion using Local Deep Implicit Functions on LiDAR Data
  • [Arxiv] Learning-based 3D Occupancy Prediction for Autonomous Navigation in Occluded Environments
  • [3DV2020] SCFusion: Real-time Incremental Scene Reconstruction with Semantic Completion
  • [Arxiv] Refinement of Predicted Missing Parts Enhance Point Cloud Completion [pytorch]
  • [Arxiv] Unsupervised Partial Point Set Registration via Joint Shape Completion and Registration
  • [Arxiv] LMSCNet: Lightweight Multiscale 3D Semantic Completion [Demo]
  • [ECCV2020] SoftPoolNet: Shape Descriptor for Point Cloud Completion and Classification
  • [ECCV2020] Weakly-supervised 3D Shape Completion in the Wild
  • [Arxiv] Point Cloud Completion by Learning Shape Priors
  • [Arxiv] KAPLAN: A 3D Point Descriptor for Shape Completion
  • [Arxiv] SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans
  • [Arxiv] GRNet: Gridding Residual Network for Dense Point Cloud Completion
  • [Arxiv] Deep Octree-based CNNs with Output-Guided Skip Connections for 3D Shape and Scene Completion
  • [CVPR2020] Point Cloud Completion by Skip-attention Network with Hierarchical Folding
  • [CVPR2020] Cascaded Refinement Network for Point Cloud Completion [github]
  • [CVPR2020] Anisotropic Convolutional Networks for 3D Semantic Scene Completion [github]
  • [AAAI2020] Attention-based Multi-modal Fusion Network for Semantic Scene Completion
  • [CVPR2020] 3D Sketch-aware Semantic Scene Completion via Semi-supervised Structure Prior [github]
  • [ECCV2020] Multimodal Shape Completion via Conditional Generative Adversarial Networks [pytorch]
  • [CVPR2020] RevealNet: Seeing Behind Objects in RGB-D Scans
  • [CVPR2020] Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion
  • [CVPR2020] PF-Net: Point Fractal Network for 3D Point Cloud Completion
  • [Arxiv] 3D Gated Recurrent Fusion for Semantic Scene Completion
  • [ICCVW2019] EdgeConnect: Structure Guided Image Inpainting using Edge Prediction [pytorch] 🔥⭐
  • [ICRA2020] Depth Based Semantic Scene Completion with Position Importance Aware Loss
  • [CVPR2020] SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans
  • [Arxiv] PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes
  • [ICLR2020] Unpaired Point Cloud Completion on Real Scans using Adversarial Training [tensorflow]
  • [AAAI2020] Morphing and Sampling Network for Dense Point Cloud Completion [pytorch]
  • [ICCVW2019] Render4Completion: Synthesizing Multi-View Depth Maps for 3D Shape Completion
  • [ICCV2019] ForkNet: Multi-branch Volumetric Semantic Completion from a Single Depth Image [tensorflow]
  • [ICCV2019] Cascaded Context Pyramid for Full-Resolution 3D Semantic Scene Completion [Caffe3D]
  • [ICCV2019] Multi-Angle Point Cloud-VAE: Unsupervised Feature Learning for 3D Point Clouds from Multiple Angles by Joint Self-Reconstruction and Half-to-Half Prediction
  • [Arxiv] EdgeNet: Semantic Scene Completion from RGB-D images
  • [CVPR2019] TopNet: Structural Point Cloud Decoder [pytorch & tensorflow]
  • [CVPR2019] Deep Reinforcement Learning of Volume-guided Progressive View Inpainting for 3D Point Scene Completion from a Single Depth Image
  • [CVPR2019] Leveraging Shape Completion for 3D Siamese Tracking [pytorch]
  • [CVPR2019] RL-GAN-Net: A Reinforcement Learning Agent Controlled GAN Network for Real-Time Point Cloud Shape Completion [pytorch]
  • [3DV2018] PCN: Point Completion Network [tensorflow] 🔥
  • [ECCV2018] Efficient Semantic Scene Completion Network with Spatial Group Convolution [pytorch]
  • [CVPR2018] ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans [tensorflow] 🔥⭐
  • [CVPR2018] Learning 3D Shape Completion from Laser Scan Data with Weak Supervision [torch][torch]
  • [IJCV2018] Learning 3D Shape Completion under Weak Supervision [torch][torch]
  • [ICCV2017] High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference ⭐
  • [ICCV2017] Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis [torch] 🔥⭐
  • [CVPR2017] Semantic Scene Completion from a Single Depth Image [caffe] 🔥⭐
  • [CVPR2016] Structured Prediction of Unobserved Voxels From a Single Depth Image [resource] ⭐

Shape Reconstruction & Generation

  • [Arxiv] PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved Text-to-Image Diffusion [Project]
  • [Arxiv] 3D-aware Image Generation using 2D Diffusion Models [Project]
  • [Arxiv] HyperDiffusion: Generating Implicit Neural Fields with Weight-Space Diffusion [Project]
  • [Arxiv] DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model [Project]
  • [Arxiv] Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior [Project]
  • [Arxiv] RealFusion: 360° Reconstruction of Any Object from a Single Image [Project]
  • [Arxiv] 3DGen: Triplane Latent Diffusion for Textured Mesh Generation
  • [Arxiv] Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation [Project]
  • [CVPR2023] Controllable Mesh Generation Through Sparse Latent Point Diffusion Models [Project]
  • [CVPR2023] NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images [Project]
  • [ICLR2023] MeshDiffusion: Score-based Generative 3D Mesh Modeling [Project]
  • [CVPR2023] PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision [Project]
  • [Arxiv] Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions [Project]
  • [CVPR2023] SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field [Project]
  • [Arxiv] NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction
  • [Arxiv] Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement [Project]
  • [Arxiv] 3D generation on ImageNet [Project]
  • [Arxiv] Text-driven Visual Synthesis with Latent Diffusion Prior [Project]
  • [Arxiv] VQ3D: Learning a 3D-Aware Generative Model on ImageNet [Project]
  • [Arxiv] TEXTure: Text-Guided Texturing of 3D Shapes [Project]
  • [Arxiv] LEGO-Net: Learning Regular Rearrangements of Objects in Rooms [Project]
  • [Arxiv] DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis [Project]
  • [Arxiv] GeoCode: Interpretable Shape Programs [Project]
  • [Arxiv] Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models [Project]
  • [Arxiv] Point-E: A System for Generating 3D Point Clouds from Complex Prompts [Project]
  • [Arxiv] LoopDraw: a Loop-Based Autoregressive Model for Shape Synthesis and Editing
  • [Arxiv] SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation [Project]
  • [Arxiv] NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors
  • [Arxiv] Diffusion-SDF: Text-to-Shape via Voxelized Diffusion [Project]
  • [Arxiv] 3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models
  • [Arxiv] Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation [Project]
  • [Arxiv] SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction [Project]
  • [Arxiv] 3D Neural Field Generation using Triplane Diffusion [Project]
  • [Arxiv] Neural Volumetric Mesh Generator
  • [Arxiv] Tetrahedral Diffusion Models for 3D Shape Generation
  • [Arxiv] MagicPony: Learning Articulated 3D Animals in the Wild [Project]
  • [Arxiv] RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation [Project]
  • [Arxiv] Magic3D: High-Resolution Text-to-3D Content Creation [Project]
  • [Arxiv] Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
  • [NeurIPS2022] LION: Latent Point Diffusion Models for 3D Shape Generation [Project]
  • [NeurIPS2022] GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images [Project]
  • [ECCV2022] Cross-Modal 3D Shape Generation and Manipulation [Project]
  • [ECCV2022] Deforming Radiance Fields with Cages
  • [NeurIPS2021] NeRS: Neural Reflectance Surfaces for Sparse-view 3D Reconstruction in the Wild [Project]
  • [CVPR2022] CLIP-Forge: Towards Zero-Shot Text-to-Shape Generation [github]
  • [CVPR2022] Multi-View Mesh Reconstruction with Neural Deferred Shading [Project]
  • [Arxiv] Neural Surface Reconstruction of Dynamic Scenes with Monocular RGB-D Camera [Project]
  • [Arxiv] Unbiased 4D: Monocular 4D Reconstruction with a Neural Deformation Model
  • [Arxiv] 3DILG: Irregular Latent Grids for 3D Generative Modeling [Project]
  • [CVPR2022] FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction [Project]
  • [CVPR2022] Topologically-Aware Deformation Fields for Single-View 3D Reconstruction [Project]
  • [Arxiv] Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues [Project]
  • [Arxiv] Neural Vector Fields for Surface Representation and Inference
  • [CVPR2022] Pre-train, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction [Project]
  • [CVPR2022] BARC: Learning to Regress 3D Dog Shape from Images by Exploiting Breed Information [Project]
  • [CVPR2022] φ-SfT: Shape-from-Template with a Physics-Based Deformation Model [Project]
  • [CVPR2022] OcclusionFusion: Occlusion-aware Motion Estimation for Real-time Dynamic 3D Reconstruction [Project]
  • [Arxiv] Neural Dual Contouring
  • [Arxiv] POCO: Point Convolution for Surface Reconstruction [Project]
  • [ICCV2021] SurfGen: Adversarial 3D Shape Synthesis with Explicit Surface Discriminators [github]

Before 2022

  • [Arxiv] DoodleFormer: Creative Sketch Drawing with Transformers
  • [NeurIPS2021] Class-agnostic Reconstruction of Dynamic Objects from Videos [Project]
  • [Arxiv] The Shape Part Slot Machine: Contact-based Reasoning for Generating 3D Shapes from Parts
  • [Arxiv] MeshUDF: Fast and Differentiable Meshing of Unsigned Distance Field Networks [github]
  • [Arxiv] TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers [github]
  • [Arxiv] JoinABLe: Learning Bottom-up Assembly of Parametric CAD Joints
  • [Arxiv] Image Based Reconstruction of Liquids from 2D Surface Detections
  • [Arxiv] TaylorImNet for Fast 3D Shape Reconstruction Based on Implicit Surface Function
  • [NeurIPS2021] Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis [Project]
  • [ICML2021] Neural-Pull: Learning Signed Distance Functions from Point Clouds by Learning to Pull Space onto Surfaces [tensorflow]
  • [Arxiv] StyleSDF: High-Resolution 3D-Consistent Image and Geometry Generation [Project]
  • [3DV2021] High Fidelity 3D Reconstructions with Limited Physical Views [Project]
  • [3DV2021] Multi-Category Mesh Reconstruction From Image Collections [github]
  • [Arxiv] Style Agnostic 3D Reconstruction via Adversarial Style Transfer [https://github.com/Felix-Petersen/style-agnostic-3d-reconstruction]
  • [Arxiv] BANMo: Building Animatable 3D Neural Models from Many Casual Videos [Project]
  • [Arxiv] EditVAE: Unsupervised Part-Aware Controllable 3D Point Cloud Shape Generation
  • [Arxiv] Differentiable Stereopsis: Meshes from multiple views using differentiable rendering [Project]
  • [ICCV2021] Neural Strokes: Stylized Line Drawing of 3D Shapes
  • [ACMMM2021] Single Image 3D Object Estimation with Primitive Graph Networks
  • [Arxiv] Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
  • [Arxiv] ABO: Dataset and Benchmarks for Real-World 3D Object Understanding [Project]
  • [ICCV2021] Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction [github]
  • [Arxiv] Learnable Triangulation for Deep Learning-based 3D Reconstruction of Objects of Arbitrary Topology from Single RGB Images
  • [ICCV2021] Learning Signed Distance Field for Multi-view Surface Reconstruction
  • [Arxiv] Image2Lego: Customized LEGO Set Generation from Images
  • [ICCV2021] Unsupervised Learning of Fine Structure Generation for 3D Point Clouds by 2D Projection Matching [github]
  • [Arxiv] Object Wake-up: 3-D Object Reconstruction, Animation, and in-situ Rendering from a Single Image
  • [Arxiv] DOVE: Learning Deformable 3D Objects by Watching Videos [Project]
  • [Arxiv] Active 3D Shape Reconstruction from Vision and Touch
  • [NeurIPS2020] 3D Shape Reconstruction from Vision and Touch [pytorch]
  • [Arxiv] LegoFormer: Transformers for Block-by-Block Multi-view 3D Reconstruction
  • [Arxiv] Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects
  • [Arxiv] View Generalization for Single Image Textured 3D Models [Project]
  • [Arxiv] Shape As Points: A Differentiable Poisson Solver
  • [Arxiv] Neural Implicit 3D Shapes from Single Images with Spatial Patterns
  • [IJCAI2021] Spline Positional Encoding for Learning 3D Implicit Signed Distance Fields
  • [Arxiv] Z2P: Instant Rendering of Point Clouds
  • [CVPR2021] Multi-view 3D Reconstruction of a Texture-less Smooth Surface of Unknown Generic Reflectance
  • [CVPR2021] Birds of a Feather: Capturing Avian Shape Models from Images [Project]
  • [Arxiv] DeepCAD: A Deep Generative Network for Computer-Aided Design Models
  • [Arxiv] StrobeNet: Category-Level Multiview Reconstruction of Articulated Objects
  • [CVPR2021] Sketch2Model: View-Aware 3D Modeling from Single Free-Hand Sketches
  • [Arxiv] Sign-Agnostic CONet: Learning Implicit Surface Reconstructions by Sign-Agnostic Optimization of Convolutional Occupancy Networks
  • [IJCAI2021] PointLIE: Locally Invertible Embedding for Point Cloud Sampling and Recovery
  • [Arxiv] UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction
  • [CVPR2021] Shape and Material Capture at Home
  • [CVPR2021] StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision [Project]
  • [Arxiv] CAPRI-Net: Learning Compact CAD Shapes with Adaptive Primitive Assembly
  • [CVPR2021] Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction [Project]
  • [CVPR2021] Online Learning of a Probabilistic and Adaptive Scene Representation
  • [CVPR2021] Fostering Generalization in Single-view 3D Reconstruction by Learning a Hierarchy of Local and Global Shape Priors
  • [Arxiv] Sketch2Mesh: Reconstructing and Editing 3D Shapes from Sketches
  • [CVPR2021] Deep Implicit Moving Least-Squares Functions for 3D Reconstruction [Project]
  • [Arxiv] PC2WF: 3D Wireframe Reconstruction from Raw Point Clouds
  • [CVPR2021] Diffusion Probabilistic Models for 3D Point Cloud Generation [Project]
  • [Arxiv] ShaRF: Shape-conditioned Radiance Fields from a Single View [Project]
  • [Arxiv] Shelf-Supervised Mesh Prediction in the Wild
  • [Arxiv] HyperPocket: Generative Point Cloud Completion
  • [Arxiv] Im2Vec: Synthesizing Vector Graphics without Vector Supervision [resource]
  • [Arxiv] Secrets of 3D Implicit Object Shape Reconstruction in the Wild
  • [Arxiv] Joint Learning of 3D Shape Retrieval and Deformation
  • [Arxiv] Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes

Before 2021

  • [Arxiv] Learning Delaunay Surface Elements for Mesh Reconstruction
  • [Arxiv] Compositionally Generalizable 3D Structure Prediction
  • [Arxiv] Online Adaptation for Consistent Mesh Reconstruction in the Wild
  • [Arxiv] Sign-Agnostic Implicit Learning of Surface Self-Similarities for Shape Modeling and Reconstruction from Raw Point Clouds
  • [Arxiv] Deep Optimized Priors for 3D Shape Modeling and Reconstruction
  • [Arxiv] Do 2D GANs Know 3D Shape? Unsupervised 3D Shape Reconstruction from 2D Image GANs [Project]
  • [Arxiv] DUDE: Deep Unsigned Distance Embeddings for Hi-Fidelity Representation of Complex 3D Surfaces
  • [3DV2020] Learning to Infer Semantic Parameters for 3D Shape Editing [Project]
  • [3DV2020] Cycle-Consistent Generative Rendering for 2D-3D Modality Translation [Project]
  • [3DV2020] A Divide et Impera Approach for 3D Shape Reconstruction from Multiple Views
  • [Arxiv] A Closed-Form Solution to Local Non-Rigid Structure-from-Motion
  • [Arxiv] Deformed Implicit Field: Modeling 3D Shapes with Learned Dense Correspondence
  • [Arxiv] D-NeRF: Neural Radiance Fields for Dynamic Scenes
  • [Arxiv] Modular Primitives for High-Performance Differentiable Rendering
  • [CVPR2021] NeuralFusion: Online Depth Fusion in Latent Space
  • [Arxiv] Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video [Project]
  • [NeurIPS2020] Continuous Object Representation Networks: Novel View Synthesis without Target View Supervision [Project]
  • [NeurIPS2020] SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images [Project]
  • [NeurIPS2020] Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance [Project]
  • [NeurIPS2020] Convolutional Generation of Textured 3D Meshes [Project]
  • [Arxiv] Vid2CAD: CAD Model Alignment using Multi-View Constraints from Videos
  • [NeurIPS2020] UCLID-Net: Single View Reconstruction in Object Space [Project]
  • [NeurIPS2020] CaSPR: Learning Canonical Spatiotemporal Point Cloud Representations [Project]
  • [NeurIPS2020] Generative 3D Part Assembly via Dynamic Graph Learning [pytorch]
  • [NeurIPS2020] Learning Deformable Tetrahedral Meshes for 3D Reconstruction [Project]
  • [NeurIPS2020] SoftFlow: Probabilistic Framework for Normalizing Flow on Manifolds [pytorch]
  • [Arxiv] Training Data Generating Networks: Linking 3D Shapes and Few-Shot Classification
  • [Arxiv] MeshMVS: Multi-View Stereo Guided Mesh Reconstruction
  • [Arxiv] Learning Occupancy Function from Point Clouds for Surface Reconstruction
  • [Arxiv] GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering [github]
  • [3DV2020] A Progressive Conditional Generative Adversarial Network for Generating Dense and Colored 3D Point Clouds
  • [3DV2020] Better Patch Stitching for Parametric Surface Reconstruction
  • [NeurIPS2020] Skeleton-bridged Point Completion: From Global Inference to Local Adjustment [Project Page]
  • [Arxiv] NeRF++: Analyzing and Improving Neural Radiance Fields [pytorch]
  • [Arxiv] Improved Modeling of 3D Shapes with Multi-view Depth Maps
  • [SIGGRAPH2020] One Shot 3D Photography [Project]
  • [BMVC2020] Large Scale Photometric Bundle Adjustment
  • [ECCV2020] Interactive Annotation of 3D Object Geometry using 2D Scribbles [Project]
  • [BMVC2020] Visibility-aware Multi-view Stereo Network
  • [ECCV2020] Pix2Surf: Learning Parametric 3D Surface Models of Objects from Images
  • [ECCV2020] 3D Bird Reconstruction: a Dataset, Model, and Shape Recovery from a Single View [Project][Pytorch]
  • [BMVC2020] 3D-GMNet: Single-View 3D Shape Recovery as A Gaussian Mixture
  • [SIGGRAPH2020] Self-Sampling for Neural Point Cloud Consolidation
  • [ECCV2020] Stochastic Bundle Adjustment for Efficient and Scalable 3D Reconstruction [github]
  • [Arxiv] NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections [Project]
  • [Arxiv] MeshODE: A Robust and Scalable Framework for Mesh Deformation
  • [Arxiv] MRGAN: Multi-Rooted 3D Shape Generation with Unsupervised Part Disentanglement
  • [ECCV2020] Meshing Point Clouds with Predicted Intrinsic-Extrinsic Ratio Guidance [pytorch]
  • [ECCV2020] Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop
  • [ECCV2020] Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking
  • [ECCV2020] Shape and Viewpoint without Keypoints
  • [Arxiv] Object-Centric Multi-View Aggregation
  • [ECCV2020] Points2Surf: Learning Implicit Surfaces from Point Clouds
  • [NeurIPS2020] Neural Mesh Flow: 3D Manifold Mesh Generation via Diffeomorphic Flows [Project]
  • [Arxiv] Pix2Vox++: Multi-scale Context-aware 3D Object Reconstruction from Single and Multiple Images
  • [Arxiv] Neural Non-Rigid Tracking
  • [NeurIPS2020] MeshSDF: Differentiable Iso-Surface Extraction
  • [Arxiv] 3D Reconstruction of Novel Object Shapes from Single Images
  • [NeurIPS2020] ShapeFlow: Learnable Deformations Among 3D Shapes [pytorch]
  • [Arxiv] 3D Shape Reconstruction from Free-Hand Sketches
  • [Arxiv] Convolutional Occupancy Networks
  • [SIGGRAPH2020] Point2Mesh: A Self-Prior for Deformable Meshes
  • [Arxiv] PointTriNet: Learned Triangulation of 3D Point Sets
  • [Arxiv] A Simple and Scalable Shape Representation for 3D Reconstruction
  • [SIGGRAPH2020] Vid2Curve: Simultaneously Camera Motion Estimation and Thin Structure Reconstruction from an RGB Video
  • [CVPR2020] From Image Collections to Point Clouds with Self-supervised Shape and Pose Networks [tensorflow]
  • [CVPR2020] Through the Looking Glass: Neural 3D Reconstruction of Transparent Shapes [github]
  • [Arxiv] PolyGen: An Autoregressive Generative Model of 3D Meshes
  • [Arxiv] Combinatorial 3D Shape Generation via Sequential Assembly
  • [Arxiv] Few-Shot Single-View 3-D Object Reconstruction with Compositional Priors
  • [Arxiv] Neural Object Descriptors for Multi-View Shape Reconstruction
  • [CVPR2020] SPARE3D: A Dataset for SPAtial REasoning on Three-View Line Drawings [pytorch]
  • [Arxiv] Modeling 3D Shapes by Reinforcement Learning
  • [ECCV2020] ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds [pytorch]
  • [Arxiv] Self-Supervised 2D Image to 3D Shape Translation with Disentangled Representations
  • [Arxiv] Universal Differentiable Renderer for Implicit Neural Representations
  • [Arxiv] Learning 3D Part Assembly from a Single Image
  • [Arxiv] Curriculum DeepSDF
  • [Arxiv] PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions
  • [Arxiv] Self-supervised Single-view 3D Reconstruction via Semantic Consistency
  • [Arxiv] Meta3D: Single-View 3D Object Reconstruction from Shape Priors in Memory
  • [Arxiv] STD-Net: Structure-preserving and Topology-adaptive Deformation Network for 3D Reconstruction from a Single Image [new]
  • [Arxiv] Curvature Regularized Surface Reconstruction from Point Cloud
  • [Arxiv] Hypernetwork approach to generating point clouds
  • [Arxiv] Inverse Graphics GAN: Learning to Generate 3D Shapes from Unstructured 2D Data
  • [Arxiv] Meshlet Priors for 3D Mesh Reconstruction
  • [Arxiv] Front2Back: Single View 3D Shape Reconstruction via Front to Back Prediction
  • [Arxiv] SDFDiff: Differentiable Rendering of Signed Distance Fields for 3D Shape Optimization
  • [CVPR2019] Occupancy Networks: Learning 3D Reconstruction in Function Space [pytorch] 🔥⭐
  • [NeurIPS2019] DISN: Deep Implicit Surface Network for High-quality Single-view 3D Reconstruction [tensorflow]
  • [NeurIPS2019] Learning to Infer Implicit Surfaces without 3D Supervision
  • [CVPR2019] A Skeleton-bridged Deep Learning Approach for Generating Meshes of Complex Topologies from Single RGB Images [pytorch & tensorflow]
  • [Arxiv] Deep Level Sets: Implicit Surface Representations for 3D Shape Inference
  • [CVPR2019] Learning Implicit Fields for Generative Shape Modeling [tensorflow] 🔥
  • [ICCV2019] Point-based Multi-view Stereo Network [pytorch] ⭐
  • [Arxiv] TSRNet: Scalable 3D Surface Reconstruction Network for Point Clouds using Tangent Convolution
  • [Arxiv] DR-KFD: A Differentiable Visual Metric for 3D Shape Reconstruction
  • [ICCV2019] GraphX-Convolution for Point Cloud Deformation in 2D-to-3D Conversion
  • [ICCV2019] Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation [pytorch]
  • [ICCV2019] Few-Shot Generalization for Single-Image 3D Reconstruction via Priors
  • [ICCV2019] Deep Mesh Reconstruction from Single RGB Images via Topology Modification Networks
  • [AAAI2018] Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction [tensorflow] ⭐🔥
  • [NeurIPS2017] MarrNet: 3D Shape Reconstruction via 2.5D Sketches [torch] ⭐🔥

3D Scene Understanding

  • [Arxiv] CLIP-FO3D: Learning Free Open-world 3D Scene Representations from 2D Dense CLIP
  • [CVPR2023] Learning 3D Scene Priors with 2D Supervision [Project]
  • [CVPR2023] Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors
  • [Arxiv] Decoupling Human and Camera Motion from Videos in the Wild [Project]
  • [CVPR2022] PhotoScene: Photorealistic Material and Lighting Transfer for Indoor Scenes [github]
  • [Arxiv] Semantic Instance Segmentation of 3D Scenes Through Weak Bounding Box Supervision [Project]
  • [CVPR2022] Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation [github]
  • [CVPR2022] 3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection
  • [CVPR2022] BEHAVE: Dataset and Method for Tracking Human Object Interactions [Project]

Before 2022

  • [Arxiv] Transferable End-to-end Room Layout Estimation via Implicit Encoding [Project]
  • [Arxiv] ScanQA: 3D Question Answering for Spatial Scene Understanding
  • [Arxiv] 3D Question Answering
  • [Arxiv] MVLayoutNet: 3D layout reconstruction with multi-view panoramas
  • [SGP2021] Roominoes: Generating Novel 3D Floor Plans From Existing 3D Rooms
  • [Arxiv] 4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding
  • [Arxiv] Pose2Room: Understanding 3D Scenes from Human Activities [Project]
  • [NeurIPS2021] SEAL: Self-supervised Embodied Active Learning using Exploration and 3D Consistency [Project]
  • [Arxiv] D3Net: A Speaker-Listener Architecture for Semi-supervised Dense Captioning and Visual Grounding in RGB-D Scans [Project]
  • [Arxiv] Recognizing Scenes from Novel Viewpoints
  • [Arxiv] Putting 3D Spatially Sparse Networks on a Diet
  • [Arxiv] Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing [github]
  • [NeurIPS2021] Neural Scene Flow Prior [github]
  • [ICCV2021] Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images [Project]
  • [Arxiv] RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View
  • [EMNLP2021] Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments [Project]
  • [Arxiv] KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D [Project]
  • [CVPR2021] OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets [github]
  • [Arxiv] Pointly-supervised 3D Scene Parsing with Viewpoint Bottleneck [github]
  • [TPAMI2021] Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [github]
  • [Arxiv] PQ-Transformer: Jointly Parsing 3D Objects and Layouts from Point Clouds [github]
  • [Arxiv] Residual 3D Scene Flow Learning with Context-Aware Feature Extraction
  • [ICCV2021] Learning to Generate Scene Graph from Natural Language Supervision [github]
  • [ICCV2021] The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation [Project]
  • [ICCV2021] Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs
  • [ICCV2021] PICCOLO: Point Cloud-Centric Omnidirectional Localization
  • [ICCV2021] Unconditional Scene Graph Generation
  • [Arxiv] Learning Indoor Layouts from Simple Point-Clouds
  • [Arxiv] LanguageRefer: Spatial-Language Model for 3D Visual Grounding
  • [Arxiv] WiCluster: Passive Indoor 2D/3D Positioning using WiFi without Precise Labels
  • [CVPR2021] Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts [github]
  • [ICRA2021] Efficient and Robust LiDAR-Based End-to-End Navigation [Project]
  • [ICLR2021] VTNet: Visual Transformer Network for Object Goal Navigation
  • [CVPR2021] Self-Point-Flow: Self-Supervised Scene Flow Estimation from Point Clouds with Optimal Transport and Random Walk
  • [CVPR2021] HCRF-Flow: Scene Flow from Point Clouds with Continuous High-order CRFs and Position-aware Flow Embedding
  • [Arxiv] FloorPlanCAD: A Large-Scale CAD Drawing Dataset for Panoptic Symbol Spotting
  • [Arxiv] SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation
  • [Arxiv] Collision Replay: What Does Bumping Into Things Tell You About Scene Geometry? [Project]
  • [Arxiv] Pri3D: Can 3D Priors Help 2D Representation Learning?
  • [Arxiv] LaLaLoc: Latent Layout Localisation in Dynamic, Unvisited Environments
  • [CVPRW] OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas [github]
  • [Arxiv] Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image [pytorch]
  • [Arxiv] SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds with 1000× Fewer Labels [github]
  • [CVPR2021] FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds
  • [CVPR2021] Free-form Description Guided 3D Visual Graph Network for Object Grounding in Point Cloud [github]
  • [ICRA] Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments [Project]
  • [Arxiv] Contextual Scene Augmentation and Synthesis via GSACNet
  • [Arxiv] In-Place Scene Labelling and Understanding with Implicit Scene Representation
  • [CVPR2021] Bidirectional Projection Network for Cross Dimension Scene Understanding [github]
  • [Arxiv] Free-form Description Guided 3D Visual Graph Network for Object Grounding in Point Cloud [github]
  • [CVPR2021] Visual Room Rearrangement [Project]
  • [Arxiv] MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans
  • [Arxiv] Structured Scene Memory for Vision-Language Navigation
  • [Arxiv] House-GAN++: Generative Adversarial Layout Refinement Networks
  • [Arxiv] Weakly Supervised Learning of Rigid 3D Scene Flow
  • [ICLR2021] End-to-End Egospheric Spatial Memory
  • [Arxiv] Single-Shot Cuboids: Geodesics-based End-to-end Manhattan Aligned Layout Estimation from Spherical Panoramas [Project]
  • [Arxiv] A modular vision language navigation and manipulation framework for long horizon compositional tasks in indoor environment
  • [Arxiv] Deep Reinforcement Learning for Producing Furniture Layout in Indoor Scenes
  • [Arxiv] Where2Act: From Pixels to Actions for Articulated 3D Objects [Project]

Before 2021

  • [Arxiv] PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things
  • [Arxiv] AI2-THOR: An Interactive 3D Environment for Visual AI [Project]
  • [Arxiv] Audio-Visual Floorplan Reconstruction
  • [Arxiv] PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds
  • [Arxiv] RAFT-3D: Scene Flow using Rigid-Motion Embeddings
  • [Arxiv] GenScan: A Generative Method for Populating Parametric 3D Scan Datasets
  • [Arxiv] LayoutGMN: Neural Graph Matching for Structural Layout Similarity
  • [Arxiv] Seeing Behind Objects for 3D Multi-Object Tracking in RGB-D Sequences
  • [Arxiv] P4Contrast: Contrastive Learning with Pairs of Point-Pixel Pairs for RGB-D Scene Understanding
  • [Arxiv] Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net
  • [Arxiv] Localising In Complex Scenes Using Balanced Adversarial Adaptation
  • [Arxiv] Efficient RGB-D Semantic Segmentation for Indoor Scene Analysis
  • [NeurIPS2020] Multi-Plane Program Induction with 3D Box Priors [Project]
  • [Arxiv] HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features
  • [Arxiv] Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
  • [Arxiv] Generative Layout Modeling using Constraint Graphs
  • [NeurIPS2020] Rel3D: A Minimally Contrastive Benchmark for Grounding Spatial Relations in 3D [pytorch]
  • [NeurIPS2020] Learning Affordance Landscapes for Interaction Exploration in 3D Environments [Project]
  • [NeurIPS2020W] Unsupervised Domain Adaptation for Visual Navigation
  • [Arxiv] Embodied Visual Navigation with Automatic Curriculum Learning in Real Environments
  • [Arxiv] 3D Room Layout Estimation Beyond the Manhattan World Assumption
  • [Arxiv] OpenBot: Turning Smartphones into Robots [Project]
  • [Arxiv] Audio-Visual Waypoints for Navigation
  • [Arxiv] Learning Affordance Landscapes for Interaction Exploration in 3D Environments [Project]
  • [ECCV2020] Occupancy Anticipation for Efficient Exploration and Navigation [Project]
  • [Arxiv] Retargetable AR: Context-aware Augmented Reality in Indoor Scenes based on 3D Scene Graph
  • [Arxiv] Generating Person-Scene Interactions in 3D Scenes
  • [Arxiv] GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes
  • [ECCV2020] ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes
  • [Arxiv] Structural Plan of Indoor Scenes with Personalized Preferences
  • [Arxiv] HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures [Project]
  • [CVPR2020] End-to-End Optimization of Scene Layout [Project]
  • [Arxiv] Improving Target-driven Visual Navigation with Attention on 3D Spatial Relationships
  • [CVPR2020] Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions
  • [Arxiv] LayoutMP3D: Layout Annotation of Matterport3D
  • [CVPR2020] Local Implicit Grid Representations for 3D Scenes
  • [Arxiv] Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes
  • [CVPR2020] RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds [tensorflow] 🔥
  • [CVPR2020] Intelligent Home 3D: Automatic 3D-House Design from Linguistic Descriptions Only
  • [ICRA2020] 3DCFS: Fast and Robust Joint 3D Semantic-Instance Segmentation via Coupled Feature Selection
  • [Arxiv] Indoor Scene Recognition in 3D
  • [Journal] Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense
  • [Arxiv] BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images
  • [Arxiv] 3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans [Project] Related: [Arxiv] [Arxiv]
  • [ICCV2019] U4D: Unsupervised 4D Dynamic Scene Understanding
  • [ICCV2019] UprightNet: Geometry-Aware Camera Orientation Estimation from Single Images
  • [ICCV2019] Habitat: A Platform for Embodied AI Research [habitat-api] [habitat-sim] ⭐
  • [ICCV2019] SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences [project page] ⭐
  • [ICCV2019] Neural Inverse Rendering of an Indoor Scene From a Single Image
  • [ICCV2019] SceneGraphNet: Neural Message Passing for 3D Indoor Scene Augmentation [pytorch]
  • [ICCV2019] RIO: 3D Object Instance Re-Localization in Changing Indoor Environments [dataset]
  • [ICCV2019] CamNet: Coarse-to-Fine Retrieval for Camera Re-Localization
  • [NeurIPS2018] Learning to Exploit Stability for 3D Scene Parsing

3D Scene Reconstruction & Generation

  • [CVPR2023] Neuralangelo: High-Fidelity Neural Surface Reconstruction [Project]
  • [Arxiv] Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion [Project]
  • [Arxiv] FastSurf: Fast Neural RGB-D Surface Reconstruction using Per-Frame Intrinsic Refinement and TSDF Fusion Prior Learning
  • [CVPR2023] I$^2$-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs [Project]
  • [Arxiv] CC3D: Layout-Conditioned Generation of Compositional 3D Scenes [Project]
  • [Arxiv] RICO: Regularizing the Unobservable for Indoor Compositional Reconstruction
  • [Arxiv] Learning a Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation
  • [Arxiv] Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models [Project]
  • [Arxiv] Compositional 3D Scene Generation using Locally Conditioned Diffusion [Project]
  • [Arxiv] Set-the-Scene: Global-Local Training for Generating Controllable NeRF Scenes [Project]
  • [BMVC2022] SPARC: Sparse Render-and-Compare for CAD model alignment in a single RGB image [github]
  • [Arxiv] NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM
  • [Arxiv] Text-To-4D Dynamic Scene Generation
  • [Arxiv] Behind the Scenes: Density Fields for Single View Reconstruction [Project]
  • [Arxiv] MIME: Human-Aware 3D Scene Generation [Project]
  • [CVPR2022] PlaneMVS: 3D Plane Reconstruction from Multi-View Stereo
  • [CVPR2022] Neural 3D Scene Reconstruction with the Manhattan-world Assumption [Project]
  • [CVPR2022] 3D Scene Painting via Semantic Image Synthesis
  • [Siggraph2022] SNeRF: Stylized Neural Implicit Representations for 3D Scenes [Project]
  • [Siggraph2022] Neural 3D Reconstruction in the Wild [Project]
  • [Arxiv] GO-Surf: Neural Feature Grid Optimization for Fast, High-Fidelity RGB-D Surface Reconstruction [Project]
  • [Arxiv] RayTran: 3D pose estimation and shape reconstruction of multiple objects from videos with ray-traced transformers
  • [Arxiv] iSDF: Real-Time Neural Signed Distance Fields for Robot Perception [Project]
  • [Arxiv] NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors [Project]
  • [CVPR2022] PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos [Project]
  • [CVPR2022] Learning 3D Object Shape and Layout without 3D Supervision [Project]
  • [Arxiv] MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction [Project]
  • [Arxiv] BlobGAN: Spatially Disentangled Scene Representations [Project]
  • [CVPR2022] NeRFusion: Fusing Radiance Fields for Large-Scale Scene Reconstruction
  • [Arxiv] ATEK: Augmenting Transformers with Expert Knowledge for Indoor Layout Synthesis

Before 2022

  • [Arxiv] IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo [github]
  • [Arxiv] What's Behind the Couch? Directed Ray Distance Functions (DRDF) for 3D Scene Reconstruction [Project]
  • [Arxiv] Input-level Inductive Biases for 3D Reconstruction
  • [Arxiv] ROCA: Robust CAD Model Retrieval and Alignment from a Single Image
  • [Arxiv] Multi-View Stereo with Transformer
  • [3DV2021] 3DVNet: Multi-View Depth Prediction and Volumetric Refinement
  • [Arxiv] VoRTX: Volumetric 3D Reconstruction With Transformers for Voxelwise View Selection and Fusion
  • [Arxiv] CIRCLE: Convolutional Implicit Reconstruction and Completion for Large-scale Indoor Scene
  • [Arxiv] Joint stereo 3D object detection and implicit surface reconstruction
  • [CoRL2021] TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [Project]
  • [NeurIPS2021] Voxel-based 3D Detection and Reconstruction of Multiple Objects from a Single Image [Project]
  • [NeurIPS2021] Panoptic 3D Scene Reconstruction From a Single RGB Image
  • [Arxiv] NICE-SLAM: Neural Implicit Scalable Encoding for SLAM [Project]
  • [BMVC2021] PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image [github]
  • [ICCV2021] Scene Synthesis via Uncertainty-Driven Attribute Synchronization [github]
  • [NeurIPS2021] ATISS: Autoregressive Transformers for Indoor Scene Synthesis [Project]
  • [ICCV2021] Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting
  • [Arxiv] Black-Box Test-Time Shape REFINEment for Single View 3D Reconstruction
  • [Arxiv] Indoor Scene Generation from a Collection of Semantic-Segmented Depth Images
  • [ICCV2021] Vis2Mesh: Efficient Mesh Reconstruction from Unstructured Point Clouds of Large Scenes with Learned Virtual View Visibility [github]
  • [ICCV2021] 3DIAS: 3D Shape Reconstruction with Implicit Algebraic Surfaces [Project]
  • [ICCV2021] VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction
  • [Arxiv] AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network
  • [Arxiv] NeuralMVS: Bridging Multi-View Stereo and Novel View Synthesis
  • [ICCV2021] Out-of-Core Surface Reconstruction via Global $TGV$ Minimization
  • [ICCV2021] Discovering 3D Parts from Image Collections [Project]
  • [ICCV2021] PlaneTR: Structure-Guided Transformers for 3D Plane Recovery [pytorch]
  • [Arxiv] TransformerFusion: Monocular RGB Scene Reconstruction using Transformers [Project]
  • [Arxiv] Indoor Panorama Planar 3D Reconstruction via Divide and Conquer
  • [Arxiv] NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction
  • [CVPR2021] Mirror3D: Depth Refinement for Mirror Surfaces [Project]
  • [CVPR2021] Plan2Scene: Converting Floorplans to 3D Scenes [Project]
  • [Arxiv] Translational Symmetry-Aware Facade Parsing for 3D Building Reconstruction
  • [Arxiv] Learning to Stylize Novel Views [Project]
  • [Arxiv] Stylizing 3D Scene via Implicit Representation and HyperNetwork
  • [CVPR2021] SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction from Video Data [Project]
  • [Arxiv] The Boombox: Visual Reconstruction from Acoustic Vibrations [Project]
  • [Arxiv] Joint Pose and Shape Estimation of Vehicles from LiDAR Data
  • [CVPR2021] NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video [Project]
  • [Arxiv] DDR-Net: Learning Multi-Stage Multi-View Stereo With Dynamic Depth Range [pytorch]
  • [Arxiv] Planar Surface Reconstruction from Sparse Views [Project]
  • [Arxiv] Neural RGB-D Surface Reconstruction
  • [Arxiv] RetrievalFuse: Neural 3D Scene Reconstruction with a Database
  • [ICCV2021] PlenOctrees for Real-time Rendering of Neural Radiance Fields [C++]
  • [Arxiv] iMAP: Implicit Mapping and Positioning in Real-Time
  • [CVPR2021] Monte Carlo Scene Search for 3D Scene Understanding
  • [CVPR2021] Holistic 3D Scene Understanding from a Single Image with Implicit Representation
  • [CVPR2021] RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction [pytorch]
  • [Arxiv] IBRNet: Learning Multi-View Image-Based Rendering [Project]
  • [Arxiv] STaR: Self-supervised Tracking and Reconstruction of Rigid Objects in Motion with Neural Rendering [Project]

Before 2021

  • [ToG2018] Deep convolutional priors for indoor scene synthesis [github]
  • [Arxiv] MO-LTR: Multiple Object Localization, Tracking and Reconstruction from Monocular RGB Videos
  • [Arxiv] DI-Fusion: Online Implicit 3D Reconstruction with Deep Priors
  • [3DV2020] Scene Flow from Point Clouds with or without Learning
  • [Arxiv] Stable View Synthesis
  • [Arxiv] Neural Scene Graphs for Dynamic Scenes
  • [3DV2020] RidgeSfM: Structure from Motion via Robust Pairwise Matching Under Depth Uncertainty [pytorch]
  • [Arxiv] FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation
  • [Arxiv] MoNet: Motion-based Point Cloud Prediction Network
  • [Arxiv] MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera
  • [Arxiv] Efficient Initial Pose-graph Generation for Global SfM
  • [Arxiv] Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes [Project]
  • [Arxiv] RGBD-Net: Predicting color and depth images for novel views synthesis
  • [Arxiv] SSCNav: Confidence-Aware Semantic Scene Completion for Visual Semantic Navigation [Project]
  • [Arxiv] From Points to Multi-Object 3D Reconstruction
  • [Arxiv] Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image [Project]
  • [Arxiv] SceneFormer: Indoor Scene Generation with Transformers [pytorch]
  • [NeurIPS2020] Neural Sparse Voxel Fields [Project]
  • [Arxiv] Towards Part-Based Understanding of RGB-D Scans
  • [Arxiv] Dynamic Plane Convolutional Occupancy Networks
  • [NeurIPS2020] Neural Unsigned Distance Fields for Implicit Function Learning [Project]
  • [Arxiv] Holistic static and animated 3D scene generation from diverse text descriptions [pytorch]
  • [Arxiv] Semi-Supervised Learning of Multi-Object 3D Scene Representations
  • [ECCV2020] CAD-Deform: Deformable Fitting of CAD Models to 3D Scans
  • [ECCV2020] Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve
  • [ECCV2020] Learnable Cost Volume Using the Cayley Representation
  • [ECCV2020] Topology-Change-Aware Volumetric Fusion for Dynamic Scene Reconstruction
  • [ECCV2020] Convolutional Occupancy Networks
  • [CVPR2020] MARMVS: Matching Ambiguity Reduced Multiple View Stereo for Efficient Large Scale Scene Reconstruction
  • [ECCV2020] CoReNet: Coherent 3D scene reconstruction from a single RGB image
  • [CVPR2020] DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes
  • [ECCV2020] SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans
  • [Arxiv] Removing Dynamic Objects for Static Scene Reconstruction using Light Fields
  • [Arxiv] Atlas: End-to-End 3D Scene Reconstruction from Posed Images
  • [Arxiv] Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes
  • [Arxiv] Plane Pair Matching for Efficient 3D View Registration
  • [CVPR2020] Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image [pytorch]
  • [Arxiv] Indoor Layout Estimation by 2D LiDAR and Camera Fusion
  • [Arxiv] General 3D Room Layout from a Single View by Render-and-Compare
  • [ICCV2019] Learning to Reconstruct 3D Manhattan Wireframes from a Single Image
  • [CVPR2019] PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image [pytorch] 🔥
  • [ICCV2019] 3D Scene Reconstruction with Multi-layer Depth and Epipolar Transformers
  • [ICCV Workshop2019] Silhouette-Assisted 3D Object Instance Reconstruction from a Cluttered Scene
  • [ICCV2019] 3D-RelNet: Joint Object and Relation Network for 3D prediction [pytorch]
  • [3DV2019] Pano Popups: Indoor 3D Reconstruction with a Plane-Aware Network
  • [CVPR2018] Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene [pytorch]
  • [IROS2017] Indoor Scan2BIM: Building Information Models of House Interiors
  • [CVPR2017] 3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions [github]

NeRF

  • [Arxiv] SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes
  • [Arxiv] Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering [https://city-super.github.io/scaffold-gs/]
  • [NeurIPS2023] PyNeRF: Pyramidal Neural Radiance Fields
  • [Arxiv] K-Planes: Explicit Radiance Fields in Space, Time, and Appearance
  • [Arxiv] PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics
  • [ICCV2023] Seal-3D: Interactive Pixel-Level Editing for Neural Radiance Fields [github]
  • [Arxiv] Zip-NeRF: Anti-Aliased Grid-Based Neural Radiance Fields [Project]
  • [CVPR2023] Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container [Project]
  • [Arxiv] CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout
  • [Arxiv] LERF: Language Embedded Radiance Fields [Project]
  • [CVPR2023] Nerflets: Local Radiance Fields for Efficient Structure-Aware 3D Scene Representation from 2D Supervision
  • [CVPR2023] HelixSurf: A Robust and Efficient Neural Implicit Surface Learning of Indoor Scenes with Iterative Intertwined Regularization [github]
  • [Arxiv] BakedSDF: Meshing Neural SDFs for Real-Time View Synthesis [Project]
  • [Arxiv] NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion [Project]
  • [Arxiv] HR-NeuS: Recovering High-Frequency Surface Geometry via Neural Implicit Surfaces
  • [Arxiv] 3D-aware Blending with Generative NeRFs [Project]
  • [Arxiv] Factor Fields: A Unified Framework for Neural Fields and Beyond
  • [Arxiv] Removing Objects From Neural Radiance Fields
  • [Arxiv] Interactive Segmentation of Radiance Fields [Project]
  • [Arxiv] Robust Dynamic Radiance Fields [Project]
  • [Arxiv] NeRF-Art: Text-Driven Neural Radiance Fields Stylization [Project]
  • [Arxiv] 4K-NeRF: High Fidelity Neural Radiance Fields at Ultra High Resolutions [Project]
  • [Arxiv] EditableNeRF: Editing Topologically Varying Neural Radiance Fields by Key Points
  • [Arxiv] SSDNeRF: Semantic Soft Decomposition of Neural Radiance Fields [Project]
  • [Arxiv] NeRFEditor: Differentiable Style Decomposition for Full 3D Scene Editing [Project]
  • [Arxiv] Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields [Project]
  • [WACV2023] ScanNeRF: a Scalable Benchmark for Neural Radiance Fields [Project]
  • [Arxiv] LaTeRF: Label and Text Driven Object Radiance Fields
  • [Arxiv] Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields
  • [CVPR2022] RigNeRF: Fully Controllable Neural 3D Portraits [Project]
  • [Arxiv] Panoptic Neural Fields: A Semantic Object-Aware Neural Scene Representation
  • [Arxiv] D2NeRF: Self-Supervised Decoupling of Dynamic and Static Objects from a Monocular Video [Project]
  • [Arxiv] Artemis: Articulated Neural Pets with Appearance and Motion synthesis [Project]
  • [Arxiv] KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints [Project]
  • [Arxiv] Control-NeRF: Editable Feature Volumes for Scene Rendering and Manipulation
  • [Arxiv] PVSeRF: Joint Pixel-, Voxel- and Surface-Aligned Radiance Field for Single-Image Novel View Synthesis
  • [Arxiv] Block-NeRF: Scalable Large Scene Neural View Synthesis [Project]
  • [Arxiv] Pix2NeRF: Unsupervised Conditional π-GAN for Single Image to Neural Radiance Fields Translation
  • [Arxiv] NeSF: Neural Semantic Fields for Generalizable Semantic Segmentation of 3D Scenes [Project]
  • [Arxiv] HumanNeRF: Free-viewpoint Rendering of Moving People from Monocular Video [github]
  • [Arxiv] NeROIC: Neural Rendering of Objects from Online Image Collections [Project]
  • [Arxiv] DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering
  • [Arxiv] InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering [Project]

Before 2022

  • [Arxiv] Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs [Project]
  • [Arxiv] Light Field Neural Rendering [Project]
  • [Arxiv] CG-NeRF: Conditional Generative Neural Radiance Fields
  • [Arxiv] Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields [Project]
  • [Arxiv] MoFaNeRF: Morphable Facial Neural Radiance Field
  • [Arxiv] Dense Depth Priors for Neural Radiance Fields from Sparse Input Views
  • [Arxiv] NeRF-SR: High-Quality Neural Radiance Fields using Super-Sampling [Project]
  • [Arxiv] RegNeRF: Regularizing Neural Radiance Fields for View Synthesis from Sparse Inputs [Project]
  • [Arxiv] NeRFReN: Neural Radiance Fields with Reflections [Project]
  • [Arxiv] NeuSample: Neural Sample Field for Efficient View Synthesis [Project]
  • [Arxiv] Urban Radiance Fields [Project]
  • [Arxiv] GeoNeRF: Generalizing NeRF with Geometry Priors [Project]
  • [Arxiv] NeRF in the Dark: High Dynamic Range View Synthesis from Noisy Raw Images [Project]
  • [Arxiv] VaxNeRF: Revisiting the Classic for Voxel-Accelerated Neural Radiance Field [github]
  • [Arxiv] Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction [github]
  • [Arxiv] LOLNeRF: Learn from One Look
  • [Arxiv] Instant Neural Graphics Primitives with a Multiresolution Hash Encoding [Project]
  • [NeurIPS2021] Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose [github]
  • [Arxiv] PERF: Performant, Explicit Radiance Fields
  • [Arxiv] Plenoxels: Radiance Fields without Neural Networks [Project]
  • [NeurIPS2021] Neural Human Performer: Learning Generalizable Radiance Fields for Human Performance Rendering [Project]
  • [ICCV2021] CodeNeRF: Disentangled Neural Radiance Fields for Object Categories [github]
  • [ICCV2021] Learning Object-Compositional Neural Radiance Field for Editable Scene Rendering [Project]
  • [ICCV2021] Differentiable Surface Rendering via Non-Differentiable Sampling
  • [ICCV2021] Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis [Project]
  • [Arxiv] Fast and Explicit Neural View Synthesis
  • [Arxiv] Depth-supervised NeRF: Fewer Views and Faster Training for Free [Project] [pytorch]
  • [Arxiv] A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields [Project]
  • [Arxiv] NeRF in detail: Learning to sample for view synthesis
  • [Arxiv] NeRFactor: Neural Factorization of Shape and Reflectance Under an Unknown Illumination [Project]
  • [Arxiv] Neural Trajectory Fields for Dynamic Novel View Synthesis
  • [Arxiv] Editing Conditional Radiance Fields [Project]
  • [CVPR2021] Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes
  • [Arxiv] GNeRF: GAN-based Neural Radiance Field without Posed Camera
  • [Arxiv] BARF: Bundle-Adjusting Neural Radiance Fields [Project]
  • [Arxiv] MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo
  • [CVPR2021] Neural Lumigraph Rendering [Project]
  • [Arxiv] Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields
  • [Arxiv] KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs
  • [Arxiv] FastNeRF: High-Fidelity Neural Rendering at 200FPS
  • [CVPR2021] NeX: Real-time View Synthesis with Neural Basis Expansion [Project]
  • [Arxiv] DONeRF: Towards Real-Time Rendering of Neural Radiance Fields using Depth Oracle Networks [Project]
  • [Arxiv] NeRF--: Neural Radiance Fields Without Known Camera Parameters [Project]

Before 2021

  • [Arxiv] pixelNeRF: Neural Radiance Fields from One or Few Images [Project]
  • [Arxiv] NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis [Project]
  • [Arxiv] Neural Radiance Flow for 4D View Synthesis and Video Processing [Project]
  • [Arxiv] Deformable Neural Radiance Fields [Project]
  • [Arxiv] DeRF: Decomposed Radiance Fields
  • [Arxiv] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

About Human Body

  • [Arxiv] Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling [Project]
  • [Arxiv] D3GA - Drivable 3D Gaussian Avatars [Project]
  • [Arxiv] NPC: Neural Point Characters from Video [Project]
  • [Arxiv] Normal-guided Garment UV Prediction for Human Re-texturing
  • [Arxiv] Sketch2Cloth: Sketch-based 3D Garment Generation with Unsigned Distance Fields
  • [Arxiv] PointAvatar: Deformable Point-based Head Avatars from Videos [Project]
  • [Arxiv] PhoMoH: Implicit Photorealistic 3D Models of Human Heads
  • [Arxiv] 3DHumanGAN: Towards Photo-Realistic 3D-Aware Human Image Generation
  • [Arxiv] Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion [Project]
  • [Arxiv] Generating Holistic 3D Human Motion from Speech [Project]
  • [Arxiv] MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis [Project]
  • [Arxiv] RANA: Relightable Articulated Neural Avatars [Project]
  • [Arxiv] Magic: Multi Art Genre Intelligent Choreography Dataset and Network for 3D Dance Generation
  • [Arxiv] One-shot Implicit Animatable Avatars with Model-based Priors [Project]
  • [Arxiv] PhysDiff: Physics-Guided Human Motion Diffusion Model [Project]
  • [Arxiv] Instant Volumetric Head Avatars [Project]
  • [Arxiv] EVA3D: Compositional 3D Human Generation from 2D Image Collections [Project]
  • [ECCV2022] Compositional Human-Scene Interaction Synthesis with Semantic Control [Project]
  • [ECCV2022] Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis [Project]
  • [CVPR2022] Photorealistic Monocular 3D Reconstruction of Humans Wearing Clothing [Project]
  • [ECCV2022] DiffuStereo: High Quality Human Reconstruction via Diffusion-based Stereo Using Sparse Cameras [Project]
  • [CVPR2022] Capturing and Inferring Dense Full-Body Human-Scene Contact [Project]
  • [Arxiv] Realistic One-shot Mesh-based Head Avatars [Project]
  • [CVPR2022] SmartPortraits: Depth Powered Handheld Smartphone Dataset of Human Portraits for State Estimation, Reconstruction and Synthesis
  • [Arxiv] DAD-3DHeads: A Large-scale Dense, Accurate and Diverse Dataset for 3D Head Alignment from a Single Image [Project]
  • [CVPR2022] Structured Local Radiance Fields for Human Avatar Modeling
  • [CVPR2022] ImFace: A Nonlinear 3D Morphable Face Model with Implicit Neural Representations
  • [Arxiv] AutoAvatar: Autoregressive Neural Fields for Dynamic Avatar Modeling [Project]

Before 2022

  • [Arxiv] The Wanderings of Odysseus in 3D Scenes [Project]
  • [Arxiv] Putting People in their Place: Monocular Regression of 3D People in Depth [github]
  • [Arxiv] Tracking People by Predicting 3D Appearance, Location & Pose [Project]
  • [Arxiv] Adversarial Parametric Pose Prior
  • [NeurIPS2021] Garment4D: Garment Reconstruction from Point Cloud Sequences [Project]
  • [Arxiv] MobRecon: Mobile-Friendly Hand Mesh Reconstruction from Monocular Image [github]
  • [Arxiv] Total Scale: Face-to-Body Detail Reconstruction from Sparse RGBD Sensors
  • [Arxiv] GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras [Project]
  • [3DV2021] LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies [Project]
  • [Arxiv] A Lightweight Graph Transformer Network for Human Mesh Reconstruction from 2D Human Pose
  • [Arxiv] MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation [github]
  • [Arxiv] Multi-Person 3D Motion Prediction with Multi-Range Transformers [Project]
  • [Arxiv] DD-NeRF: Double-Diffusion Neural Radiance Field as a Generalizable Implicit Body Representation
  • [Arxiv] Creating and Reenacting Controllable 3D Humans with Differentiable Rendering
  • [Arxiv] Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation
  • [BMVC2021] AniFormer: Data-driven 3D Animation with Transformer [Project]
  • [ACMMM2021] VoteHMR: Occlusion-Aware Voting Network for Robust 3D Human Mesh Recovery from Partial Point Clouds
  • [Arxiv] Playing for 3D Human Recovery [Project]
  • [ICCV2021] Learning to Regress Bodies from Images using Differentiable Semantic Rendering [Project]
  • [Arxiv] ICON: Implicit Clothed humans Obtained from Normals [github]
  • [ICCV2021] Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild [Project]
  • [Arxiv] SPEC: Seeing People in the Wild with an Estimated Camera [Project]
  • [NeurIPS2021] Tracking People with 3D Representations [github]
  • [Arxiv] A Skeleton-Driven Neural Occupancy Representation for Articulated Hands
  • [Arxiv] GraFormer: Graph Convolution Transformer for 3D Pose Estimation [github]
  • [ICCV2021] Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images
  • [ICCV2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation [github]
  • [ICCV2021] 3D Human Texture Estimation from a Single Image with Transformers
  • [ICCV2021] DensePose 3D: Lifting Canonical Surface Maps of Articulated Objects to the Third Dimension
  • [Arxiv] SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes [Project]
  • [ICCV2021] Probabilistic Modeling for Human Mesh Recovery [Project]
  • [ICCV2021] Unsupervised Dense Deformation Embedding Network for Template-Free Shape Correspondence
  • [ACMMM2021] DC-GNet: Deep Mesh Relation Capturing Graph Convolution Network for 3D Human Shape Reconstruction
  • [SiggraphAsia2019] Neural State Machine for Character-Scene Interactions [github]
  • [ICCV2021] Learning Motion Priors for 4D Human Body Capture in 3D Scenes [Project]
  • [Arxiv] Deep Virtual Markers for Articulated 3D Shapes
  • [ICCV2021] Gravity-Aware Monocular 3D Human-Object Reconstruction [Project]
  • [ICCV2021] Learning Anchored Unsigned Distance Functions with Gradient Direction Alignment for Single-view Garment Reconstruction
  • [Arxiv] D3D-HOI: Dynamic 3D Human-Object Interactions from Videos [github]
  • [ICCV2021] Stochastic Scene-Aware Motion Prediction [Project] [github]
  • [ICCV2021] ARCH++: Animation-Ready Clothed Human Reconstruction Revisited
  • [ICCV2021] EventHPE: Event-based 3D Human Pose and Shape Estimation
  • [ACMMM2021] Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition [github]
  • [ACMMM2021] Skeleton-Contrastive 3D Action Representation Learning [github]
  • [Arxiv] Learning Local Recurrent Models for Human Mesh Recovery
  • [Arxiv] H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction [Project]
  • [Arxiv] Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds [github]
  • [Arxiv] MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images [Project]
  • [Arxiv] Deep3DPose: Realtime Reconstruction of Arbitrarily Posed Human Bodies from Single RGB Images
  • [Arxiv] THUNDR: Transformer-based 3D HUmaN Reconstruction with Markers
  • [Arxiv] Bridge the Gap Between Model-based and Model-free Human Reconstruction
  • [Arxiv] Neural Actor: Neural Free-view Synthesis of Human Actors with Pose Control
  • [Arxiv] Scene-aware Generative Network for Human Motion Synthesis
  • [Arxiv] Human Motion Prediction Using Manifold-Aware Wasserstein GAN
  • [CVPR2021] Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors [Project]
  • [Arxiv] TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [Project]
  • [CVPR2021] We are More than Our Joints: Predicting how 3D Bodies Move [Project]
  • [CVPR2021] LEAP: Learning Articulated Occupancy of People [Project]
  • [Arxiv] 3DCrowdNet: 2D Human Pose-Guided 3D Crowd Human Pose and Shape Estimation in the Wild
  • [CVPR2021] SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements [Project]
  • [Arxiv] Action-Conditioned 3D Human Motion Synthesis with Transformer VAE [Project]
  • [Arxiv] Dynamic Surface Function Networks for Clothed Human Bodies [github]
  • [Arxiv] Neural Articulated Radiance Field [github]
  • [Arxiv] Mesh Graphormer
  • [CVPR2021] SimPoE: Simulated Character Control for 3D Human Pose Estimation [Project]
  • [Arxiv] TRAJEVAE - Controllable Human Motion Generation from Trajectories [Project]
  • [CVPR2021] Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors [Project]
  • [CVPR2021] Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction [Project]
  • [CVPR2021] Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction [github]
  • [Arxiv] Probabilistic 3D Human Shape and Pose Estimation from Multiple Unconstrained Images in the Wild
  • [Arxiv] 3D Human Pose Estimation with Spatial and Temporal Transformers [pytorch]
  • [CVPR2021] Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks
  • [Arxiv] DanceNet3D: Music Based Dance Generation with Parametric Motion Transformer
  • [Arxiv] Aggregated Multi-GANs for Controlled 3D Human Motion Prediction [Project]
  • [AAAI] PC-HMR: Pose Calibration for 3D Human Mesh Recovery from 2D Images/Videos
  • [Arxiv] NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering using RGB Cameras
  • [CVPR2021] SMPLicit: Topology-aware Generative Model for Clothed People [Project]
  • [CVPR2021] HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation [pytorch]
  • [Arxiv] Single-Shot Motion Completion with Transformer [Project]
  • [EG2021] Walk2Map: Extracting Floor Plans from Indoor Walk Trajectories
  • [Arxiv] Forecasting Characteristic 3D Poses of Human Actions
  • [Arxiv] Capturing Detailed Deformations of Moving Human Bodies
  • [Arxiv] A-NeRF: Surface-free Human 3D Pose Refinement via Neural Rendering [Project]
  • [Arxiv] Learn to Dance with AIST++: Music Conditioned 3D Dance Generation [Project]
  • [Arxiv] S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling
  • [Arxiv] PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation
  • [Arxiv] Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans [Project]
  • [Arxiv] Chasing the Tail in Monocular 3D Human Reconstruction with Prototype Memory
  • [3DV2020] PLACE: Proximity Learning of Articulation and Contact in 3D Environments [Project]
  • [ICCV2019] Resolving 3D Human Pose Ambiguities with 3D Scene Constraints [Project]

Before 2021

  • [ICCV2021] Monocular, One-stage, Regression of Multiple 3D People [github]
  • [ECCV2020] History Repeats Itself: Human Motion Prediction via Motion Attention [pytorch]
  • [ECCV2020] 3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning [Project]
  • [Arxiv] Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes [Project]
  • [Arxiv] End-to-End Human Pose and Mesh Reconstruction with Transformers
  • [Arxiv] Human Mesh Recovery from Multiple Shots [Project]
  • [NeurIPS2020] 3D Multi-bodies: Fitting Sets of Plausible 3D Human Models to Ambiguous Image Data [Project]
  • [Arxiv] Holistic 3D Human and Scene Mesh Estimation from Single View Images
  • [Arxiv] Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video
  • [Arxiv] Pose2Pose: 3D Positional Pose-Guided 3D Rotational Pose Prediction for Expressive 3D Human Pose and Mesh Estimation
  • [Arxiv] NeuralAnnot: Neural Annotator for in-the-wild Expressive 3D Human Pose and Mesh Training Sets
  • [Arxiv] 4D Human Body Capture from Egocentric Video via 3D Scene Grounding [Project]
  • [Arxiv] Populating 3D Scenes by Learning Human-Scene Interaction [Project]
  • [ECCV2020] Long-term Human Motion Prediction with Scene Context [Project]
  • [Arxiv] Vid2Actor: Free-viewpoint Animatable Person Synthesis from Video in the Wild [Project]
  • [Arxiv] ANR: Articulated Neural Rendering for Virtual Avatars
  • [Arxiv] Generating 3D People in Scenes without People [Project]
  • [ICCV2019] Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense
  • [CVPR2019] Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments [Project]
  • [TOG2016] PiGraphs: learning interaction snapshots from observations [Project]

General Methods

  • [CVPR2023] Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers [github]
  • [Arxiv] HexPlane: A Fast Representation for Dynamic Scenes [Project]
  • [Arxiv] Joint Representation Learning for Text and 3D Point Cloud
  • [Arxiv] Ponder: Point Cloud Pre-training via Neural Rendering
  • [Arxiv] 3D Point Cloud Pre-training with Knowledge Distillation from 2D Images
  • [Arxiv] Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning? [Project]
  • [Arxiv] Attentive Mask CLIP
  • [Arxiv] Synthetic-to-Real Domain Generalized Semantic Segmentation for 3D Indoor Point Clouds
  • [Arxiv] Frozen CLIP Model is Efficient Point Cloud Backbone
  • [Arxiv] Continuous diffusion for categorical data
  • [Arxiv] EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
  • [Arxiv] Neural Density-Distance Fields [Project]
  • [Arxiv] Understanding Masked Image Modeling via Learning Occlusion Invariant Feature
  • [Arxiv] Jigsaw-ViT: Learning Jigsaw Puzzles in Vision Transformer [Project]
  • [Arxiv] Masked Surfel Prediction for Self-Supervised Point Cloud Learning [github]
  • [Arxiv] Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training [github]
  • [Arxiv] 3D-Aware Video Generation [Project]
  • [Arxiv] Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space [Project]
  • [Arxiv] Masked Frequency Modeling for Self-Supervised Visual Pre-Training [Project]
  • [Arxiv] GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds [Project]
  • [Arxiv] Diffusion Models for Video Prediction and Infilling [Project]
  • [Arxiv] MaskViT: Masked Visual Pre-Training for Video Prediction [Project]
  • [Arxiv] Random Walks for Adversarial Meshes
  • [ICLR2022] Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework [github]
  • [CVPR2022] Rethinking Semantic Segmentation: A Prototype View [github]
  • [Arxiv] How to Understand Masked Autoencoders
  • [ICLR2022] QuadTree Attention for Vision Transformers [github]
  • [Arxiv] Contrastive Neighborhood Alignment

Before 2022

  • [Arxiv] Domain Adaptation on Point Clouds via Geometry-Aware Implicits
  • [ICCV2021] Progressive Seed Generation Auto-encoder for Unsupervised Point Cloud Learning
  • [Arxiv] Variance-Aware Weight Initialization for Point Convolutional Neural Networks
  • [Arxiv] Learning to Detect Every Thing in an Open World [Project]
  • [Arxiv] Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling [Project]
  • [Arxiv] CpT: Convolutional Point Transformer for 3D Point Cloud Processing
  • [Arxiv] Swin Transformer V2: Scaling Up Capacity and Resolution [github]
  • [Arxiv] TransMix: Attend to Mix for Vision Transformers [github]
  • [Arxiv] Self-supervised GAN Detector [github]
  • [NeurIPS2021] Residual Relaxation for Multi-view Representation Learning
  • [ICCV2021] Video Autoencoder: self-supervised disentanglement of static 3D structure and motion [Project]
  • [NeurIPS2021] SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization [Project]
  • [Arxiv] Efficient Geometry-aware 3D Generative Adversarial Networks [Project]
  • [Arxiv] Self-attention Does Not Need $O(n^2)$ Memory
  • [Arxiv] CAP-Net: Correspondence-Aware Point-view Fusion Network for 3D Shape Analysis
  • [Arxiv] PointMixer: MLP-Mixer for Point Cloud Understanding
  • [NeurIPS2021] Blending Anti-Aliasing into Vision Transformer
  • [ICCV2021] Learning Inner-Group Relations on Point Clouds
  • [Arxiv] Point-Voxel Transformer: An Efficient Approach To 3D Deep Learning
  • [Siggraph2021] SP-GAN: Sphere-Guided 3D Shape Generation and Manipulation [Project] [github]
  • [ICCV2021] GraphFPN: Graph Feature Pyramid Network for Object Detection
  • [Arxiv] CKConv: Learning Feature Voxelization for Point Cloud Analysis
  • [ICCV2021] Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers [pytorch]
  • [Arxiv] Volume Rendering of Neural Implicit Surfaces
  • [CVPR2021] Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations
  • [Arxiv] DeepMesh: Differentiable Iso-Surface Extraction
  • [Arxiv] Neural Marching Cubes
  • [Arxiv] Geometry-Consistent Neural Shape Representation with Implicit Displacement Fields
  • [Arxiv] Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering
  • [ICML2021] Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline [pytorch]
  • [Arxiv] Deep Medial Fields
  • [Arxiv] Subdivision-Based Mesh Convolution Networks [Jittor]
  • [Arxiv] VA-GCN: A Vector Attention Graph Convolution Network for learning on Point Clouds [pytorch]
  • [Arxiv] Aggregating Nested Transformers
  • [Arxiv] Rethinking the Design Principles of Robust Vision Transformer [pytorch]
  • [Siggraph2021] Acorn: Adaptive Coordinate Networks for Neural Scene Representation
  • [Arxiv] Walk in the Cloud: Learning Curves for Point Clouds Shape Analysis [Project]
  • [Arxiv] Pay Attention to MLPs
  • [Arxiv] ResMLP: Feedforward networks for image classification with data-efficient training
  • [Arxiv] RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition
  • [Arxiv] MLP-Mixer: An all-MLP Architecture for Vision
  • [Arxiv] Vector Neurons: A General Framework for SO(3)-Equivariant Networks
  • [CVPR2021] MongeNet: Efficient Sampler for Geometric Deep Learning [Project]
  • [Arxiv] Point Cloud Learning with Transformer
  • [Arxiv] Dual Transformer for Point Cloud Analysis
  • [Arxiv] AttWalk: Attentive Cross-Walks for Deep Mesh Analysis
  • [Arxiv] Learning from 2D: Pixel-to-Point Knowledge Transfer for 3D Pretraining
  • [Arxiv] Field Convolutions for Surface CNNs
  • [Arxiv] Rethinking Spatial Dimensions of Vision Transformers [pytorch] 🔥
  • [CVPR2021] PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds [pytorch]
  • [Arxiv] Concentric Spherical GNN for 3D Representation Learning
  • [Arxiv] High-Performance Large-Scale Image Recognition Without Normalization
  • [Arxiv] Generative Models as Distributions of Functions
  • [Arxiv] Point-set Distances for Learning Representations of 3D Point Clouds
  • [Arxiv] Compressed Object Detection
  • [Arxiv] A linearized framework and a new benchmark for model selection for fine-tuning
  • [Arxiv] The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions
  • [Arxiv] Self-Supervised Pretraining of 3D Features on any Point-Cloud [pytorch]
  • [3DV2020] Learning Rotation-Invariant Representations of Point Clouds Using Aligned Edge Convolutional Neural Networks

Before 2021

  • [ICCV2019] Efficient Learning on Point Clouds with Basis Point Sets [pytorch]
  • [CVPR2019] On the Continuity of Rotation Representations in Neural Networks [pytorch]
  • [Arxiv] Diffusion is All You Need for Learning on Surfaces
  • [Arxiv] SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization
  • [3DV2020] Rotation-Invariant Point Convolution With Multiple Equivariant Alignments
  • [Arxiv] One Point is All You Need: Directional Attention Point for Feature Learning
  • [Arxiv] PCT: Point Cloud Transformer
  • [Arxiv] Hausdorff Point Convolution with Geometric Priors
  • [Arxiv] MARNet: Multi-Abstraction Refinement Network for 3D Point Cloud Analysis [Github]
  • [Arxiv] Point Transformer
  • [Arxiv] Learning geometry-image representation for 3D point cloud generation
  • [Arxiv] Deeper or Wider Networks of Point Clouds with Self-attention?
  • [NeurIPS2020] Primal-Dual Mesh Convolutional Neural Networks [pytorch]
  • [NeurIPS2020] Rational neural networks [tensorflow]
  • [NeurIPS2020] Exchangeable Neural ODE for Set Modeling [Project]
  • [NeurIPS2020] SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks [Project]
  • [NeurIPS2020] NVAE: A Deep Hierarchical Variational Autoencoder [pytorch]
  • [NeurIPS2020] Implicit Graph Neural Networks [pytorch]
  • [NeurIPS2020] The Autoencoding Variational Autoencoder [pytorch]
  • [Arxiv] PointManifold: Using Manifold Learning for Point Cloud Classification
  • [Arxiv] RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder
  • [Arxiv] Pre-Training by Completing Point Clouds [pytorch]
  • [NeurIPS2020] Rotation-Invariant Local-to-Global Representation Learning for 3D Point Cloud
  • [Arxiv] IF-Defense: 3D Adversarial Point Cloud Defense via Implicit Function based Restoration [pytorch]
  • [Arxiv] DV-ConvNet: Fully Convolutional Deep Learning on Point Clouds with Dynamic Voxelization and 3D Group Convolution
  • [Arxiv] Spatial Transformer Point Convolution
  • [Arxiv] Minimal Adversarial Examples for Deep Learning on 3D Point Clouds
  • [BMVC2020] Black Magic in Deep Learning: How Human Skill Impacts Network Training
  • [ECCV2020] PointMixup: Augmentation for Point Clouds [Code]
  • [ECCV2020] DR-KFS: A Differentiable Visual Similarity Metric for 3D Shape Reconstruction
  • [Arxiv] Unsupervised 3D Learning for Shape Analysis via Multiresolution Instance Discrimination
  • [Arxiv] Global Context Aware Convolutions for 3D Point Cloud Understanding
  • [ECCV2020] Shape Adaptor: A Learnable Resizing Module [pytorch]
  • [ACMMM2020] Differentiable Manifold Reconstruction for Point Cloud Denoising [pytorch]
  • [ECCV2020] Discrete Point Flow Networks for Efficient Point Cloud Generation
  • [Siggraph2020] Neural Subdivision
  • [Arxiv] PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
  • [Arxiv] Accelerating 3D Deep Learning with PyTorch3D
  • [Arxiv] Natural Graph Networks
  • [ECCV2020] Progressive Point Cloud Deconvolution Generation Network [github]
  • [Arxiv] Point Set Voting for Partial Point Cloud Analysis
  • [Arxiv] PointMask: Towards Interpretable and Bias-Resilient Point Cloud Processing
  • [Arxiv] Fully Convolutional Mesh Autoencoder using Efficient Spatially Varying Kernels
  • [Arxiv] A Closer Look at Local Aggregation Operators in Point Cloud Analysis [github]
  • [NeurIPS2020] Implicit Neural Representations with Periodic Activation Functions [pytorch] 🔥
  • [Arxiv] Rethinking Sampling in 3D Point Cloud Generative Adversarial Networks
  • [Arxiv] Local-Area-Learning Network: Meaningful Local Areas for Efficient Point Cloud Analysis
  • [Arxiv] TearingNet: Point Cloud Autoencoder to Learn Topology-Friendly Representations
  • [Arxiv] MeshWalker: Deep Mesh Understanding by Random Walks
  • [Arxiv] MOPS-Net: A Matrix Optimization-driven Network for Task-Oriented 3D Point Cloud Downsampling
  • [Arxiv] DPDist : Comparing Point Clouds Using Deep Point Cloud Distance
  • [CVPR2020] PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling
  • [AAAI2020] Shape-Oriented Convolution Neural Network for Point Cloud Analysis
  • [Arxiv] Joint Supervised and Self-Supervised Learning for 3D Real-World Challenges
  • [Arxiv] LightConvPoint: Convolution for Points [pytorch]
  • [Arxiv] Variational Auto-Decoder [pytorch]
  • [Arxiv] Generative PointNet: Energy-Based Learning on Unordered Point Sets for 3D Generation, Reconstruction and Classification
  • [CVPR2020] DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes [pytorch]
  • [CVPR2020] RPM-Net: Robust Point Matching using Learned Features [github]
  • [CVPR2020] Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
  • [CVPR2020] PointGMM: a Neural GMM Network for Point Clouds
  • [Arxiv] Dynamic ReLU
  • [CVPR2020] SampleNet: Differentiable Point Cloud Sampling [pytorch]
  • [Arxiv] Defense-PointNet: Protecting PointNet Against Adversarial Attacks
  • [CVPR2020] FPConv: Learning Local Flattening for Point Convolution [pytorch]
  • [SIGGRAPH2019] MeshCNN: A Network with an Edge [pytorch] 🔥⭐
  • [ICCV2019] Total Denoising: Unsupervised Learning of 3D Point Cloud Cleaning [tensorflow]
  • [ICCV2019] PU-GAN: a Point Cloud Upsampling Adversarial Network 🔥
  • [CVPR2019] Relation-Shape Convolutional Neural Network for Point Cloud Analysis [pytorch] 🔥
  • [CVPR2019] Patch-based Progressive 3D Point Set Upsampling [tensorflow] [pytorch] 🔥
  • [TOG2019] Dynamic Graph CNN for Learning on Point Clouds [Project] 🔥 ⭐
  • [ECCV2018] EC-Net: an Edge-aware Point set Consolidation Network [project page]
  • [CVPR2018] PU-Net: Point Cloud Upsampling Network ⭐🔥
  • [Arxiv] PointAugment: an Auto-Augmentation Framework for Point Cloud Classification
  • [ICLR2017] Deep Learning with Sets and Point Clouds
  • [NeurIPS2017] Deep Sets
  • [Siggraph2006] Designing with Distance Fields

Others (inc. Networks in Classification, Matching, Registration, Alignment, Depth, Normal, Pose, Keypoints, etc.)

  • [Arxiv] ConceptLab: Creative Generation using Diffusion Prior Constraints [Project]
  • [Arxiv] Fast Complementary Dynamics via Skinning Eigenmodes [Project]
  • [Arxiv] Visual Instruction Inversion: Image Editing via Visual Prompting [Project]
  • [Arxiv] Objaverse-XL: A Universe of 10M+ 3D Objects
  • [Arxiv] Temporally Consistent Online Depth Estimation Using Point-Based Fusion [Project]
  • [CVPR2023] Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [Project]
  • [Arxiv] Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models [github]
  • [Arxiv] Pix2Video: Video Editing using Image Diffusion [Project]
  • [Arxiv] Cross-domain Compositing with Pretrained Diffusion Models [Project]
  • [Arxiv] 3D-aware Conditional Image Synthesis [Project]
  • [CVPR2022] Focal Length and Object Pose Estimation via Render and Compare [github]
  • [CVPR2022] Kubric: A scalable dataset generator

Before 2022

  • [Arxiv] Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation
  • [Arxiv] Toward Practical Self-Supervised Monocular Indoor Depth Estimation
  • [Arxiv] PartImageNet: A Large, High-Quality Dataset of Parts [github]
  • [Arxiv] AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-shot Interactions
  • [Arxiv] Benchmarking Detection Transfer Learning with Vision Transformers
  • [Arxiv] Panoptic Segmentation: A Review [github]
  • [NeurIPS2021] Sparse Steerable Convolutions: An Efficient Learning of SE(3)-Equivariant Features for Estimation and Tracking of Object Poses in 3D Space [github]
  • [Arxiv] Attention Mechanisms in Computer Vision: A Survey
  • [Arxiv] Leveraging Geometry for Shape Estimation from a Single RGB Image [github]
  • [Arxiv] Deep Point Set Resampling via Gradient Fields [github]
  • [Arxiv] Efficient 3D Deep LiDAR Odometry [github]
  • [NeurIPS2021] 3DP3: 3D Scene Perception via Probabilistic Programming
  • [NeurIPS2021] CoFiNet: Reliable Coarse-to-fine Correspondences for Robust Point Cloud Registration [github]
  • [BMVC2021] Cascading Feature Extraction for Fast Point Cloud Registration
  • [Arxiv] Pseudo Supervised Monocular Depth Estimation with Teacher-Student Network
  • [BMVC2021] Multi-Stream Attention Learning for Monocular Vehicle Velocity and Inter-Vehicle Distance Estimation
  • [Arxiv] Occlusion-Robust Object Pose Estimation with Holistic Representation [github]
  • [BMVC2021] Depth-only Object Tracking
  • [3DV2021] Self-Supervised Monocular Scene Decomposition and Depth Estimation
  • [Arxiv] Deep Point Cloud Normal Estimation via Triplet Learning
  • [3DV2021] Attention meets Geometry: Geometry Guided Spatial-Temporal Attention for Consistent Self-Supervised Monocular Depth Estimation
  • [CORL2021] LENS: Localization enhanced by NeRF synthesis
  • [3DV2021] PLNet: Plane and Line Priors for Unsupervised Indoor Depth Estimation [github]
  • [Arxiv] Unsupervised Pose-Aware Part Decomposition for 3D Articulated Objects
  • [ICCV2021] PCAM: Product of Cross-Attention Matrices for Rigid Registration of Point Clouds [Project]
  • [ICCV2021] Excavating the Potential Capacity of Self-Supervised Monocular Depth Estimation
  • [ICCV2021] StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose Estimation
  • [IROS2021] KDFNet: Learning Keypoint Distance Field for 6D Object Pose Estimation
  • [ICCV2021] Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation [github]
  • [Arxiv] Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object Pose Estimation [Project]
  • [ICCV2021] Deep Hough Voting for Robust Global Registration
  • [Arxiv] You Only Hypothesize Once: Point Cloud Registration with Rotation-equivariant Descriptors [Project]
  • [ICCV2021] A Robust Loss for Point Cloud Registration
  • [Arxiv] Geometry-Aware Self-Training for Unsupervised Domain Adaptation on Object Point Clouds
  • [IROS2021] Category-Level 6D Object Pose Estimation via Cascaded Relation and Recurrent Reconstruction Networks [Project] [github]
  • [ICCV2021] StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimation [github]
  • [ICCV2021] SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation
  • [ICCV2021] Self-supervised Monocular Depth Estimation for All Day Images using Domain Separation
  • [ICCV2021] AdaFit: Rethinking Learning-based Normal Estimation on Point Clouds [Project]
  • [Arxiv] DnD: Dense Depth Estimation in Crowded Dynamic Indoor Scenes
  • [ICCV2021] Towards Interpretable Deep Networks for Monocular Depth Estimation [github]
  • [Arxiv] UPDesc: Unsupervised Point Descriptor Learning for Robust Registration
  • [IROS2021] BundleTrack: 6D Pose Tracking for Novel Objects without Instance or Category-Level 3D Models [github]
  • [Arxiv] RigNet: Repetitive Image Guided Network for Depth Completion
  • [Arxiv] DCL: Differential Contrastive Learning for Geometry-Aware Depth Synthesis
  • [ACMMM2021] BridgeNet: A Joint Learning Network of Depth Map Super-Resolution and Monocular Depth Estimation [Project] [github]
  • [Arxiv] Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation
  • [ICCV2021] HRegNet: A Hierarchical Network for Large-scale Outdoor LiDAR Point Cloud Registration [Project] [pytorch]
  • [Arxiv] Score-Based Point Cloud Denoising
  • [Arxiv] HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor
  • [Arxiv] Learn to Learn Metric Space for Few-Shot Segmentation of 3D Shapes
  • [Arxiv] EdgeConv with Attention Module for Monocular Depth Estimation
  • [ICML2021] Implicit-PDF: Non-Parametric Representation of Probability Distributions on the Rotation Manifold [Project]
  • [ICRA2021] An Adaptive Framework For Learning Unsupervised Depth Completion [github] [github]
  • [ICRA2021] TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction [github]
  • [Siggraph2021] Orienting Point Clouds with Dipole Propagation
  • [CVPR2021] The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth
  • [Arxiv] Fully Convolutional Line Parsing [pytorch]
  • [CVPR2021] Depth Completion using Plane-Residual Representation
  • [Arxiv] Domain Adaptive Monocular Depth Estimation With Semantic Information
  • [CVPR2021] Depth Completion with Twin Surface Extrapolation at Occlusion Boundaries [github]
  • [Arxiv] Local Metrics for Multi-Object Tracking
  • [Arxiv] Full Surround Monodepth from Multiple Cameras
  • [CVPR2021] RGB-D Local Implicit Function for Depth Completion of Transparent Objects [Project]
  • [CVPR2021] Learning Camera Localization via Dense Scene Matching [pytorch]
  • [Arxiv] LSG-CPD: Coherent Point Drift with Local Surface Geometry for Point Cloud Registration
  • [ICRA2021] PlaneSegNet: Fast and Robust Plane Estimation Using a Single-stage Instance Segmentation CNN
  • [Arxiv] Learning Fine-Grained Segmentation of 3D Shapes without Part Labels
  • [CVPR2021] Skeleton Merger: an Unsupervised Aligned Keypoint Detector
  • [CVPR2021] Beyond Image to Depth: Improving Depth Prediction using Echoes
  • [CVPR2021] FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [Project]
  • [CVPR2021] Self-supervised Geometric Perception
  • [Arxiv] StablePose: Learning 6D Object Poses from Geometrically Stable Patches
  • [Arxiv] A Parameterised Quantum Circuit Approach to Point Set Matching
  • [Arxiv] Adjoint Rigid Transform Network: Self-supervised Alignment of 3D Shapes
  • [Arxiv] Video Transformer Network
  • [ICLR2021] NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [pytorch]
  • [Arxiv] NBDT: Neural-Backed Decision Tree [pytorch]
  • [Arxiv] AdaBins: Depth Estimation using Adaptive Bins [pytorch]
  • [Arxiv] Unsupervised Monocular Depth Reconstruction of Non-Rigid Scenes
  • [Arxiv] CorrNet3D: Unsupervised End-to-end Learning of Dense Correspondence for 3D Point Clouds

Before 2021

  • [NeurIPS2019] PRNet: Self-Supervised Learning for Partial-to-Partial Registration [pytorch]
  • [Arxiv] iNeRF: Inverting Neural Radiance Fields for Pose Estimation [Project]
  • [Arxiv] Boosting Monocular Depth Estimation with Lightweight 3D Point Fusion
  • [Arxiv] 3D Registration for Self-Occluded Objects in Context
  • [Arxiv] Continuous Surface Embeddings
  • [Arxiv] SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration
  • [Arxiv] MVTN: Multi-View Transformation Network for 3D Shape Recognition
  • [Arxiv] PREDATOR: Registration of 3D Point Clouds with Low Overlap
  • [Arxiv] Deep Magnification-Arbitrary Upsampling over 3D Point Clouds
  • [Arxiv] Occlusion Guided Scene Flow Estimation on 3D Point Clouds
  • [NeurIPS2020] An Analysis of SVD for Deep Rotation Estimation
  • [EG2020W] SHREC 2020 track: 6D object pose estimation
  • [ACCV2020] Best Buddies Registration for Point Clouds
  • [3DV] A New Distributional Ranking Loss With Uncertainty: Illustrated in Relative Depth Estimation
  • [BMVC2020] View-consistent 4D Light Field Depth Estimation
  • [BMVC2020] Neighbourhood-Insensitive Point Cloud Normal Estimation Network [Project]
  • [ECCV2020] DeepGMR: Learning Latent Gaussian Mixture Models for Registration [Project]
  • [ECCV2020] Motion Capture from Internet Videos [Project]
  • [ECCV2020] Depth Completion with RGB Prior
  • [ECCV2020] 6D Camera Relocalization in Ambiguous Scenes via Continuous Multimodal Inference
  • [Arxiv] Self-Supervised Learning of Point Clouds via Orientation Estimation
  • [SIGGRAPH2020] SymmetryNet: Learning to Predict Reflectional and Rotational Symmetries of 3D Shapes from Single-View RGB-D Images [Project]
  • [ECCV2020] Learning Stereo from Single Images [github]
  • [Arxiv] Learning Long-term Visual Dynamics with Region Proposal Interaction Networks [Project]
  • [ECCV2020] Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes [Project]
  • [ECCV2020] Unsupervised Shape and Pose Disentanglement for 3D Meshes
  • [Arxiv] PVSNet: Pixelwise Visibility-Aware Multi-View Stereo Network
  • [ECCV2020] P2Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation
  • [CVPR2020] Learning multiview 3D point cloud registration [pytorch]
  • [CVPR2020] Feature-metric Registration: A Fast Semi-supervised Approach for Robust Point Cloud Registration without Correspondences
  • [Siggraph2020] Consistent Video Depth Estimation
  • [Arxiv] Deep Feature-preserving Normal Estimation for Point Cloud Filtering
  • [Arxiv] Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction
  • [CVPR2020] Towards Better Generalization: Joint Depth-Pose Learning without PoseNet [pytorch]
  • [Arxiv] Monocular Camera Localization in Prior LiDAR Maps with 2D-3D Line Correspondences
  • [Arxiv] Adversarial Texture Optimization from RGB-D Scans
  • [Arxiv] SAPIEN: A SimulAted Part-based Interactive ENvironment
  • [CVPR2020] G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features
  • [Arxiv] On Localizing a Camera from a Single Image
  • [Arxiv] DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares
  • [CVPR2020] KFNet: Learning Temporal Camera Relocalization using Kalman Filtering
  • [Arxiv] Neural Contours: Learning to Draw Lines from 3D Shapes
  • [Arxiv] 3dDepthNet: Point Cloud Guided Depth Completion Network for Sparse Depth and Single Color Image
  • [Arxiv] Unsupervised Learning of Category-Specific Symmetric 3D Keypoints from Point Sets
  • [CVPR2020] End-to-End Learning Local Multi-view Descriptors for 3D Point Clouds
  • [Arxiv] PnP-Net: A hybrid Perspective-n-Point Network
  • [CVPR2020] MobilePose: Real-Time Pose Estimation for Unseen Objects with Weak Shape Supervision
  • [CVPR2020] D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
  • [ICIP2020] Triangle-Net: Towards Robustness in Point Cloud Classification
  • [ICRA2020] Robust 6D Object Pose Estimation by Learning RGB-D Features
  • [Arxiv] Predicting Sharp and Accurate Occlusion Boundaries in Monocular Depth Estimation Using Displacement Fields
  • [Arxiv] Single Image Depth Estimation Trained via Depth from Defocus Cues [pytorch]
  • [Arxiv] DepthTransfer: Depth Extraction from Video Using Non-parametric Sampling
  • [Arxiv] Target-less registration of point clouds: A review
  • [Arxiv] Quaternion Equivariant Capsule Networks for 3D point clouds
  • [Arxiv] Category-Level Articulated Object Pose Estimation
  • [Arxiv] A Quantum Computational Approach to Correspondence Problems on Point Sets
  • [Arxiv] DeepSFM: Structure From Motion Via Deep Bundle Adjustment
  • [Arxiv] P2GNet: Pose-Guided Point Cloud Generating Networks for 6-DoF Object Pose Estimation
  • [ICCV2019] Learning Local RGB-to-CAD Correspondences for Object Pose Estimation
  • [ICCV2019] Joint Embedding of 3D Scan and CAD Objects [dataset]
  • [ICLR2019] BA-Net: Dense Bundle Adjustment Networks [tensorflow]
  • [ICCV2019] GP2C: Geometric Projection Parameter Consensus for Joint 3D Pose and Focal Length Estimation in the Wild
  • [ICCV2019] Closed-Form Optimal Two-View Triangulation Based on Angular Errors
  • [ICCV2019] Polarimetric Relative Pose Estimation
  • [ICCV2019] End-to-End CAD Model Retrieval and 9DoF Alignment in 3D Scans
  • [ICCV2019] Deep Non-Rigid Structure from Motion
  • [CVPR2019] On the Continuity of Rotation Representations in Neural Networks [pytorch]
  • [Arxiv] Deep Interpretable Non-Rigid Structure from Motion [tensorflow]
  • [Arxiv] IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks [dataset]
  • [CVPR2019] Scan2CAD: Learning CAD Model Alignment in RGB-D Scans [pytorch] 🔥
  • [3DV2019] Location Field Descriptors: Single Image 3D Model Retrieval in the Wild
  • [CVPR2016] Marr Revisited: 2D-3D Alignment via Surface Normal Prediction [caffe]

Survey, Resources and Tools

  • [Dataset] Aria Synthetic Environments Dataset
  • [Dataset] Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception [Project]
  • [Dataset] CAD-Estate: Large-scale CAD Model Annotation in RGB Videos [github]
  • [Arxiv] Teaching CLIP to Count to Ten
  • [Arxiv] ControlNet
  • [Arxiv] T2I-Adapter
  • [Arxiv] OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation [Project]
  • [Arxiv] SDFStudio: A Unified Framework for Surface Reconstruction [Project]
  • [Arxiv] Objaverse: A Universe of Annotated 3D Objects [Project]
  • [Arxiv] Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild [Project]
  • [NeurIPS2021] ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data [github]
  • [Dataset] ReplicaCAD [Project]
  • [PhDthesis] Synthesizing Photorealistic Images with Deep Generative Learning
  • [ICCVW2021] V2X-Sim: A Virtual Collaborative Perception Dataset for Autonomous Driving [Project]
  • [Arxiv] TransCG: A Large-Scale Real-World Dataset for Transparent Object Depth Completion and Grasping [Project]
  • [Arxiv] A Survey of Neural Trojan Attacks and Defenses in Deep Learning
  • [Arxiv] Tiny Object Tracking: A Large-scale Dataset and A Baseline [github]
  • [Arxiv] A survey of top-down approaches for human pose estimation
  • [Arxiv] A Survey on RGB-D Datasets
  • [Arxiv] Avoiding Overfitting: A Survey on Regularization Methods for Convolutional Neural Networks

Before 2022

  • [Arxiv] iSeg3D: An Interactive 3D Shape Segmentation Tool
  • [Arxiv] Benchmarking Pedestrian Odometry: The Brown Pedestrian Odometry Dataset (BPOD) [Project]
  • [Arxiv] PandaSet: Advanced Sensor Suite Dataset for Autonomous Driving [Project]
  • [Arxiv] Few-Shot Object Detection: A Survey
  • [Arxiv] Paris-CARLA-3D: A Real and Synthetic Outdoor Point Cloud Dataset for Challenging Tasks in 3D Mapping [Project]
  • [Arxiv] PyTorchVideo: A Deep Learning Library for Video Understanding [Project]
  • [Arxiv] DIML/CVL RGB-D Dataset: 2M RGB-D Images of Natural Indoor and Outdoor Scenes [Project]
  • [Arxiv] A Review on Human Pose Estimation
  • [ICCV2021] BuildingNet: Learning to Label 3D Buildings [Project]
  • [ICCV2021] Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans [Project]
  • [Arxiv] Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI
  • [Arxiv] MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis [Project]
  • [Arxiv] UrbanScene3D: A Large Scale Urban Scene Dataset and Simulator [Project]
  • [Arxiv] SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous Driving [Project]
  • [Arxiv] A Survey on Human-aware Robot Navigation
  • [Arxiv] One Million Scenes for Autonomous Driving: ONCE Dataset [Project]
  • [Arxiv] 3D Object Detection for Autonomous Driving: A Survey
  • [Arxiv] The Oxford Road Boundaries Dataset
  • [CVPR2021] 3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding
  • [Arxiv] 3DB: A Framework for Debugging Computer Vision Models [github]
  • [Arxiv] NViSII: A Scriptable Tool for Photorealistic Image Generation [github]
  • [Dataset] Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling
  • [Survey] 3D Semantic Scene Completion: a Survey
  • [Survey] Deep Learning based 3D Segmentation: A Survey
  • [Survey] A comprehensive survey on point cloud registration
  • [Survey] Domain Generalization: A Survey
  • [Dataset] SUM: A Benchmark Dataset of Semantic Urban Meshes
  • [Survey] Attention Models for Point Clouds in Deep Learning: A Survey
  • [Benchmark] H3D: Benchmark on Semantic Segmentation of High-Resolution 3D Point Clouds and textured Meshes from UAV LiDAR and Multi-View-Stereo [Project]
  • [Survey] Dynamic Neural Networks: A Survey
  • [Survey] Online Continual Learning in Image Classification: An Empirical Survey
  • [Survey] Deep Learning for Visual Tracking: A Comprehensive Survey
  • [Survey] Occlusion Handling in Generic Object Detection: A Review
  • [Survey] Curriculum Learning: A Survey
  • [Github] Awesome Neural Radiance Fields
  • [Survey] Neural Volume Rendering: NeRF And Beyond
  • [Survey] Transformers in Vision: A Survey
  • [Survey] Efficient Transformers: A Survey
  • [Survey] Semantics for Robotic Mapping, Perception and Interaction: A Survey
  • [Survey] Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy

Before 2021

  • [Dataset] The Replica Dataset: A Digital Replica of Indoor Spaces [github]
  • [IROS2021] iGibson 1.0: a Simulation Environment for Interactive Tasks in Large Realistic Scenes [Project]
  • [Dataset] Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations [Github]
  • [Survey] Skeleton-based Approaches based on Machine Vision: A Survey
  • [Survey] Deep Learning-Based Human Pose Estimation: A Survey [Github]
  • [Dataset] Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding [Github]
  • [Survey] A Review and Comparative Study on Probabilistic Object Detection in Autonomous Driving [Github]
  • [Dataset] RELLIS-3D Dataset: Data, Benchmarks and Analysis [Github]
  • [Arxiv] Motion Prediction on Self-driving Cars: A Review
  • [Github] TESSE: Unity-based simulator to enable research in perception, mapping, learning, and robotics
  • [Survey] A Survey on Visual Transformer
  • [Survey] A Survey on Contrastive Self-supervised Learning
  • [Survey] A Survey of Surface Reconstruction from Point Clouds
  • [Dataset] Torch-Points3D: A Modular Multi-Task Framework for Reproducible Deep Learning on 3D Point Clouds [Project]
  • [Thesis] Learning to Reconstruct and Segment 3D Objects
  • [Survey] An Overview Of 3D Object Detection
  • [Survey] A Brief Review of Domain Adaptation
  • [Dataset] Announcing the Objectron Dataset
  • [Tutorial] Video Action Understanding: A Tutorial
  • [Arxiv] Fusion 360 Gallery: A Dataset and Environment for Programmatic CAD Reconstruction [Page]
  • [Survey] Multi-Task Learning with Deep Neural Networks: A Survey
  • [Survey] Deep Learning for 3D Point Cloud Understanding: A Survey
  • [Thesis] COMPUTATIONAL ANALYSIS OF DEFORMABLE MANIFOLDS: FROM GEOMETRIC MODELING TO DEEP LEARNING
  • [Arxiv] F*: An Interpretable Transformation of the F-measure
  • [Dataset] Gibson Database of 3D Spaces
  • [BMVC2020] Black Magic in Deep Learning: How Human Skill Impacts Network Training
  • [Arxiv] PyTorch Metric Learning
  • [Arxiv] RGB-D Salient Object Detection: A Survey [Project]
  • [Arxiv] AiRound and CV-BrCT: Novel Multi-View Datasets for Scene Classification [Project]
  • [CVPR2020] OASIS: A Large-Scale Dataset for Single Image 3D in the Wild [Project]
  • [Arxiv] 3D-FUTURE: 3D FUrniture shape with TextURE
  • [Arxiv] 3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics [Project][Link]
  • [Arxiv] Differentiable Rendering: A Survey
  • [Arxiv] Visual Relationship Detection using Scene Graphs: A Survey
  • [Arxiv] Polarization Human Shape and Pose Dataset
  • [Arxiv] IDDA: a large-scale multi-domain dataset for autonomous driving [Project page]
  • [CVPR2020] RoboTHOR: An Open Simulation-to-Real Embodied AI Platform [Project page]
  • [EG2020] State of the Art on Neural Rendering
  • [IJCAI-PRICAI2020] 3D-FUTURE: 3D FUrniture shape with TextURE
  • [Arxiv] Toronto-3D: A Large-scale Mobile LiDAR Dataset for Semantic Segmentation of Urban Roadways
  • [Arxiv] KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations
  • [Arxiv] A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications
  • [Arxiv] From Seeing to Moving: A Survey on Learning for Visual Indoor Navigation (VIN)
  • [Arxiv] DIODE: A Dense Indoor and Outdoor DEpth Dataset [dataset]
  • [Github] Various GANs with Pytorch.
  • [Arxiv] SemanticPOSS: A Point Cloud Dataset with Large Quantity of Dynamic Instances [dataset]
  • [CVM] A Survey on Deep Geometry Learning: From a Representation Perspective
  • [Arxiv] A survey on Semi-, Self- and Unsupervised Techniques in Image Classification
  • [Arxiv] fastai: A Layered API for Deep Learning
  • [Arxiv] AU-AIR: A Multi-modal Unmanned Aerial Vehicle Dataset for Low Altitude Traffic Surveillance [dataset]
  • [Arxiv] VIRTUAL KITTI 2 [dataset]
  • [Arxiv] Tutorial on Variational Autoencoders
  • [Arxiv] Review: deep learning on 3D point clouds
  • [Arxiv] Image Segmentation Using Deep Learning: A Survey
  • [CVPR2018] Pixels, Voxels, and Views: A Study of Shape Representations for Single View 3D Object Shape Prediction
  • [Arxiv] Evolution of Image Segmentation using Deep Convolutional Neural Network: A Survey
  • [Arxiv] MCMLSD: A Probabilistic Algorithm and Evaluation Framework for Line Segment Detection
  • [Arxiv] Deep Learning for 3D Point Clouds: A Survey
  • [Arxiv] A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images
  • [Arxiv] A Survey on Deep Learning Architectures for Image-based Depth Reconstruction
  • [Arxiv] secml: A Python Library for Secure and Explainable Machine Learning
  • [Arxiv] Bundle Adjustment Revisited
  • [ICCV2019] Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement
  • [Arxiv] SIFT Meets CNN: A Decade Survey of Instance Retrieval
  • [ICCV2019] Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data [tensorflow]
  • [Arxiv] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks [dataset]
  • [Arxiv] Imbalance Problems in Object Detection: A Review [repository]
  • [IJCV] Deep Learning for Generic Object Detection: A Survey
  • [Arxiv] Differentiable Visual Computing (Ph.D. thesis)
  • [BMVC2018] InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset [dataset]
  • [ICCV2017] The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes [dataset] [script] ⭐
  • [Arxiv] SynthCity: A large scale synthetic point cloud [dataset]
  • [Github] Mesh Voxelization (SDFs or Occupancy grids)
  • [Github] SDFGen (to generate grid-based signed distance field (level set))
  • [Github] Blender renderer for python
  • [Github] Volumetric TSDF Fusion of RGB-D Images in Python
  • [Github] Volumetric TSDF Fusion of Multiple Depth Maps
  • [Github] PyFusion
  • [Github] PyRender
  • [Github] PyMCubes
  • [Github] Watertight and Simplified Meshes through TSDF Fusion (Python tool for obtaining watertight meshes using TSDF fusion.)
  • [Github] Several tools for working with signed distance functions (SDFs).
  • [Github] 3DMatch Toolbox
  • [stackoverflow] Computing truncated signed distance function (TSDF) from a point cloud
  • [Github] voxblox: A library for flexible voxel-based mapping, mainly focusing on truncated and Euclidean signed distance fields.
  • [Github] Discregrid: A static C++ library for the generation of discrete functions on a box-shaped domain. This is especially suited for the generation of signed distance fields.
  • [Github] awesome-voxel: Voxel resources for coders
  • [Github] gvdb-voxels: Sparse volume compute and rendering on NVIDIA GPUs
  • [Github] pyntcloud is a Python library for working with 3D point clouds.
  • [Github] Open3D: A Modern Library for 3D Data Processing
  • [Github] mesh_to_sdf: Calculate signed distance fields for arbitrary meshes
  • [Github] Detecting & Penalizing Mesh Intersections
  • [CVPR2021] Picasso: A CUDA-based Library for Deep Learning over 3D Meshes [Github]
  • [Github] NVIDIA DALI: A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications
  • [Arxiv] Shuffler: A Large Scale Data Management Tool for Machine Learning in Computer Vision
  • [Arxiv] PyGAD: An Intuitive Genetic Algorithm Python Library [Github]
  • [ICRA2014] A Benchmark for RGB-D Visual Odometry, 3D Reconstruction and SLAM [Project]
  • [CVPR2016] SceneNet: Understanding Real World Indoor Scenes With Synthetic Data [Project]
