Hanzhi Chen (HanzhiC)

Company: Technical University of Munich

Location: Munich

Home Page: hanzhic.github.io

Hanzhi Chen's starred repositories

segment-anything-2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 9971 · Issues: 61 · Issues: 194

fiftyone

The open-source tool for building high-quality datasets and computer vision models

Language: Python · License: Apache-2.0 · Stargazers: 8018 · Issues: 55 · Issues: 1496

mPLUG-Owl

mPLUG-Owl: The Powerful Multi-modal Large Language Model Family

Language: Python · License: MIT · Stargazers: 2186 · Issues: 30 · Issues: 217

RT-DETR

[CVPR 2024] Official RT-DETR (RT-DETR Paddle/PyTorch), Real-Time DEtection TRansformer: DETRs Beat YOLOs on Real-time Object Detection.

Language: Python · License: Apache-2.0 · Stargazers: 2127 · Issues: 25 · Issues: 397

glomap

GLOMAP - Global Structure-from-Motion Revisited

Language: C++ · License: BSD-3-Clause · Stargazers: 1174 · Issues: 22 · Issues: 40

MultiDiffusion

Official PyTorch implementation of "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" (ICML 2023)

Language: Jupyter Notebook · Stargazers: 967 · Issues: 36 · Issues: 25

mast3r

Grounding Image Matching in 3D with MASt3R

Language: Python · License: NOASSERTION · Stargazers: 682 · Issues: 21 · Issues: 32

dobb-e

Dobb·E: An open-source, general framework for learning household robotic manipulation

Language: G-code · License: MIT · Stargazers: 558 · Issues: 15 · Issues: 7

MOFA-Video

[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.

Language: Python · License: NOASSERTION · Stargazers: 555 · Issues: 24 · Issues: 45

TeleVision

Open-TeleVision: Teleoperation with Immersive Active Visual Feedback

Language: Python · License: NOASSERTION · Stargazers: 537 · Issues: 7 · Issues: 20

sjc

Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation (CVPR 2023)

Language: Python · License: NOASSERTION · Stargazers: 502 · Issues: 20 · Issues: 29

Awesome-Robotics-3D

A curated list of 3D vision papers related to the robotics domain in the era of large models (LLMs/VLMs), inspired by awesome-computer-vision; includes papers, code, and related websites

vlmaps

[ICRA 2023] Implementation of Visual Language Maps for Robot Navigation

Language: Python · License: MIT · Stargazers: 338 · Issues: 11 · Issues: 53

3D-VLA

[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model

transfusion-pytorch

PyTorch implementation of Transfusion ("Predict the Next Token and Diffuse Images with One Multi-Modal Model") from Meta AI

License: MIT · Stargazers: 176 · Issues: 0 · Issues: 0

SceneVerse

Official implementation of the ECCV 2024 paper "SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding"

Language: Python · License: MIT · Stargazers: 158 · Issues: 11 · Issues: 21

HOV-SG

[RSS 2024] Official implementation of "Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot Navigation"

Language: Python · License: MIT · Stargazers: 138 · Issues: 2 · Issues: 15

FiT3D

[ECCV 2024] Improving 2D Feature Representations by 3D-Aware Fine-Tuning

Language: Jupyter Notebook · License: MIT · Stargazers: 101 · Issues: 0 · Issues: 0

GAPartNet

[CVPR 2023 Highlight] GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts.

Language: Jupyter Notebook · Stargazers: 92 · Issues: 4 · Issues: 15

RLAfford

RLAfford: End-to-End Affordance Learning for Robotic Manipulation, ICRA 2023

DragAPart

[ECCV 2024] Official Implementation of DragAPart: Learning a Part-Level Motion Prior for Articulated Objects.

Track-2-Act

Code for the paper "Predicting Point Tracks from Internet Videos Enables Diverse Zero-Shot Manipulation"

Language: Python · License: NOASSERTION · Stargazers: 50 · Issues: 1 · Issues: 3

omninocs

A large-scale NOCS dataset.

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 45 · Issues: 8 · Issues: 0

osam

Get up and running with SAM, EfficientSAM, YOLO-World, and other promptable vision models locally.

Language: Python · License: MIT · Stargazers: 41 · Issues: 4 · Issues: 0

RAM_code

Official implementation of RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation

Language: Python · License: NOASSERTION · Stargazers: 23 · Issues: 2 · Issues: 2

BEVInstructor

[ECCV 2024] Navigation Instruction Generation with BEV Perception and Large Language Models

Stargazers: 21 · Issues: 0 · Issues: 0

neural-isometries

Official JAX implementation of Neural Isometries: taming transformations for equivariant ML

Language: Python · License: NOASSERTION · Stargazers: 20 · Issues: 0 · Issues: 0

yolo-world-onnx

ONNX models of YOLO-World (an open-vocabulary object detector).

Language: Python · License: GPL-3.0 · Stargazers: 12 · Issues: 2 · Issues: 1