WangYongli's starred repositories

CogVLM2

GPT4V-level open-source multi-modal model based on Llama3-8B

Language: Python | License: Apache-2.0 | Stargazers: 1795 | Issues: 0

Depth-Anything-V2

Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Language: Python | License: Apache-2.0 | Stargazers: 3027 | Issues: 0

HybridDepth

Official implementation for HybridDepth Model

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 47 | Issues: 0

Language: Python | License: BSD-2-Clause | Stargazers: 7 | Issues: 0

depth-fm

DepthFM: Fast Monocular Depth Estimation with Flow Matching

Language: Jupyter Notebook | License: MIT | Stargazers: 345 | Issues: 0

easy_handeye

Automated, hardware-independent Hand-Eye Calibration

Language: Python | License: NOASSERTION | Stargazers: 821 | Issues: 0

apriltags-cpp

C++ port of the AprilTags library, using OpenCV (and optionally, CGAL)

Language: C++ | Stargazers: 111 | Issues: 0

GenerateU

[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection

Language: Python | Stargazers: 122 | Issues: 0

SwissArmyTransformer

SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.

Language: Python | License: Apache-2.0 | Stargazers: 928 | Issues: 0

groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Language: Python | Stargazers: 725 | Issues: 0

PatchFusion

[CVPR 2024] An End-to-End Tile-Based Framework for High-Resolution Monocular Metric Depth Estimation

Language: Python | License: MIT | Stargazers: 934 | Issues: 0

EfficientSAM

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2038 | Issues: 0

yolov9

Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Language: Python | License: GPL-3.0 | Stargazers: 8738 | Issues: 0

Language: Python | Stargazers: 19 | Issues: 0

Metric3D

The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."

Language: Python | License: BSD-2-Clause | Stargazers: 1203 | Issues: 0

GasMono

Code for GasMono, accepted by ICCV 2023

Language: Python | Stargazers: 37 | Issues: 0

sc_depth_pl

SC-Depth (V1, V2, and V3) for Unsupervised Monocular Depth Estimation. Webpage: https://jiawangbian.github.io/sc_depth_pl/

Language: Python | License: GPL-3.0 | Stargazers: 415 | Issues: 0

lang-segment-anything

SAM with text prompt

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1482 | Issues: 0

EVA

EVA Series: Visual Representation Fantasies from BAAI

Language: Python | License: MIT | Stargazers: 2183 | Issues: 0

CogVLM

a state-of-the-art-level open visual language model | multimodal pretrained model

Language: Python | License: Apache-2.0 | Stargazers: 5790 | Issues: 0

APE

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Language: Python | License: Apache-2.0 | Stargazers: 469 | Issues: 0

Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Language: Python | License: Apache-2.0 | Stargazers: 6657 | Issues: 0

Lite-Mono

[CVPR2023] Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation

Language: Python | License: MIT | Stargazers: 517 | Issues: 0

Language: Python | License: NOASSERTION | Stargazers: 544 | Issues: 0

MiDaS

Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"

Language: Python | License: MIT | Stargazers: 4341 | Issues: 0

Language: Python | License: MIT | Stargazers: 205 | Issues: 0

OMG-Seg

OMG-LLaVA and OMG-Seg codebase

Language: Python | License: NOASSERTION | Stargazers: 1190 | Issues: 0

Emu

Emu Series: Generative Multimodal Models from BAAI

Language: Python | License: Apache-2.0 | Stargazers: 1592 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 25183 | Issues: 0

GLEE

[CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Language: Python | License: MIT | Stargazers: 1006 | Issues: 0