matrixgame2018

MingJian.L's starred repositories

yolov9

Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Language: Python · GPL-3.0 · 8845 56 513

GroundingDINO

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Language: Python · Apache-2.0 · 6286 40 296

AppAgent

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

Language: Python · MIT · 4864 64 82

lagent

A lightweight framework for building LLM-based agents

Language: Python · Apache-2.0 · 1749 17 62

OMG-Seg

OMG-LLaVA and OMG-Seg codebase

Language: Python · NOASSERTION · 1226 23 44

minisora

MiniSora: A community that aims to explore the implementation path and future development direction of Sora.

Language: Python · Apache-2.0 · 1163 18 63

Awesome-LLM-3D

Awesome-LLM-3D: a curated list of resources on multi-modal large language models in the 3D world

MIT · 981 37 4

SAM-Med2D

Official implementation of SAM-Med2D

Language: Jupyter Notebook · Apache-2.0 · 834 13 66

DriveLM

[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering

Language: HTML · Apache-2.0 · 798 19 80

LanguageBind

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Language: Python · MIT · 683 15 57

Vary-toy

Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)

Language: Python · 587 13 35

DriveAGI

[CVPR 2024 Highlight] GenAD: Generalized Predictive Model for Autonomous Driving & Foundation Models in Autonomous System

Language: Python · Apache-2.0 · 537 27 7

GPT4RoI

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

Language: Python · NOASSERTION · 496 8 47

Awesome-Text-to-3D

A growing curation of Text-to-3D, Diffusion-to-3D works.

Language: TeX · 462 25 4

CaFo

[CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners

Language: Python · MIT · 343 12 12

allenact

An open source framework for research in Embodied-AI from AI2.

Language: Python · NOASSERTION · 314 10 91

RGBX_Semantic_Segmentation

Language: Python · MIT · 296 8 47

Drive-WM

[CVPR 2024] A world model for autonomous driving.

Language: Python · Apache-2.0 · 282 22 5

DriveDreamer

[ECCV 2024] DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving

282 25 5

InternEvo

InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencies.

Language: Python · Apache-2.0 · 277 8 78

MedFM

Official Repository of NeurIPS 2023 - MedFM Challenge

Language: Python · 258 2 7

MMIF-DDFM

[ICCV 2023 Oral] Official implementation for "DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion."

Language: Python · 185 3 28

SegMiF

ICCV2023 | Multi-interactive Feature Learning and a Full-time Multi-modality Benchmark for Image Fusion and Segmentation

Language: Python · 102 3 22

OmniScient-Model

This repo contains the code for our paper "Towards Open-Ended Visual Recognition with Large Language Model"

Language: Jupyter Notebook · Apache-2.0 · 88 10 4

distill-bev

DistillBEV: Boosting Multi-Camera 3D Object Detection with Cross-Modal Knowledge Distillation (ICCV 2023)

Language: Python · 82 5 13

endosurf

Language: Python · MIT · 47 1 9

Lifelong-MonoDepth

Official PyTorch implementation of "Lifelong-MonoDepth: Lifelong Learning for Multi-Domain Monocular Metric Depth Estimation"

Language: Python · MIT · 1000

Active_room_segmentation

Code for Human cognition-inspired active room segmentation

Language: Python · MIT · 8 1 1

songzhuoran.github.io

My homepage.

Language: HTML · MIT · 200

MedFCMEA

NeurIPS 2023 - Challenge / NeurIPS 2024 Dataset Track

Language: Python · 1 30