daoyuan98's starred repositories

stable-diffusion

A latent text-to-image diffusion model

Language: Jupyter Notebook · License: NOASSERTION · Stargazers: 66496

roop

one-click face swap

Language: Python · License: GPL-3.0 · Stargazers: 25612

sd-webui-roop

roop extension for StableDiffusion web-ui

Language: Python · License: AGPL-3.0 · Stargazers: 3297

awesome-RLHF

A curated list of reinforcement learning with human feedback resources (continually updated)

InternLM-XComposer

InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.

Language: Python · License: Apache-2.0 · Stargazers: 1682

viper

Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"

Language: Jupyter Notebook · License: NOASSERTION · Stargazers: 1633

tomesd

Speed up Stable Diffusion with this one simple trick!

Language: Python · License: MIT · Stargazers: 1231

LLMs_interview_notes

This repository mainly collects interview questions for large language model (LLM) algorithm engineers.

LLM-in-Vision

Recent LLM-based CV and related works. Welcome to comment/contribute!

groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Language: Python · License: Apache-2.0 · Stargazers: 534

SEED

Official implementation of SEED-LLaMA (ICLR 2024).

Language: Python · License: NOASSERTION · Stargazers: 516

CLIP-SAM

An experiment combining CLIP with SAM for open-vocabulary image segmentation.

Language: Jupyter Notebook · Stargazers: 317

diffusion-rig

Code Release for DiffusionRig (CVPR 2023)

Language: Python · License: NOASSERTION · Stargazers: 251

LRV-Instruction

[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

Language: Python · License: BSD-3-Clause · Stargazers: 231

RLHF-V

[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback

ViP-LLaVA

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Language: Python · License: Apache-2.0 · Stargazers: 148

ViT-Lens

[CVPR 2024] ViT-Lens: Towards Omni-modal Representations

Language: Python · License: NOASSERTION · Stargazers: 138

pvic

Official PyTorch implementation for ICCV2023 paper "Exploring Predicate Visual Context in Detecting Human-Object Interactions"

Language: Python · License: BSD-3-Clause · Stargazers: 57

Efficient-LLM-Survey

The Efficiency Spectrum of LLMs

MMVP-motion-matrix-based-video-prediction

This is the official repo of MMVP: motion-matrix-based video prediction (ICCV 2023)

Language: Python · License: MIT · Stargazers: 32

Skeleton-in-Context

[CVPR2024] Official implementation of the paper: Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning

Language: Python · Stargazers: 21

Symbol-LLM

Code for NeurIPS2023 Paper "Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning"

Language: Python · Stargazers: 18

PELA

PELA: Learning Parameter-Efficient Models with Low-Rank Approximation [CVPR 2024]

Language: Python · License: Apache-2.0 · Stargazers: 9

CaesarNeRF

This repo is for CaesarNeRF: Calibrated Semantic Representation for Few-Shot Generalizable Neural Rendering.

Language: Python · Stargazers: 7