gaopengpjlab's starred repositories

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 45,034 · Issues: 298 · Issues: 648

LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Language: Python · License: GPL-3.0 · Stargazers: 5,585 · Issues: 78 · Issues: 141

LLaMA2-Accessory

An Open-source Toolkit for LLM Development

Language: Python · License: NOASSERTION · Stargazers: 2,589 · Issues: 36 · Issues: 131

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python · License: MIT · Stargazers: 1,526 · Issues: 27 · Issues: 46

OmniQuant

[ICLR 2024 Spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Language: Python · License: MIT · Stargazers: 606 · Issues: 17 · Issues: 68

ConvMAE

ConvMAE: Masked Convolution Meets Masked Autoencoders

Language: Python · License: MIT · Stargazers: 466 · Issues: 11 · Issues: 36

Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

CaFo

[CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners

Language: Python · License: MIT · Stargazers: 329 · Issues: 12 · Issues: 12

MonoDETR

[ICCV 2023] The first DETR model for monocular 3D object detection with depth-guided transformer

PointCLIP

[CVPR 2022] PointCLIP: Point Cloud Understanding by CLIP

Stable-Pix2Seq

A full-fledged version of Pix2Seq

Language: Python · License: Apache-2.0 · Stargazers: 234 · Issues: 7 · Issues: 19

PointCLIP_V2

[ICCV 2023] PointCLIP V2: Prompting CLIP and GPT for Powerful 3D Open-world Learning

Language: Python · License: MIT · Stargazers: 207 · Issues: 10 · Issues: 27

I2P-MAE

[CVPR 2023] Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders

Point-M2AE

[NeurIPS 2022] Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training

Language: Python · License: MIT · Stargazers: 193 · Issues: 11 · Issues: 14

llama-mps

Experimental fork of Facebook's LLaMA model that runs with GPU acceleration on Apple Silicon (M1/M2)

Language: Python · License: GPL-3.0 · Stargazers: 84 · Issues: 3 · Issues: 0

Q-ViT

[NeurIPS 2022] The official implementation of Q-ViT.

maskalign

[CVPR 2023] Official repository for paper "Stare at What You See: Masked Image Modeling without Reconstruction"

Language: Python · License: Apache-2.0 · Stargazers: 62 · Issues: 5 · Issues: 3

MMT-Bench

[ICML 2024] MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

ProCA

[ECCV 2022] Prototypical Contrast Adaptation for Domain Adaptive Semantic Segmentation

Language: Python · License: MIT · Stargazers: 51 · Issues: 2 · Issues: 8

FeatAug-DETR

Official repository of paper: "FeatAug-DETR: Enriching One-to-Many Matching for DETRs with Feature Augmentation"

svl_adapter

SVL-Adapter: Self-Supervised Adapter for Vision-Language Pretrained Models

MonoDETR-MV

The multi-view version of MonoDETR on the nuScenes dataset

Language: Python · License: Apache-2.0 · Stargazers: 15 · Issues: 1 · Issues: 0

DMJD

PyTorch implementation of "Disjoint Masking with Joint Distillation for Efficient Masked Image Modeling"

Language: Python · License: MIT · Stargazers: 10 · Issues: 2 · Issues: 1