bobzhang123's starred repositories

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python, License: Apache-2.0, Stargazers: 14978, Issues: 103, Issues: 956

Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 14140, Issues: 116, Issues: 373

mamba

Mamba SSM architecture

Language: Python, License: Apache-2.0, Stargazers: 11597, Issues: 98, Issues: 402

accelerate

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Language: Python, License: Apache-2.0, Stargazers: 7367, Issues: 99, Issues: 1471

Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Language: Python, License: Apache-2.0, Stargazers: 6402, Issues: 49, Issues: 196

Awesome-Pruning

A curated list of neural network pruning resources.

occupancy_networks

This repository contains the code for the paper "Occupancy Networks - Learning 3D Reconstruction in Function Space"

Language: Python, License: MIT, Stargazers: 1479, Issues: 33, Issues: 127

mmrazor

OpenMMLab Model Compression Toolbox and Benchmark.

Language: Python, License: Apache-2.0, Stargazers: 1412, Issues: 20, Issues: 267

VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Language: Python, License: NOASSERTION, Stargazers: 1263, Issues: 16, Issues: 118

Semi-supervised-learning

A Unified Semi-Supervised Learning Codebase (NeurIPS'22)

Language: Python, License: MIT, Stargazers: 1263, Issues: 21, Issues: 155

InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Language: Python, License: Apache-2.0, Stargazers: 1114, Issues: 29, Issues: 127

SimMIM

This is an official implementation for "SimMIM: A Simple Framework for Masked Image Modeling".

Language: Python, License: MIT, Stargazers: 890, Issues: 22, Issues: 41

PETR

[ECCV2022] PETR: Position Embedding Transformation for Multi-View 3D Object Detection & [ICCV2023] PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

Language: Python, License: NOASSERTION, Stargazers: 823, Issues: 15, Issues: 156

Awesome-Foundation-Models

A curated list of foundation models for vision and language tasks

License: MIT, Stargazers: 675, Issues: 34, Issues: 0

ibot

iBOT 🤖: Image BERT Pre-Training with Online Tokenizer (ICLR 2022)

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 641, Issues: 6, Issues: 35

UniDet

Object detection on multiple datasets with an automatically learned unified label space.

SoraReview

The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".

DriveAGI

[Incl. GenAD, CVPR 2024 Highlight] Embracing Foundation Models into Autonomous Agent and System

Language: Python, License: Apache-2.0, Stargazers: 466, Issues: 24, Issues: 6

VideoMAEv2

[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

Language: Python, License: MIT, Stargazers: 449, Issues: 6, Issues: 51

esvit

EsViT: Efficient self-supervised Vision Transformers

Language: Python, License: MIT, Stargazers: 402, Issues: 12, Issues: 25

Awesome-Segment-Anything

A collection of projects, papers, and source code for Meta AI's Segment Anything Model (SAM) and related studies.

Forge_VFM4AD

A comprehensive survey of forging vision foundation models for autonomous driving, including challenges, methodologies, and opportunities.

Awesome-Surrounding-Semantic-Occupancy-Prediction

Awesome papers about Multi-Camera Semantic Occupancy Prediction, such as TPVFormer, OccFormer, Occ3D, OpenOccupancy

Dalle3

An API for DALLE-3

Language: Python, License: MIT, Stargazers: 180, Issues: 6, Issues: 15

awesome-video-self-supervised-learning

A curated list of awesome self-supervised learning methods in videos