SheffieldCao

Xu CAO's starred repositories

Phased-Consistency-Model

Boosting the performance of consistency models with PCM!

Language: Python | Apache-2.0 | 29700

3DGM

Official PyTorch implementation of 3D Gaussian Mapping (3DGM)

3400

OccSora

OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving

Language: Python | Apache-2.0 | 10000

MapUncertaintyPrediction

[CVPR 2024 Award Candidate] Producing and Leveraging Online Map Uncertainty in Trajectory Prediction

Language: Python | Apache-2.0 | 11500

PaSCo

[CVPR 2024 Oral - Best paper award candidate] Official repository of "PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness"

Language: Python | Apache-2.0 | 12100

egtr

[CVPR 2024 Best paper award candidate] EGTR: Extracting Graph from Transformer for Scene Graph Generation

Language: Python | Apache-2.0 | 3800

RepAdapter

Official implementation of "Towards Efficient Visual Adaption via Structural Re-parameterization".

Language: Python | 18700

3D-LLM

Code for 3D-LLM: Injecting the 3D World into Large Language Models

Language: Python | MIT | 84600

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.

Language: Jupyter Notebook | AGPL-3.0 | 261700

Pretrained-Language-Model

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Language: Python | 298900

LiDAR-Diffusion

[CVPR 2024] Official implementation of "Towards Realistic Scene Generation with LiDAR Diffusion Models"

Language: Python | MIT | 13600

CogVLM

A state-of-the-art-level open visual language model | Multimodal pretrained model

Language: Python | Apache-2.0 | 560900

Awesome-LLM4AD

A curated list of awesome LLM for Autonomous Driving resources (continually updated)

Apache-2.0 | 75000

llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Language: Jupyter Notebook | MIT | 1102800

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | Apache-2.0 | 2584400

Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Language: Python | Apache-2.0 | 639100

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | AGPL-3.0 | 253300

General-World-Models-Survey

MIT | 20500

LLaVA-NeXT

Language: Python | 111000

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook | MIT | 1367000

FilmRemoval

[CVPR 2024] Official Implementation of Learning to Remove Wrinkled Transparent Film with Polarized Prior

Language: Python | MIT | 2200

mmdit

Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in PyTorch

Language: Python | MIT | 18600

MovieChat

[CVPR 2024] 🎬💭 chat with over 10K frames of video!

Language: Python | BSD-3-Clause | 45700

VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Language: Python | Apache-2.0 | 92200

Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Language: Python | Apache-2.0 | 257500

LLMGA

This project is the official implementation of "LLMGA: Multimodal Large Language Model based Generation Assistant" (ECCV 2024)

Language: Python | Apache-2.0 | 27100

LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Language: Python | 74100

audacity-manual

A complete copy of the Audacity manual

Language: HTML | 4700

QA-ViT

Language: Python | Apache-2.0 | 3000

LLaVA-RLHF

Aligning LMMs with Factually Augmented RLHF

Language: Python | GPL-3.0 | 27900