Xu CAO (SheffieldCao)

Company: Tongji Univ

Location: Shanghai, China

Home Page: xucaotju@gmail.com

Xu CAO's starred repositories

Phased-Consistency-Model

Boosting the performance of consistency models with PCM!

Language: Python | License: Apache-2.0 | Stargazers: 297 | Issues: 0

3DGM

Official PyTorch implementation of 3D Gaussian Mapping (3DGM)

Stargazers: 34 | Issues: 0

OccSora

OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving

Language: Python | License: Apache-2.0 | Stargazers: 100 | Issues: 0

MapUncertaintyPrediction

[CVPR 2024 Award Candidate] Producing and Leveraging Online Map Uncertainty in Trajectory Prediction

Language: Python | License: Apache-2.0 | Stargazers: 115 | Issues: 0

PaSCo

[CVPR 2024 Oral - Best paper award candidate] Official repository of "PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness"

Language: Python | License: Apache-2.0 | Stargazers: 121 | Issues: 0

egtr

[CVPR 2024 Best paper award candidate] EGTR: Extracting Graph from Transformer for Scene Graph Generation

Language: Python | License: Apache-2.0 | Stargazers: 38 | Issues: 0

RepAdapter

Official implementation of "Towards Efficient Visual Adaption via Structural Re-parameterization".

Language: Python | Stargazers: 187 | Issues: 0

3D-LLM

Code for 3D-LLM: Injecting the 3D World into Large Language Models

Language: Python | License: MIT | Stargazers: 846 | Issues: 0

Segment-and-Track-Anything

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.

Language: Jupyter Notebook | License: AGPL-3.0 | Stargazers: 2617 | Issues: 0

Pretrained-Language-Model

Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.

Language: Python | Stargazers: 2989 | Issues: 0

LiDAR-Diffusion

[CVPR 2024] Official implementation of "Towards Realistic Scene Generation with LiDAR Diffusion Models"

Language: Python | License: MIT | Stargazers: 136 | Issues: 0

CogVLM

A state-of-the-art-level open visual language model | Multimodal pre-trained model

Language: Python | License: Apache-2.0 | Stargazers: 5609 | Issues: 0

Awesome-LLM4AD

A curated list of awesome LLM for Autonomous Driving resources (continually updated)

License: Apache-2.0 | Stargazers: 750 | Issues: 0

llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Language: Jupyter Notebook | License: MIT | Stargazers: 11028 | Issues: 0

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 25844 | Issues: 0

Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Language: Python | License: Apache-2.0 | Stargazers: 6391 | Issues: 0

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | License: AGPL-3.0 | Stargazers: 2533 | Issues: 0

Language: Python | Stargazers: 1110 | Issues: 0

pykan

Kolmogorov Arnold Networks

Language: Jupyter Notebook | License: MIT | Stargazers: 13670 | Issues: 0

FilmRemoval

[CVPR 2024] Official Implementation of Learning to Remove Wrinkled Transparent Film with Polarized Prior

Language: Python | License: MIT | Stargazers: 22 | Issues: 0

mmdit

Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in Pytorch

Language: Python | License: MIT | Stargazers: 186 | Issues: 0

MovieChat

[CVPR 2024] 🎬💭 chat with over 10K frames of video!

Language: Python | License: BSD-3-Clause | Stargazers: 457 | Issues: 0

VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Language: Python | License: Apache-2.0 | Stargazers: 922 | Issues: 0

Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Language: Python | License: Apache-2.0 | Stargazers: 2575 | Issues: 0

LLMGA

Official implementation of "LLMGA: Multimodal Large Language Model based Generation Assistant" (ECCV 2024)

Language: Python | License: Apache-2.0 | Stargazers: 271 | Issues: 0

LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Language: Python | Stargazers: 741 | Issues: 0

audacity-manual

A complete copy of the Audacity manual

Language: HTML | Stargazers: 47 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 30 | Issues: 0

LLaVA-RLHF

Aligning LMMs with Factually Augmented RLHF

Language: Python | License: GPL-3.0 | Stargazers: 279 | Issues: 0