Wenhao Chai's repositories
StableVideo
[ICCV 2023] StableVideo: Text-driven Consistency-aware Diffusion Video Editing
Awesome-VQVAE
A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its applications
Awesome-DriveLM
A collection of resources and papers on Large Language Models in autonomous driving
arxiv-daily
Automatically updates papers in selected fields daily using GitHub Actions (refreshed every 12 hours)
Awesome-LLM-3D
Awesome-LLM-3D: a curated list of resources on Multi-modal Large Language Models in the 3D world
3D-VisTA
Official implementation of ICCV 2023 paper "3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment"
all-seeing
This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"
awesome-3D-gaussian-splatting
Curated list of papers and resources focused on 3D Gaussian Splatting, intended to keep pace with the anticipated surge of research in the coming months.
Awesome-Foundation-Models
A curated list of foundation models for vision and language tasks
Awesome-Long-Context
A curated list of resources about long-context in large-language models and video understanding.
Awesome-MLLM-Hallucination
A curated list of resources dedicated to hallucination in multimodal large language models (MLLMs)
Awesome-Multimodal-Large-Language-Models
Latest Papers and Datasets on Multimodal Large Language Models
awesome-NeRF
A curated list of awesome neural radiance fields papers
Awesome-Skeleton-based-Action-Recognition
A curated paper list of awesome skeleton-based action recognition.
DriveLM
DriveLM: Drive on Language
ED-Pose
[ICLR 2023] Official implementation of the paper "Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation"
ipl-uw.github.io
Website for IPL
LLaMA-Efficient-Tuning
Easy-to-use fine-tuning framework using PEFT (PT+SFT+RLHF with QLoRA) (LLaMA-2, BLOOM, Falcon, Baichuan)
LLM-Agent-Paper-List
The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
minisora
The Mini Sora project aims to explore the implementation path and future development direction of Sora.
Multi-Modality-Arena
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!
rese1f
Config files for my GitHub profile.