Pan He's starred repositories
supervision
We write your reusable computer vision tools. 💜
nerfstudio
A collaboration-friendly studio for NeRFs
multimodal-maestro
Streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, and Qwen2-VL
Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
VideoLLaMA2
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
humanoid-gym
Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer https://arxiv.org/abs/2404.05695
S3Gaussian
Official Implementation of Self-Supervised Street Gaussians for Autonomous Driving
dreamerv3-torch
Implementation of Dreamer v3 in PyTorch.
Forge_VFM4AD
A comprehensive survey of forging vision foundation models for autonomous driving, including challenges, methodologies, and opportunities.
World-Models-Autonomous-Driving-Latest-Survey
A curated list of world models for autonomous driving. Kept up to date.
Awesome-Papers-World-Models-Autonomous-Driving
Awesome Papers about World Models in Autonomous Driving
world-models-ppo
PyTorch World Model implementation with PPO.
CityFlowER
An Efficient and Realistic Traffic Simulator with Embedded Machine Learning Models
Dragtraffic
Repo for DragTraffic: Interactive and Controllable Traffic Scene Generation for Autonomous Driving.