canqin001's starred repositories
switch-cuda
A simple bash script for switching between installed versions of CUDA.
Video-Dataset-Loading-Pytorch
Generic PyTorch dataset implementation to load and augment VIDEOS for deep learning training loops.
Online-RLHF
A recipe for online RLHF.
scaling_on_scales
When do we not need larger vision models?
MiniGPT4-video
Official code for the Goldfish model for long video understanding and MiniGPT4-video for short video understanding
emoji-cheat-sheet
A markdown version emoji cheat sheet
T2I-CompBench
[NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation
LLaMA2-Accessory
An Open-source Toolkit for LLM Development
PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
vector-quantize-pytorch
Vector (and Scalar) Quantization, in PyTorch
Video-ChatGPT
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
DiffSynth-Studio
Enjoy the magic of Diffusion models!
Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models