Tianhao-Qi's starred repositories
Pyramid-Flow
Code for Pyramidal Flow Matching for Efficient Video Generative Modeling
CogVideoX-Fun
📹 A more flexible CogVideoX that can generate videos at any resolution and create videos from images.
Vchitect-2.0
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models
PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
I2V-Adapter-repo
I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models
SEED-Story
SEED-Story: Multimodal Long Story Generation with Large Language Model
Portrait-Mode-Video
Video dataset dedicated to portrait-mode video recognition.
ComfyUI_LayerStyle
A set of nodes for ComfyUI that can composite layers and masks to achieve Photoshop-like functionality.
gpt_academic
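Provides a practical interaction interface for LLMs such as GPT/GLM, specially optimized for paper reading, polishing, and writing. Modular design; supports custom shortcut buttons and function plugins; supports project analysis and self-translation for Python, C++, and other projects; PDF/LaTeX paper translation and summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and others.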
RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
Awesome-Animation-Research
Papers, datasets, and resources related to 2D cartoon video research. Contributions welcome.
DiT-Visualization
Visualization of DiT self-attention features
sdxl_prompt_styler
Custom prompt styler node for SDXL in ComfyUI
VGDiffZero
[ICASSP 2024] VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders