Tianhao-Qi

USTC

https://tianhao-qi.github.io/

Tianhao-Qi's starred repositories

Tora

Official repo for paper "Tora: Trajectory-oriented Diffusion Transformer for Video Generation"

1500

Kolors

Kolors Team

Language: Python | License: Apache-2.0 | 285200

I2V-Adapter-repo

I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models

VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

Language: Python | License: Apache-2.0 | 42300

SEED-Story

SEED-Story: Multimodal Long Story Generation with Large Language Model

Language: Python | License: NOASSERTION | 61700

VADER

Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various reward models such as HPS, PickScore, VideoMAE, VJEPA, YOLO, Aesthetics etc.

Language: Python | 15300

InfEdit

[CVPR 2024] Official implementation of CVPR 2024 paper: "Inversion-Free Image Editing with Natural Language"

Language: Python | License: NOASSERTION | 24800

talc

Language: Python | License: MIT | 2000

Portrait-Mode-Video

Video dataset dedicated to portrait-mode video recognition.

Language: Python | 3100

ComfyUI_LayerStyle

A set of nodes for ComfyUI that can composite layers and masks to achieve Photoshop-like functionality.

Language: Python | License: MIT | 82000

gpt_academic

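Provides a practical interaction interface for GPT, GLM, and other large language models (LLMs), specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, moss, and others.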

Language: Python | License: GPL-3.0 | 6286800

CV-VAE

CV-VAE: A Compatible Video VAE for Latent Generative Video Models

Language: Jupyter Notebook | 18700

LLaVA-NeXT

Language: Python | 141000

RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Language: Jupyter Notebook | 161600

Awesome-Animation-Research

Papers, datasets, and resources related to 2D cartoon video research. Contributions welcome.

License: MIT | 6000

V-Express

V-Express aims to generate a talking head video under the control of a reference image, an audio clip, and a sequence of V-Kps images.

Language: Python | 210700

Omost

Your image is almost there!

Language: Python | License: Apache-2.0 | 699400

DiT-Visualization

Visualization of DiT self attention features

Language: Python | 9400

sglang

SGLang is yet another fast serving framework for large language models and vision language models.

Language: Python | License: Apache-2.0 | 369300

sdxl_prompt_styler

Custom prompt styler node for SDXL in ComfyUI

Language: Python | License: MIT | 67300

VGDiffZero

[ICASSP 2024] VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders

Language: Python | 800

360DVD

[CVPR2024] 360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model

Language: Python | 9300

HandyFigure

HandyFigure provides the source files (usually PPT files) for paper figures.

Language: JavaScript | License: MIT | 15200

SmartEdit

Official code of SmartEdit [CVPR-2024 Highlight]

Language: Python | 20400

SSM

[IJCAI-2024] The official code of Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition

800

HunyuanDiT

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Language: Python | License: NOASSERTION | 296500

FreeInit

[ECCV 2024] FreeInit: Bridging Initialization Gap in Video Diffusion Models

Language: Python | License: MIT | 46200

StoryDiffusion

Create Magic Story!

Language: Jupyter Notebook | License: Apache-2.0 | 559900

MoneyPrinterTurbo

Generate high-definition short videos with one click using large AI models.

Language: Python | License: MIT | 1537900

DisenDiff

[CVPR 2024, Oral] Attention Calibration for Disentangled Text-to-Image Personalization

Language: Python | License: MIT | 7400