ShihaoZhaoZSH

Shihao Zhao's starred repositories

llama

Inference code for Llama models

Language: Python | License: NOASSERTION | 55199 517 951

Fooocus

Focus on prompting and generating

Language: Python | License: GPL-3.0 | 39541 298 1484

labelme

Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).

Language: Python | License: NOASSERTION | 13024 149 738

AnimateDiff

Official implementation of AnimateDiff.

Language: Python | License: Apache-2.0 | 10125 105 341

lora

Using Low-rank adaptation to quickly fine-tune diffusion models.

Language: Jupyter Notebook | License: Apache-2.0 | 6907 59 138

semantic-segmentation-pytorch

PyTorch implementation for Semantic Segmentation/Scene Parsing on the MIT ADE20K dataset

Language: Python | License: BSD-3-Clause | 4903 125 240

MiniGemini

Official implementation for Mini-Gemini

Language: Python | License: Apache-2.0 | 2711 23 75

interactive-deep-colorization

Deep learning software for colorizing black and white images with a few clicks.

Language: Python | License: MIT | 2685 122 84

pytorch-ssim

pytorch structural similarity (SSIM) loss

Language: Python | License: NOASSERTION | 1859 21 36

sd-webui-text2video

Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies

Language: Python | License: NOASSERTION | 1280 26 132

animatediff-cli-prompt-travel

animatediff prompt travel

Language: Python | License: Apache-2.0 | 1181 20 235

awesome-diffusion-categorized

collection of diffusion model papers categorized by their subareas

1058 52 14

ELLA

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Language: Python | License: Apache-2.0 | 1027 42 42

torch-fidelity

High-fidelity performance metrics for generative models in PyTorch

Language: Python | License: NOASSERTION | 959 7 35

All-In-One-Deflicker

[CVPR2023] Blind Video Deflickering by Neural Filtering with a Flawed Atlas

Language: Python | 680 23 34

Text-To-Video-Finetuning

Finetune ModelScope's Text To Video model using Diffusers 🧨

Language: Python | License: MIT | 654 18 68

colorization-pytorch

PyTorch reimplementation of Interactive Deep Colorization

Language: Python | License: MIT | 597 18 24

LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Language: Jupyter Notebook | License: NOASSERTION | 479 15 33

LLM-groundedDiffusion

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LLM-grounded Diffusion: LMD)

Language: Python | 404 13 19

LaVi-Bridge

[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation

Language: Python | License: MIT | 299 16 16

layout-guidance

[WACV 2024] Training-Free Layout Control with Cross-Attention Guidance

Language: Python | 227 4 21

LocalizingMoments

GitHub repository for my ICCV 2017 paper: "Localizing Moments in Video with Natural Language"

Language: OpenEdge ABL | 188 11 21

T2I-CompBench

[NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation

Language: Python | License: MIT | 180 2 19

Deep-Video-Super-Resolution

The state-of-the-art VSR

98 7 1

unicolor

This is the implementation of the paper: "UniColor: A Unified Framework for Multi-Modal Colorization with Transformer"

Language: Python | 56 2 9

simple-aesthetics-predictor

CLIP-based aesthetics predictor inspired by the interface of 🤗 huggingface transformers.

Language: Python | License: MIT | 26 3 7

timelapse_deflickerer

Language: Python | License: MIT | 8 20

CiPR

[TMLR] CiPR: An Efficient Framework with Cross-instance Positive Relations for Generalized Category Discovery

Language: Python | License: MIT | 8 30

deflicker-timelapse-video

Language: Python | 2 10

Keypoint_Evaluation

Language: Python | 1 20