Shihao Zhao's starred repositories

llama

Inference code for Llama models

Language: Python | License: NOASSERTION | Stargazers: 55199 | Issues: 517 | Issues: 951

Fooocus

Focus on prompting and generating

Language: Python | License: GPL-3.0 | Stargazers: 39541 | Issues: 298 | Issues: 1484

labelme

Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).

Language: Python | License: NOASSERTION | Stargazers: 13024 | Issues: 149 | Issues: 738

AnimateDiff

Official implementation of AnimateDiff.

Language: Python | License: Apache-2.0 | Stargazers: 10125 | Issues: 105 | Issues: 341

lora

Using Low-rank adaptation to quickly fine-tune diffusion models.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 6907 | Issues: 59 | Issues: 138

semantic-segmentation-pytorch

PyTorch implementation for Semantic Segmentation/Scene Parsing on the MIT ADE20K dataset

Language: Python | License: BSD-3-Clause | Stargazers: 4903 | Issues: 125 | Issues: 240

MiniGemini

Official implementation for Mini-Gemini

Language: Python | License: Apache-2.0 | Stargazers: 2711 | Issues: 23 | Issues: 75

interactive-deep-colorization

Deep learning software for colorizing black and white images with a few clicks.

Language: Python | License: MIT | Stargazers: 2685 | Issues: 122 | Issues: 84

pytorch-ssim

PyTorch structural similarity (SSIM) loss

Language: Python | License: NOASSERTION | Stargazers: 1859 | Issues: 21 | Issues: 36

sd-webui-text2video

Auto1111 extension implementing text2video diffusion models (like ModelScope or VideoCrafter) using only Auto1111 webui dependencies

Language: Python | License: NOASSERTION | Stargazers: 1280 | Issues: 26 | Issues: 132

animatediff-cli-prompt-travel

AnimateDiff prompt travel

Language: Python | License: Apache-2.0 | Stargazers: 1181 | Issues: 20 | Issues: 235

awesome-diffusion-categorized

Collection of diffusion model papers categorized by their subareas

ELLA

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Language: Python | License: Apache-2.0 | Stargazers: 1027 | Issues: 42 | Issues: 42

torch-fidelity

High-fidelity performance metrics for generative models in PyTorch

Language: Python | License: NOASSERTION | Stargazers: 959 | Issues: 7 | Issues: 35

All-In-One-Deflicker

[CVPR2023] Blind Video Deflickering by Neural Filtering with a Flawed Atlas

Text-To-Video-Finetuning

Finetune ModelScope's Text To Video model using Diffusers 🧨

Language: Python | License: MIT | Stargazers: 654 | Issues: 18 | Issues: 68

colorization-pytorch

PyTorch reimplementation of Interactive Deep Colorization

Language: Python | License: MIT | Stargazers: 597 | Issues: 18 | Issues: 24

LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 479 | Issues: 15 | Issues: 33

LLM-groundedDiffusion

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LLM-grounded Diffusion: LMD)

LaVi-Bridge

[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation

Language: Python | License: MIT | Stargazers: 299 | Issues: 16 | Issues: 16

layout-guidance

[WACV 2024] Training-Free Layout Control with Cross-Attention Guidance

LocalizingMoments

GitHub repository for my ICCV 2017 paper: "Localizing Moments in Video with Natural Language"

T2I-CompBench

[NeurIPS 2023] T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation

Language: Python | License: MIT | Stargazers: 180 | Issues: 2 | Issues: 19

unicolor

Implementation of the paper "UniColor: A Unified Framework for Multi-Modal Colorization with Transformer"

simple-aesthetics-predictor

CLIP-based aesthetics predictor inspired by the interface of 🤗 huggingface transformers.

Language: Python | License: MIT | Stargazers: 26 | Issues: 3 | Issues: 7

Language: Python | License: MIT | Stargazers: 8 | Issues: 2 | Issues: 0

CiPR

[TMLR] CiPR: An Efficient Framework with Cross-instance Positive Relations for Generalized Category Discovery

Language: Python | License: MIT | Stargazers: 8 | Issues: 3 | Issues: 0