zhangshushu15's starred repositories
clip-as-service
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
PhotoMaker
PhotoMaker [CVPR 2024]
attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
progressive_growing_of_gans
Progressive Growing of GANs for Improved Quality, Stability, and Variation
adversarial
Code and hyperparameters for the paper "Generative Adversarial Networks"
stable-diffusion
Latent Text-to-Image Diffusion
open_flamingo
An open-source framework for training large multimodal models.
RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
InstantStyle
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥
DenoisingDiffusionProbabilityModel-ddpm-
This may be the simplest implementation of DDPM. You can run Main.py directly to train the UNet on the CIFAR-10 dataset and watch the denoising process.
style-aligned
Official code for "Style Aligned Image Generation via Shared Attention"
visual_anagrams
Code for the paper "Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models"
pytorch_diffusion
PyTorch reimplementation of Diffusion Models
BrowserGym
BrowserGym, a gym environment for web task automation in the Chromium browser.