rrrr's repositories
Attend-and-Excite
Official Implementation for "Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models" (SIGGRAPH 2023)
BERT-pytorch
Google AI 2018 BERT PyTorch implementation
C-DSVAE
Contrastively Disentangled Sequential Variational Autoencoder
clean-fid
PyTorch - FID calculation with proper image resizing and quantization steps [CVPR 2022]
Compositional-Visual-Generation-with-Composable-Diffusion-Models-PyTorch
[ECCV 2022] Compositional Generation using Diffusion Models
crkbd
Corne keyboard, a split keyboard with 3x6 column staggered keys and 3 thumb keys.
dSEQ-VAE
BAD-VAE: A VAE framework for unsupervised disentanglement of sequential data
dsmil-wsi
DSMIL: Dual-stream multiple instance learning networks for tumor detection in Whole Slide Image
enhancing-transformers
An unofficial implementation of both ViT-VQGAN and RQ-VAE in PyTorch
equilib
🌎→🗾Equirectangular (360/panoramic) image processing library for Python with minimal dependencies only using Numpy and PyTorch
focal-frequency-loss
[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis
InOut
Diverse Image Outpainting via GAN Inversion
LLaMA-VID
Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
long-video-gan
Official PyTorch implementation of LongVideoGAN
MAGE
Make It Move: Controllable Image-to-Video Generation with Text Descriptions
maskgit
Official Jax Implementation of MaskGIT
MaskGIT-pytorch
PyTorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf)
parti-pytorch
Implementation of Parti, Google's pure attention-based text-to-image neural network, in PyTorch
PyTorch-Simple-MaskRCNN
A PyTorch implementation of simple Mask R-CNN
stable-diffusion
A latent text-to-image diffusion model
stablediffusion-infinity
Outpainting with Stable Diffusion on an infinite canvas
stylegan2-ada-pytorch
StyleGAN2-ADA - Official PyTorch implementation
TATS
Official PyTorch implementation of TATS: A Long Video Generation Framework with Time-Agnostic VQGAN and Time-Sensitive Transformer (ECCV 2022)
Vim
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
xdcmd
Command-line client for the X 岛 anonymous board (X 岛匿名版)