Chenxi's repositories
clip-guided-diffusion-pokemon
A Cog implementation that generates pixel artwork from a prompt using a diffusion model trained on Pokémon sprites.
clip-guided-diffusion
A Cog implementation of CLIP Guided Diffusion
caffe
Caffe: a fast open framework for deep learning.
DeceiveD
[NeurIPS 2021] Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data
encoder4editing
Official implementation of "Designing an Encoder for StyleGAN Image Manipulation" (SIGGRAPH 2021) https://arxiv.org/abs/2102.02766
frame-interpolation
FILM: Frame Interpolation for Large Motion, in arXiv 2022.
glide-text2im
GLIDE: a diffusion-based text-conditional image synthesis model
image2video-synthesis-using-cINNs
Implementation of Stochastic Image-to-Video Synthesis using cINNs.
latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
Mask2Former
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
maskgit
Official JAX Implementation of MaskGIT
maxim
[CVPR 2022 Oral] Official repository for "MAXIM: Multi-Axis MLP for Image Processing". SOTA for denoising, deblurring, deraining, dehazing, and enhancement.
merlot_reserve
Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound"
NAFNet
The state-of-the-art image restoration model without nonlinear activation functions.
NeuralNeighborStyleTransfer
Optimization-based style transfer
OFA
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
omnivore
Omnivore: A Single Model for Many Visual Modalities
omnizart
Omniscient Mozart, able to transcribe everything in music, including vocals, drums, chords, beats, instruments, and more.
PICa
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA, AAAI 2022 (Oral)
StructuredDreaming
Repo for structured dreaming
StyleCLIP
Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral)
stylegan3-editing
Official Implementation of "Third Time's the Charm? Image and Video Editing with StyleGAN3" https://arxiv.org/abs/2201.13433
SWAG
Official repository for "Revisiting Weakly Supervised Pre-Training of Visual Perception Models". https://arxiv.org/abs/2201.08371.
TransEditor
[CVPR 2022] TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
v-diffusion-jax
v objective diffusion inference code for JAX.
VL-T5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
VQGAN-CLIP
Just playing with getting VQGAN+CLIP running locally, rather than having to use Colab.
wiki_crosslingual
Code to reproduce the NAACL 2021 paper "Wikipedia entities as rendezvous across languages: grounding multilingual LMs by predicting Wikipedia hyperlinks".