data's repositories
a-PyTorch-Tutorial-to-Image-Captioning
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
attention-is-all-you-need-pytorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
Bidirectional_DALLE
Translation-equivariant Image Quantizer for Bi-directional Image-Text Generation, Stage 2
CLIP
Contrastive Language-Image Pretraining
clip-gen
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP
CogView2
Official code repo for the paper "CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers"
DDPM
PyTorch DDPM implementation
EVP
Code for paper 'Audio-Driven Emotional Video Portraits'.
glide-text2im
GLIDE: a diffusion-based text-conditional image synthesis model
lantern
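Lantern official version download: Lantern, firewall circumvention, proxy, open internet access, external network, accelerator, ladder, router, lantern proxy vpn censorship-circumvention censorship gfw accelerator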
MKGformer
Code for the SIGIR 2022 paper "Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion"
pytorch-tutorial
PyTorch Tutorial for Deep Learning Researchers
Semantic-Communication-Systems
PyTorch implementation of "Deep Learning-Enabled Semantic Communication Systems with Task-Unaware Transmitter and Dynamic Data"
Style-AttnGAN
Improves Text to Image synthesis from AttnGAN by integrating the scale-specific control from StyleGAN; can optionally use GPT-2 as text encoder
TE-VQGAN
Translation-equivariant Image Quantizer for Bi-directional Image-Text Generation, Stage 1
Test-Git
test
Text-to-Image-ReIdentification
A PyTorch re-implementation attempt of the paper "Improving description-based person re-identification by multi-granularity image-text alignment" by Niu et al. (partially implemented)
Text2Video
ICASSP 2022: "Text2Video: text-driven talking-head video synthesis with phonetic dictionary"
Thin-Plate-Spline-Motion-Model
[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.
train-CLIP
A PyTorch Lightning solution to training OpenAI's CLIP from scratch.
Transformer-MM-Explainability
[ICCV 2021 Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-based network. Includes examples for DETR and VQA.
ZS-F-VQA
Code and data for the paper "Zero-shot Visual Question Answering using Knowledge Graph" [ISWC 2021]