OedoSoldier's starred repositories
vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
PyTorch-VAE
A Collection of Variational Autoencoders (VAE) in PyTorch.
Segment-Everything-Everywhere-All-At-Once
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
LayerDiffuse
Transparent Image Layer Diffusion using Latent Transparency
VLM_survey
Collection of AWESOME vision-language models for vision tasks
Personalize-SAM
Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds
UniRepLKNet
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
Matting-Anything
Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance.
Adversarial-Contrastive-Learning
[NeurIPS 2020] “ Robust Pre-Training by Adversarial Contrastive Learning”, Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangyang Wang
Modality-Gap
Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
ARGF_multimodal_fusion
codes for: Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion