SHI Labs' repositories
Versatile-Diffusion
Versatile Diffusion: Text, Images and Variations All in One Diffusion Model, arXiv 2022 / ICCV 2023
Neighborhood-Attention-Transformer
Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022
Prompt-Free-Diffusion
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models, arXiv 2023 / CVPR 2024
Matting-Anything
Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance.
Compact-Transformers
Escaping the Big Data Paradigm with Compact Transformers, 2021 (Train your Vision Transformers in 30 mins on CIFAR-10 with a single GPU!)
Smooth-Diffusion
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models, arXiv 2023 / CVPR 2024
Rethinking-Text-Segmentation
[CVPR 2021] Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach
Agriculture-Vision
[CVPR 2020 & 2021 & 2022 & 2023] Agriculture-Vision Dataset, Prize Challenge and Workshop: A joint effort with many great collaborators to bring Agriculture and Computer Vision/AI communities together to benefit humanity!
FcF-Inpainting
[WACV 2023] Keys to Better Image Inpainting: Structure and Texture Go Hand in Hand
Convolutional-MLPs
[Preprint] ConvMLP: Hierarchical Convolutional MLPs for Vision, 2021
Forget-Me-Not
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models, 2023
Unsupervised-Domain-Adaptation-with-Differential-Treatment
[CVPR 2020] Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation
DiSparse-Multitask-Model-Compression
[CVPR 2022] DiSparse: Disentangled Sparsification for Multitask Model Compression
OneFormer-Colab
[Colab Demo Code] OneFormer: One Transformer to Rule Universal Image Segmentation.
Boosted-Dynamic-Networks
Boosted Dynamic Neural Networks, AAAI 2023
PAIR-Diffusion
PAIR-Diffusion: Object-Level Image Editing with Structure-and-Appearance Paired Diffusion Models, 2023
LIVE-Layerwise-Image-Vectorization
[CVPR 2022 Oral] Towards Layer-wise Image Vectorization
Text2Video-Zero
A copy of "Text-to-Image Diffusion Models are Zero-Shot Video Generators", ICCV 2023
SeMask-Segmentation
[Preprint] SeMask: Semantically Masked Transformers for Semantic Segmentation.
VideoINR-Continuous-Space-Time-Super-Resolution
[CVPR 2022] VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution