Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences.
PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Exploring the Limits of Masked Visual Representation Learning at Scale (https://arxiv.org/abs/2211.07636)
The official implementation of "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
[CVPR 2023] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Denoising Diffusion Implicit Models
Guided Diffusion (personal learning version)
MIMDet: Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
(CVPR 2023) Seeing a Rose in Five Thousand Ways
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Official implementation of the paper "Segment Everything Everywhere All at Once"
Personal Learning Version