yucornetto

[ICLR 2024] This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"

Language: Python | 396 21 17

open-muse

Open reproduction of MUSE for fast text2image generation.

Language: Python | License: Apache-2.0 | 294 38 27

EDICT

Language: Jupyter Notebook | License: BSD-3-Clause | 275 6 14

fc-clip

[NeurIPS 2023] This repo contains the code for our paper "Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP"

Language: Python | License: Apache-2.0 | 260 16 28

DETA

Detection Transformers with Assignment

Language: Python | License: Apache-2.0 | 233 5 25

3D-TransUNet

This is the official repository for the paper "3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers"

Language: Python | License: Apache-2.0 | 156 3 30

CLIPSelf

[ICLR 2024 Spotlight] Code release of "CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction"

Language: Python | License: NOASSERTION | 144 6 21

ViTamin

[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"

Language: Python | License: Apache-2.0 | 131 5 8

qa-lora

Official PyTorch implementation of QA-LoRA

Language: Python | License: MIT | 94 4 32

OmniScient-Model

This repo contains the code for our paper "Towards Open-Ended Visual Recognition with Large Language Model"

Language: Jupyter Notebook | License: Apache-2.0 | 88 10 4

V3Det

Language: Python | 84 8 29

kmax-deeplab

A PyTorch re-implementation, based on Detectron2, of the ECCV 2022 paper "k-means Mask Transformer".

Language: Python | License: Apache-2.0 | 64 7 3

MaXTron

This repo contains the code for our paper "MaXTron: Mask Transformer with Trajectory Attention for Video Panoptic Segmentation"

Language: Python | License: Apache-2.0 | 26 6 2