There are 2 repositories under the zero-shot-classification topic.
An open source implementation of CLIP.
Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models like Grounding DINO and SAM.
🥂 Gracefully face hCaptcha challenges with an embedded MoE (ONNX) solution.
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Diffusion Classifier leverages pretrained diffusion models to perform zero-shot classification without additional training
Cybertron: the home planet of the Transformers in Go
Official code for “OpenShape: Scaling Up 3D Shape Representation Towards Open-World Understanding”
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
PyTorch code for MUST
Multi-Aspect Vision Language Pretraining - CVPR2024
Official PyTorch Implementation of MSDN (CVPR'22)
[TPAMI 2023] Generative Multi-Label Zero-Shot Learning
[ICML 2024] "Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models"
Open-source code for the paper "Enhancing Remote Sensing Vision-Language Models for Zero-Shot Scene Classification"
Implementation of Z-BERT-A: a zero-shot pipeline for unknown intent detection.
Alternate implementation for zero-shot text classification (ZSC): instead of reframing NLI/XNLI, this reframes the text backbone of CLIP models to do ZSC. Hence, it can be lightweight and supports more languages without trading off accuracy. (Super simple, a 10th-grader could totally write this, but since no 10th-grader did, I did.) - Prithivi Da
Evaluate custom and HuggingFace text-to-image/zero-shot-image-classification models like CLIP, SigLIP, DFN5B, and EVA-CLIP. Metrics include Zero-shot accuracy, Linear Probe, Image retrieval, and KNN accuracy.
Code for the experiments in our EMNLP 2021 paper "Open Aspect Target Sentiment Classification with Natural Language Prompts"
Airflow Pipeline for Machine Learning
A minimal but effective implementation of CLIP (Contrastive Language-Image Pretraining) in PyTorch
Low-latency ONNX and TensorRT based zero-shot classification and detection with contrastive language-image pre-training based prompts
Actor-agnostic Multi-label Action Recognition with Multi-modal Query [ICCVW '23]
Deep Learning for Computer Vision 深度學習於電腦視覺 by Frank Wang 王鈺強
Code for the EMNLP 2019 paper "Benchmarking zero-shot text classification: datasets, evaluation and entailment approach"
Perform topic classification on news articles in several limited-labeled data regimes.
GPT-4o (with Vision) module for use with Autodistill.
A hub hosting essential remote sensing datasets.
bullet: a zero-shot/few-shot, LLM-based text classification framework