There are 17 repositories under the image-text-matching topic.
A paper list covering large multi-modality models, parameter-efficient fine-tuning, vision-language pretraining, and conventional image-text matching, intended as a preliminary overview.
Offline semantic text-to-image and image-to-image search on Android, powered by a quantized state-of-the-art vision-language pretrained CLIP model and the ONNX Runtime inference engine
Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)
A non-JIT implementation/replication of OpenAI's CLIP in PyTorch
Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"
Code implementation of paper "SEMScene: Semantic-Consistency Enhanced Multi-Level Scene Graph Matching for Image-Text Retrieval" (ACM TOMM 2024).
Text Query based Traffic Video Event Retrieval with Global-Local Fusion Embedding
Unofficial, partially implemented code for the paper "Improving description-based person re-identification by multi-granularity image-text alignment" by Niu et al.
A list of research papers on knowledge-enhanced multimodal learning
A dead-simple image search and image-text matching system for Bangla using CLIP
BSc graduation project implementation [Image-Text Matching]
CLIP (Contrastive Language–Image Pre-training) for Bangla.
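Several of the repositories above build on CLIP-style contrastive scoring. As a rough sketch (hypothetical code, not taken from any of the listed projects): matching reduces to cosine similarity between L2-normalized image and text embeddings, followed by a temperature-scaled softmax over the candidate captions for each image.

```python
import numpy as np

def clip_style_scores(image_embs, text_embs, temperature=0.07):
    """Score image-text pairs the way CLIP-style models do:
    L2-normalize both embedding sets, take cosine similarities,
    and softmax over the candidate texts for each image."""
    img = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = img @ txt.T / temperature            # scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

# Toy example: random vectors stand in for real encoder outputs.
rng = np.random.default_rng(0)
scores = clip_style_scores(rng.normal(size=(2, 8)), rng.normal(size=(3, 8)))
```

In a real pipeline the embeddings would come from the trained image and text encoders; the temperature value here is an assumption (CLIP learns it during training).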
The 3rd place solution code for the Wikipedia - Image/Caption Matching Competition on Kaggle
Image-Text Matching Model Zoo
A unified codebase for image-text retrieval, intended for further exploration.
Python implementation of lexical vector-embedding similarity scoring, zero-shot image classification, and n-gram-based scoring for comparing textual summaries
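The last entry mentions n-gram-based scoring of textual summaries. One common variant is clipped n-gram precision (the core of BLEU): count how many of the candidate's n-grams also appear in the reference, clipping each count by its frequency in the reference. A minimal sketch, not taken from the repository itself:

```python
from collections import Counter

def ngram_overlap(candidate, reference, n=2):
    """Clipped n-gram precision: the fraction of candidate n-grams
    that also occur in the reference, with counts clipped by the
    reference's own n-gram counts."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    if not cand:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / sum(cand.values())

# 3 of the candidate's 5 bigrams appear in the reference.
score = ngram_overlap("the cat sat on the mat", "the cat is on the mat")  # 0.6
```

Production metrics (BLEU, ROUGE) combine several n-gram orders and add a brevity penalty; this shows only the single-order precision term.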