There are 32 repositories under the image-caption topic.
VisualGPT (CVPR 2022): GPT as a decoder for vision-language models
A neural network that generates captions for an image using a CNN and an RNN with beam search.
CLIPxGPT Captioner is an image captioning model based on OpenAI's CLIP and GPT-2.
[CVPR23] A cascaded diffusion captioning model with a novel semantic-conditional diffusion process that upgrades the conventional diffusion model with an additional semantic prior.
Paper notes on deep learning, machine learning, and computer vision
An image captioning web application that combines React.js on the front end with Flask and Node.js on the back end, built around the MERN stack. Users can upload images and instantly receive automatic captions. Authenticated users get extra features such as caption translation and text-to-speech.
A Python-based CLI tool for captioning images with WD-series, Joy-caption-pre-alpha, Meta Llama 3.2 Vision Instruct, and Qwen2 VL Instruct models.
TensorFlow implementation of "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention". Supports Python 3.6/3.7, TensorFlow 1.8/1.12/1.13/1.14, and NumPy 1.12 or newer.
Pre-trained model and source code for generating descriptions of images.
[IGARSS 2022] CapFormer: a pure transformer for remote sensing image captioning
Image Caption
Image captioning project.
A Python 3 library for NLP and image-caption metrics: BLEU, METEOR, CIDEr, ROUGE, SPICE, and WMD (see the BLEU usage sketch after this list)
A simple but comprehensive PyTorch implementation of image captioning models.
This repository reimplements the "Show, Attend and Tell" model and adds extra deep learning techniques.
Transformer block in tf.keras similar to PyTorch's nn.Transformer block.
Using image caption models to extract prompts in ComfyUI
End-to-end deep learning model that generates image captions
Karpathy split JSON files for image captioning
Say goodbye to jQuery plugins: a similar image caption effect can now be created with CSS3 alone. This demo shows the effect in action.
Generates image caption information from annotations
PyTorch implementation of image captioning based on an attention mechanism
[ECCV24] Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
A project aimed at enhancing the visual experience of individuals with visual impairments. Leveraging machine learning and natural language processing, this repository houses the codebase for generating coherent natural-language descriptions of captured images. The project integrates with image recognition,
BLIP-2 captioning, mass captioning, question answering, and other tools.
Image Descriptor with Visual Attention Mechanism Using Long Short-term Memory
A MindSpore implementation of the paper "Show and Tell: A Neural Image Caption Generator"
A subset of Google's Conceptual Captions (3M) dataset containing 940k samples.
Image captioning using VGG16 + LSTM
Major Project Repository
A Mindspore Implementation of "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention".
PyTorch image-caption retrieval model
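Several of the repositories above evaluate generated captions with standard n-gram metrics such as BLEU. As a rough, self-contained illustration (not the API of any particular repository listed here), the sketch below computes corpus-level BLEU-4 with NLTK over made-up reference and candidate captions.

```python
# Minimal sketch of caption evaluation with corpus-level BLEU (NLTK).
# The captions below are illustrative placeholders, not real data.
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# One list of tokenized reference captions per image.
references = [
    [["a", "dog", "runs", "on", "the", "beach"],
     ["a", "dog", "running", "along", "the", "shore"]],
]
# One tokenized generated caption per image.
candidates = [["a", "dog", "runs", "along", "the", "beach"]]

# Smoothing avoids zero scores when higher-order n-grams have no matches.
smooth = SmoothingFunction().method1
bleu4 = corpus_bleu(references, candidates,
                    weights=(0.25, 0.25, 0.25, 0.25),
                    smoothing_function=smooth)
print(f"BLEU-4: {bleu4:.3f}")
```

Libraries dedicated to caption metrics typically expose METEOR, CIDEr, ROUGE, and SPICE behind a similar references-vs-candidates interface.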