There are 2 repositories under the mscoco-dataset topic.
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
An easy implementation of Faster R-CNN (https://arxiv.org/pdf/1506.01497.pdf) in PyTorch.
An easy implementation of FPN (https://arxiv.org/pdf/1612.03144.pdf) in PyTorch.
Real-time semantic image segmentation on mobile devices
Image captioning with an LSTM or Transformer in PyTorch
A clone of the original SegCaps source code with enhancements for the MS COCO dataset.
PyTorch implementation of image captioning using a transformer-based model.
Adds the SPICE metric to the coco-caption evaluation server code
Convert segmentation binary mask images to COCO JSON format.
PyTorch implementation of paper: "Self-critical Sequence Training for Image Captioning"
PyTorch implementation of “Fine-Grained Image Captioning with Global-Local Discriminative Objective”
We aim to generate realistic images from text descriptions using a GAN architecture. The network we designed generates images for two datasets: MSCOCO and CUBS.
Clone of the COCO API (dataset at http://cocodataset.org/) with changes to support Windows builds and Python 3
A demo for mapping class labels from ImageNet to COCO.
Preserving Semantic Neighborhoods for Robust Cross-modal Retrieval [ECCV 2020]
MS COCO captions in Arabic
Image caption generation using a GRU-based attention mechanism
Caption generation from images using topics as additional guiding inputs.
Karpathy split JSON files for image captioning
An end-to-end vision and language model incorporating explicit knowledge graphs and OOD-detection.
Microsoft COCO: Common Objects in Context for huggingface datasets
Mixed vision-language Attention Model that gets better by making mistakes
NLP - descriptive statistics of COCO annotations via Python COCO-API
Analysis of Image Captioning Models
Augment the MS COCO training set while training NIC
Object Detection Dataset Format Converter
Create a YOLO-format subset of the COCO dataset
Using the TensorFlow Object Detection API with the MSCOCO dataset and a customized dataset
Image captioning with a pretrained encoder on MSCOCO.
A simple Python API (built on top of TensorFlow) for neural image captioning with MSCOCO data.
Code Repository for "A New Unified Method for Detecting Text from Marathon Runners and Sports Players in Video" [Pattern Recognition, Elsevier 2020]
A helper library for easily converting MSCOCO format data using the loading script of huggingface datasets.
COCO-Stuff dataset for huggingface datasets