There are 0 repositories under the flickr8k topic.
The objective is to generate a textual description of an image, based on the objects and actions it contains, using generative models so that the output consists of novel sentences. Pipeline-type models use two separate learning processes, one for language modelling and the other for image recognition. The model first identifies objects in the image and passes the result to the Inception-v3 model, which converts it into a word embedding vector; this vector is then fed into a series of LSTM cells to produce the desired captions.
An attention-based sequential deep learning model implemented in PyTorch to generate a single-line caption for an input image.
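To illustrate what "attention-based" means for captioning, here is a minimal sketch of one attention step. The description does not say which attention variant the repository uses, so this assumes plain dot-product attention, and it uses dependency-free Python lists rather than PyTorch tensors. The names `softmax`, `attend`, `query`, and `regions` are hypothetical and chosen for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, regions):
    """One attention step (assumed dot-product scoring).

    query   -- decoder state for the current word, a list of floats
    regions -- image region feature vectors, each the same length as query
    Returns (weights, context): one weight per region, and the
    weighted average of the region vectors.
    """
    # Score each image region by its dot product with the decoder state.
    scores = [sum(q * r for q, r in zip(query, region)) for region in regions]
    weights = softmax(scores)
    dim = len(regions[0])
    # The context vector is the attention-weighted sum of region features.
    context = [
        sum(w * region[d] for w, region in zip(weights, regions))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights, context = attend([2.0, 0.0], regions)
    print(weights, context)
```

In a captioning decoder, the context vector would typically be combined with the previous word embedding at every step, so each generated word can focus on a different part of the image.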
Download the Flickr8k and Flickr30k image caption datasets.
Library for training visually-grounded models of spoken language understanding.
Deep Learning Final project 2022
Exercise on captioning images in the Neural Networks for Computer Vision course, using the Flickr8K dataset and a simple encoder-decoder architecture. Evaluation is based on cross-entropy loss and 4-gram BLEU score.
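As a rough illustration of the 4-gram BLEU metric mentioned above, the following sketch computes an unsmoothed sentence-level score with uniform weights over 1- to 4-grams. It is not taken from the repository; the function names `ngrams` and `bleu4` are hypothetical, and real evaluations usually rely on a library implementation with smoothing and corpus-level aggregation.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, references):
    """Unsmoothed sentence-level BLEU with uniform 1- to 4-gram weights.

    candidate  -- list of tokens produced by the captioning model
    references -- list of reference captions, each a list of tokens
    """
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0  # candidate is shorter than n tokens
        # Clip each candidate n-gram by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        if clipped == 0:
            return 0.0  # without smoothing, one zero precision gives BLEU 0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty, using the reference length closest to the candidate.
    c_len = len(candidate)
    r_len = min((len(ref) for ref in references),
                key=lambda length: (abs(length - c_len), length))
    penalty = 1.0 if c_len > r_len else math.exp(1 - r_len / c_len)
    return penalty * math.exp(sum(log_precisions) / 4)


if __name__ == "__main__":
    reference = "a dog runs on the grass".split()
    print(bleu4("a dog runs on the grass".split(), [reference]))
    print(bleu4("a dog runs on".split(), [reference]))
```

Because the score is unsmoothed, any caption that shares no 4-gram with a reference scores exactly 0, which is one reason smoothed variants are common for short captions.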
Image captioning using an encoder-decoder network, with pretrained models provided.