Image Captioning with Pytorch

Image Captioning

Intro

  • This project is an implementation of the paper "Show and Tell: A Neural Image Caption Generator". It may not reproduce the original model exactly.

  • The code is written in PyTorch. ResNet-101 is used to extract image features. You can check the pre-trained models here. (A rough sketch of the encoder-decoder appears after this list.)

  • Uses the COCO 2017 validation images [5K/1GB] and annotations [241MB].

  • Please check make_vocab.py and data_loader.py.

    • Vocab.pickle is a pickle file that contains all the words found in the annotations.
    • coco_ids.npy stores the image IDs to be used. You also have to set the dataset paths and other settings, then run the prerocess_idx function. (A rough preprocessing sketch appears after this list.)
  • You can run the source code and try out your own examples.
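As a rough illustration of the architecture described above, the sketch below pairs a torchvision ResNet-101 encoder (final classification layer removed) with a single-layer LSTM decoder that receives the image feature as its first input, as in "Show and Tell". The class names and hyperparameters are assumptions for illustration, not the repository's actual model code.

```python
# Minimal Show-and-Tell-style encoder/decoder sketch in PyTorch.
# Class names and sizes are illustrative assumptions, not the repo's code.
import torch
import torch.nn as nn
import torchvision.models as models


class Encoder(nn.Module):
    def __init__(self, embed_size=256):
        super().__init__()
        resnet = models.resnet101(pretrained=True)
        # Drop the final fc layer and keep the 2048-d pooled feature.
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])
        self.fc = nn.Linear(resnet.fc.in_features, embed_size)

    def forward(self, images):
        with torch.no_grad():               # keep the CNN frozen
            feats = self.backbone(images)   # (B, 2048, 1, 1)
        return self.fc(feats.flatten(1))    # (B, embed_size)


class Decoder(nn.Module):
    def __init__(self, vocab_size, embed_size=256, hidden_size=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # Prepend the image feature to the embedded caption tokens.
        inputs = torch.cat([features.unsqueeze(1), self.embed(captions)], dim=1)
        hidden, _ = self.lstm(inputs)
        return self.fc(hidden)              # (B, T+1, vocab_size)
```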
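The preprocessing side can be pictured along these lines: a script like make_vocab.py would walk the COCO caption annotations, count words, and pickle a word-to-index mapping. The Vocab class, frequency threshold, and file paths below are assumptions, not necessarily what the repository does.

```python
# Hedged sketch of a make_vocab.py-style script.
# The Vocab class, min_count threshold, and file paths are assumptions.
import pickle
from collections import Counter

from pycocotools.coco import COCO


class Vocab:
    def __init__(self):
        self.word2idx, self.idx2word = {}, {}

    def add(self, word):
        if word not in self.word2idx:
            idx = len(self.word2idx)
            self.word2idx[word] = idx
            self.idx2word[idx] = word

    def __call__(self, word):
        return self.word2idx.get(word, self.word2idx["<unk>"])

    def __len__(self):
        return len(self.word2idx)


def build_vocab(ann_file, min_count=3):
    coco = COCO(ann_file)
    counter = Counter()
    for ann in coco.anns.values():
        counter.update(ann["caption"].lower().split())

    vocab = Vocab()
    for token in ("<pad>", "<start>", "<end>", "<unk>"):
        vocab.add(token)
    for word, count in counter.items():
        if count >= min_count:
            vocab.add(word)
    return vocab


if __name__ == "__main__":
    vocab = build_vocab("annotations/captions_val2017.json")
    with open("vocab.pickle", "wb") as f:
        pickle.dump(vocab, f)
```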

Environment

  • Python 3.8.5
  • PyTorch 1.7.1
  • CUDA 11.0

How to use

  • For training:
cd src
python train.py
  • For testing (see the inference sketch below):
cd src
python sample.py
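train.py presumably minimizes cross-entropy between the decoder's word predictions and the ground-truth captions. The loop below is a minimal sketch under the assumed Encoder, Decoder, and Vocab classes from the earlier sketches; a data loader yielding (images, padded caption ID tensors) is assumed to come from data_loader.py.

```python
# Hedged training-loop sketch; data_loader, vocab, and hyperparameters are assumptions.
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
encoder = Encoder().to(device)
decoder = Decoder(vocab_size=len(vocab)).to(device)

criterion = nn.CrossEntropyLoss(ignore_index=vocab("<pad>"))
params = list(decoder.parameters()) + list(encoder.fc.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)

for epoch in range(100):
    for images, captions in data_loader:                # assumed (B, 3, H, W), (B, T)
        images, captions = images.to(device), captions.to(device)

        features = encoder(images)
        outputs = decoder(features, captions[:, :-1])   # teacher forcing
        loss = criterion(outputs.reshape(-1, outputs.size(-1)),
                         captions.reshape(-1))

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```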
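sample.py presumably loads the trained weights and greedily decodes a caption for one image. A rough sketch, again under the assumed class and file names above (checkpoint paths are hypothetical), might look like this:

```python
# Hedged greedy-decoding sketch; checkpoint names and paths are assumptions.
import pickle

import torch
from PIL import Image
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])

with open("vocab.pickle", "rb") as f:
    vocab = pickle.load(f)

encoder, decoder = Encoder(), Decoder(vocab_size=len(vocab))
encoder.load_state_dict(torch.load("encoder.pth"))
decoder.load_state_dict(torch.load("decoder.pth"))
encoder.eval(); decoder.eval()

image = transform(Image.open("example.jpg").convert("RGB")).unsqueeze(0)

with torch.no_grad():
    inputs = encoder(image).unsqueeze(1)    # (1, 1, embed_size)
    states, words = None, []
    for _ in range(20):                     # max caption length
        hidden, states = decoder.lstm(inputs, states)
        word_id = decoder.fc(hidden.squeeze(1)).argmax(dim=1)
        word = vocab.idx2word[word_id.item()]
        if word == "<end>":
            break
        words.append(word)
        inputs = decoder.embed(word_id).unsqueeze(1)

print("Caption:", " ".join(words))
```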

Result

  • Epoch 100

(example image)

Caption: a woman holding a teddy bear in a suit case

TODO List

  • TensorBoard
  • Description of the model and other details
  • Code Refactoring
  • Upload requirements.txt

License

MIT License

Reference

[1] yunjey/pytorch-tutorial
