Implementation of "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention"
This demo uses the Theano framework.
We highly recommend using the Anaconda platform to manage all the Python dependencies and virtual environments. Otherwise, you will have to install each of Theano's dependencies manually.
After Anaconda is installed, run
conda install theano
- Download the data and annotations (Flickr8k, Flickr30k, COCO, etc.).
- Modify the data/data_generation_params.json file to indicate the locations of the annotation file and the image dataset.
- Resize each image to 224x224x3.
- Run data/data_generation.py to generate the image and annotation pickles.
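The resize step above can be sketched as follows. This is only an illustration, not the code used by data/data_generation.py: it assumes the Pillow library is installed, and the helper name resize_to_rgb is hypothetical. It forces three color channels and stretches the image to 224x224 without preserving the aspect ratio.

```python
from PIL import Image  # assumption: Pillow is available; the repo may use another library

TARGET_SIZE = (224, 224)  # (width, height) from the resize step above


def resize_to_rgb(image, size=TARGET_SIZE):
    """Hypothetical helper: return a 3-channel RGB copy of `image` at `size`."""
    # convert("RGB") yields three channels even for grayscale or RGBA inputs.
    # resize() ignores the aspect ratio, so non-square images are stretched.
    return image.convert("RGB").resize(size)


if __name__ == "__main__":
    # A synthetic grayscale image stands in for a dataset photo.
    sample = Image.new("L", (500, 375), color=128)
    resized = resize_to_rgb(sample)
    print(resized.size, resized.mode)
```

For a real dataset image, open the file with Image.open(path) and pass the result to the helper before saving or pickling it.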
python train.py
python demo.py
A group of people stand together.
- arctic-captions for code/reference
- deep-learning tutorial for code/reference