user01 / storybook-illustrator

Deep Learning tools to explore visual storytelling. :book: Read sample stories! :book:

Home Page: https://user01.github.io/storybook-illustrator/

Storybook Illustrator

This project explores visual storytelling by annotating a narrative with images. Ultimately, the images form a cohesive flow of the events described in the text. We use Microsoft Research's Visual Storytelling Dataset (VIST) to build a two-part network relating images and text descriptions. Once trained, this architecture selects appropriate images for previously unseen text narratives.

See more examples on the site.

Public domain works from Project Gutenberg include: The Jungle Book, A Tale of Two Cities, Peter Pan, Pride and Prejudice, Alice's Adventures in Wonderland, A Scandal in Bohemia, Cinderella, The Golden Goose, The Awakening, The Little Match Girl, and The Princess and the Pea.

[Figure: example annotation]

Build Notes

We built several models to find the best architecture for relating the text to the images. Each model is labeled with the short hash of the relevant git commit. The pair of networks uses a cosine loss.
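
The README does not spell out the exact form of the cosine loss, so the following is only a minimal sketch of one common formulation, 1 minus the cosine similarity between an image embedding and a text embedding. The function names are hypothetical, and plain Python lists stand in for the tensors the real networks would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat a zero vector as unrelated to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def cosine_loss(image_embedding, text_embedding):
    """Assumed formulation: 0 for aligned vectors, 2 for opposite ones."""
    return 1.0 - cosine_similarity(image_embedding, text_embedding)
```

Under this formulation, minimizing the loss pulls the embeddings of a matching image and description toward the same direction, regardless of their magnitudes.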

[Figures: model diagram, multiple losses, best loss]

Jupyter

Configuring Jupyter to write .py and .html copies of a notebook on every save simplifies running and updating all models. The following instructions are taken from Jupyter Notebook Best Practices for Data Science.

First, if the ~/.jupyter/jupyter_notebook_config.py configuration file does not exist, run jupyter notebook --generate-config.

Prepend this Python code to the configuration file at ~/.jupyter/jupyter_notebook_config.py.

c = get_config()
### If you want to auto-save .html and .py versions of your notebook:
# modified from: https://github.com/ipython/ipython/issues/8009
import os
from subprocess import check_call

def post_save(model, os_path, contents_manager):
    """post-save hook for converting notebooks to .py scripts and .html pages"""
    if model['type'] != 'notebook':
        return # only do this for notebooks
    d, fname = os.path.split(os_path)
    check_call(['jupyter', 'nbconvert', '--to', 'script', fname], cwd=d)
    check_call(['jupyter', 'nbconvert', '--to', 'html', fname], cwd=d)

c.FileContentsManager.post_save_hook = post_save

Dependencies

Python dependencies are listed in the pip requirements file requirements.txt (install with pip3). Note that nltk requires data files, which can be downloaded with python -m nltk.downloader all. PyTorch will automatically retrieve the pretrained ResNet weights on the first run.

Data

The Visual Storytelling Dataset (VIST) needs to be downloaded to the local disk and extracted. The path to the extracted folder must be stored in a plain text file in the project root called data.directory.txt.
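
As an illustration of how a script might pick up that location, here is a minimal sketch. The helper name read_data_directory is hypothetical and not taken from the project; the sketch assumes the file holds a single path, possibly followed by a newline.

```python
from pathlib import Path


def read_data_directory(project_root="."):
    """Return the dataset location stored in data.directory.txt."""
    config_file = Path(project_root) / "data.directory.txt"
    # Strip the trailing newline that most editors add.
    location = config_file.read_text(encoding="utf-8").strip()
    if not location:
        raise ValueError(f"{config_file} is empty")
    return Path(location).expanduser()
```

For example, if data.directory.txt contains /data/vist, then read_data_directory() returns Path("/data/vist").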

The final process_text.py script also requires the primitive binary to be available on the PATH. The source is MIT licensed and available on GitHub. It performs the final post-processing of the chosen images.
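
Before running the script, it can be worth confirming that the binary is actually reachable. The check below is a sketch using only the standard library; the helper name find_binary is hypothetical and is not part of process_text.py.

```python
import shutil


def find_binary(name="primitive"):
    """Return the full path of an executable on PATH, or raise if it is missing."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(
            f"'{name}' was not found on PATH; install it or adjust PATH first"
        )
    return path
```

Calling find_binary() at the top of a script fails early with a clear message instead of partway through image post-processing.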

Word embeddings can be downloaded from Google here and must be extracted in the root of the DATA_DIRECTORY.

The DATA_DIRECTORY needs to have the following structure:

DATA_DIRECTORY/
┣━━GoogleNews-vectors-negative300.bin
┗━┳━dii/
  ┣━sis/
  ┣━train━images━images┳━image.0.jpg
  ┃                    ┣━image.1.jpg
  ┃                    ┣━...
  ┃                    ┗━image.12.jpg
  ┗━test━━images━test━━┳━image.0.jpg
                       ┣━image.1.jpg
                       ┣━...
                       ┗━image.12.jpg
  • dii contains the Description in Isolation JSON
  • sis contains the Story in Sequence JSON
  • train contains the training images, in two subfolders (extracted train_split.*.tar.gz)
  • test contains the testing images, in two subfolders (extracted test_split.tar.gz)

The deep folder structure is an artifact of how PyTorch's ImageFolder dataset class discovers images: it expects each image to sit inside a subfolder of the root it is given.
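
A quick way to catch a misplaced download is to check for the top-level entries listed above. This is a minimal sketch: the helper name missing_entries is hypothetical, and it assumes that all five entries sit directly under DATA_DIRECTORY, which is how the tree above reads.

```python
from pathlib import Path

# Top-level names taken from the directory listing in this README.
EXPECTED_ENTRIES = (
    "GoogleNews-vectors-negative300.bin",
    "dii",
    "sis",
    "train",
    "test",
)


def missing_entries(data_directory):
    """Return the expected top-level entries that are absent, in listing order."""
    root = Path(data_directory)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

An empty list means every expected entry was found; the check does not look inside the nested image folders.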

Images must be 224x224 pixels. An example ImageMagick command that resizes and crops every image in the current directory in place is:

mogrify -path . -resize "224x224^" -gravity center -crop 224x224+0+0 *.*
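
To make the geometry of that command concrete, the sketch below reproduces its arithmetic: scale the image so the smaller side reaches 224 pixels (the ^ flag fills the target area), then cut a centered 224x224 window. The function name is hypothetical, and ImageMagick's exact rounding may differ by a pixel from the round() used here.

```python
def fill_and_center_crop(width, height, target=224):
    """Return the scaled size and the centered crop box (left, top, right, bottom)."""
    # Scale so that BOTH sides are at least `target` (fill, not fit).
    scale = max(target / width, target / height)
    scaled_w = max(target, round(width * scale))
    scaled_h = max(target, round(height * scale))
    # Center the crop window, as -gravity center does.
    left = (scaled_w - target) // 2
    top = (scaled_h - target) // 2
    return (scaled_w, scaled_h), (left, top, left + target, top + target)
```

For a 640x480 photo this gives a scaled size of (299, 224) and a crop box of (37, 0, 261, 224), so only the left and right edges are trimmed.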

About

License: MIT License


Languages

Python 99.9%, Makefile 0.1%