bryantwong / shorties

Author Mashup (for Algorithmia Shorties Contest)

Ever wondered what Plato's "Republic" would be like if James Joyce had written it? Or if Plato had written Joyce's "Dubliners"? Or maybe if Jane Austen had written Joyce's "Ulysses"? (We just really like Joyce.) Our idea was to generate short stories that are mashups of any two authors (or of any two corpuses in general), using one corpus to generate the sentence structure of the story and the other corpus to provide its vocabulary.

How It Works

Using the scripts is a bit of a pain, since they're hacked together, but if you want to try it out, feel free to clone the repo! You will need nltk as well as your own Algorithmia API key. Everything runs from the command line; look at the bottom of each Python script to see what arguments are needed.

First, run corpus_to_tags.py on both corpuses. This does the part-of-speech tagging and also counts how often each part of speech occurs.
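To give a feel for what this step produces, here is a minimal sketch of the counting side. It is not the code in corpus_to_tags.py: the function name `summarize_tags` and the two output structures are assumptions made for illustration. It takes (word, tag) pairs that have already been tagged; with nltk, such pairs could come from `nltk.pos_tag(nltk.word_tokenize(text))`.

```python
from collections import Counter, defaultdict


def summarize_tags(tagged_tokens):
    """Count tag frequencies and, per tag, how often each word carries it.

    `tagged_tokens` is an iterable of (word, tag) pairs. This helper is a
    hypothetical illustration, not the repo's actual implementation.
    """
    tag_counts = Counter()
    words_by_tag = defaultdict(Counter)
    for word, tag in tagged_tokens:
        tag_counts[tag] += 1
        # Lowercase so "The" and "the" are counted as the same word.
        words_by_tag[tag][word.lower()] += 1
    return tag_counts, words_by_tag


# Hand-tagged toy input using Penn Treebank style tags.
tagged = [("The", "DT"), ("cat", "NN"), ("saw", "VBD"), ("the", "DT"), ("dog", "NN")]
tag_counts, words_by_tag = summarize_tags(tagged)
```

The tag counts describe the corpus structure, while the per-tag word counts are the kind of mapping the later vocabulary swap needs.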

Then, run generate_sentences.py. This relies heavily on the Algorithmia API (namely GenerateParagraphFromTrigram and GenerateTrigramFrequencies) to create a story that mimics the structure of one of the corpuses. Note that this story is written entirely in part-of-speech tags, so it isn't much fun to read.
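The trigram idea can be sketched locally without the API. The following is a plain-Python stand-in, not the Algorithmia algorithms themselves, and the names `build_trigram_model` and `generate_tags` are made up for this example. It records which tag follows each pair of tags, then walks that table to produce a new tag sequence.

```python
import random
from collections import defaultdict


def build_trigram_model(tags):
    """Map each pair of consecutive tags to the list of tags seen after it."""
    model = defaultdict(list)
    for a, b, c in zip(tags, tags[1:], tags[2:]):
        model[(a, b)].append(c)
    return model


def generate_tags(model, start, length, seed=None):
    """Extend the two-tag `start` to at most `length` tags using the model."""
    rng = random.Random(seed)
    out = list(start)
    while len(out) < length:
        followers = model.get((out[-2], out[-1]))
        if not followers:
            break  # no observed continuation for this pair of tags
        # Repeated followers appear multiple times, so choice is frequency-weighted.
        out.append(rng.choice(followers))
    return out


corpus_tags = ["DT", "NN", "VBD", "DT", "NN", "VBD", "DT", "JJ"]
model = build_trigram_model(corpus_tags)
structure = generate_tags(model, ("DT", "NN"), length=5, seed=0)
```

Because followers are stored with repetition, common continuations are picked more often, which is roughly what a trigram frequency table gives you.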

Finally, run convert_vocab.py. This uses the mappings and frequency counts from both corpuses to pick the most commonly used words of each type in the second corpus and insert them into the structure created from the first corpus, making a mashup. Then, read your story and enjoy how nonsensical it is.
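As a rough illustration of the swap, the sketch below fills a tag sequence with words drawn from the most common words for each tag. The exact selection rule in convert_vocab.py may well differ; the function `fill_structure`, the `top_n` cutoff, and the frequency-weighted random pick are all assumptions for this example.

```python
import random
from collections import Counter


def fill_structure(tag_sequence, words_by_tag, top_n=3, seed=None):
    """Replace each tag with one of the `top_n` most common words for that tag.

    `words_by_tag` maps a tag to a Counter of words from the vocabulary corpus.
    Tags with no known words are left in place as visible placeholders.
    """
    rng = random.Random(seed)
    words = []
    for tag in tag_sequence:
        candidates = words_by_tag.get(tag)
        if not candidates:
            words.append(tag)
            continue
        choices, weights = zip(*candidates.most_common(top_n))
        # Weighted pick so the most frequent word is the most likely one.
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)


vocab = {
    "DT": Counter({"the": 5}),
    "NN": Counter({"justice": 3}),
}
sentence = fill_structure(["DT", "NN", "XX"], vocab, seed=0)
```

Leaving unknown tags visible makes it easy to spot where the two corpuses do not line up, which is the kind of mismatch described under Caveats.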

Caveats

To make your story more readable, avoid:

  1. Mashing up authors from different eras. The structure and most common parts of speech from different eras tend to be very different and you get some very odd sentences. The mapping tends not to work so well in those cases.

  2. Authors that use non-traditional words or non-traditional structures. Probably shouldn't have picked Joyce... At least we learned his structure and style is hard to emulate.

Sample Works (read in edit mode due to lack of word-wrap):

Republic structure w/Dubliners vocabulary

Dubliners structure w/Republic vocabulary

Ulysses structure w/ Pride and Prejudice vocabulary

Thanks and enjoy!

About

License:GNU General Public License v3.0


Languages

Language:Python 100.0%