kartikye / haikurnn

A project to generate haikus with recurrent neural networks while enforcing the 5-7-5 syllable structure

Haiku Generation with Deep Learning (Work in Progress)

Read a full description of the project here.

This project is an attempt to use deep learning to generate haikus that conform to the 5-7-5 syllable pattern. Much previous research into generating haikus doesn't enforce syllable counts[1][2], largely because modern English haikus often don't strictly conform to that pattern either, which makes strictly 5-7-5 training data hard to find. I get around this problem by providing the syllable count of each line as an input to the network, along with the text, at training time. Then, at generation time, I can choose how many syllables I want for each line. This project is still early, but so far I've gotten some promising results.
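As a rough illustration of what "syllable count plus text" could look like as a training sample, here is a minimal sketch. It is not taken from the repo's notebooks: the `SYLLABLES` lookup, `count_syllables`, and `make_sample` are hypothetical names, and the lookup only covers the words of the 5-7-5 sample output from this README. How the project actually counts syllables is not stated here.

```python
# Hypothetical syllable lookup covering only the words in this example.
# A real pipeline would need a pronunciation dictionary or a heuristic.
SYLLABLES = {
    "early": 2, "morning": 2, "sun": 1,
    "from": 1, "the": 1, "carried": 2, "garden": 2, "fate": 1,
    "stars": 1, "at": 1, "sunset": 2,
}


def count_syllables(line):
    """Sum the syllable counts of the words in one line."""
    return sum(SYLLABLES[word] for word in line.lower().split())


def make_sample(haiku_lines):
    """Pair a haiku's text with the syllable count of each of its lines."""
    counts = [count_syllables(line) for line in haiku_lines]
    text = "\n".join(haiku_lines)
    return counts, text


haiku = [
    "early morning sun",
    "from the carried garden fate",
    "stars at the sunset",
]
counts, text = make_sample(haiku)
print(counts)  # [5, 7, 5]
```

Because the counts are computed per sample, a poem that is not 5-7-5 is still a usable training example; it simply carries different numbers.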

Here is an example of 5-7-5 syllable output:

early morning sun
from the carried garden fate
stars at the sunset

And if I use the same network to get a 10-10-10 poem:

just as the street lamp spake the sun is bright
and the soul and the spring are blowing
with every beat of my heart i will love you

Model Version 1

The first version of the model is implemented in notebooks/models/v1.

[Figure: Model V1 diagram]

The model is essentially a character-to-character text generation network with a twist. The number of syllables for each line is provided to the network, passed through a dense layer and then added to the LSTM's internal state. This means that by changing the three numbers provided, we can alter the behavior of the network. My hope is that this lets the network learn "English" from the whole corpus, even though most of the samples are not 5-7-5 haiku, while still allowing us to generate haiku of that length later.
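The conditioning step can be sketched in plain Python, without any deep learning library. This is an illustrative sketch under stated assumptions, not the code in notebooks/models/v1: the sizes `HIDDEN` and `VOCAB`, the random untrained weights, and all function names are hypothetical. The README does not say which part of the LSTM state receives the projection or at which time step, so the sketch assumes it is added once, to both the initial hidden state and the initial cell state.

```python
import math
import random

HIDDEN = 8  # hypothetical hidden size
VOCAB = 5   # hypothetical number of distinct characters


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def dense(x, weights, bias):
    """Affine layer: weights @ x + bias."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, params):
    """One standard LSTM step on input x with state (h, c)."""
    xh = x + h  # concatenate the input with the previous hidden state
    i = [sigmoid(z) for z in dense(xh, *params["i"])]    # input gate
    f = [sigmoid(z) for z in dense(xh, *params["f"])]    # forget gate
    o = [sigmoid(z) for z in dense(xh, *params["o"])]    # output gate
    g = [math.tanh(z) for z in dense(xh, *params["g"])]  # candidate values
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def initial_state(syllables, proj):
    """Project the three syllable counts and add them to a zero state.

    Assumption: the projection is added to both h and c, once, at the start.
    """
    offset = dense([float(s) for s in syllables], *proj)
    h0 = [0.0 + v for v in offset]
    c0 = [0.0 + v for v in offset]
    return h0, c0


rng = random.Random(0)
params = {gate: (rand_matrix(HIDDEN, VOCAB + HIDDEN, rng), [0.0] * HIDDEN)
          for gate in "ifog"}
proj = (rand_matrix(HIDDEN, 3, rng), [0.0] * HIDDEN)

x = [1.0, 0.0, 0.0, 0.0, 0.0]  # one-hot vector for the first character

# Same weights and same input character, different syllable targets.
h_575, c_575 = lstm_step(x, *initial_state((5, 7, 5), proj), params)
h_10, c_10 = lstm_step(x, *initial_state((10, 10, 10), proj), params)

print(max(abs(a - b) for a, b in zip(h_575, h_10)))
```

The two hidden states differ even though the weights and the input character are identical, which is the mechanism that lets one network produce both 5-7-5 and 10-10-10 output.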

Repo

The notebooks directory contains the code organized into:

  • data: Jupyter notebooks for working with and preparing the data.
  • models: Jupyter notebooks and Python files implementing the different models.

The input directory contains the raw input data as well as haikus.csv, which contains the whole corpus, and sources.txt, which describes the sources used to build that corpus. Preprocess Haikus.ipynb constructs the corpus.

Languages

Jupyter Notebook 99.5%, Python 0.5%