journey0621 / headlines

Automatically generate headlines for short articles

This project attempts to reproduce the results in the paper: Generating News Headlines with Recurrent Neural Networks

It is assumed that you already have training and test data. The data consists of many examples (I'm using 684K examples). Each example consists of the text from the start of the article, which I call the description (or desc), and the text of the original headline (or head). The texts should already be tokenized, with the tokens separated by spaces.
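As a rough illustration of the expected data format, here is a minimal sketch that turns a raw headline and article into one example with space-separated tokens. The function names `tokenize` and `make_example`, the regex tokenizer, and the `max_desc_tokens` cutoff are assumptions made for this example; the project itself does not prescribe a tokenizer or a description length.

```python
import re

def tokenize(text):
    """Split text into word and punctuation tokens, joined by single spaces."""
    # Assumption: a simple regex tokenizer; any tokenizer that yields
    # space-separated tokens would fit the format described above.
    return " ".join(re.findall(r"\w+|[^\w\s]", text))

def make_example(headline, article, max_desc_tokens=50):
    """Build one example: 'head' is the headline, 'desc' is the article start."""
    head = tokenize(headline)
    # Keep only the first tokens of the article as the description.
    desc = " ".join(tokenize(article).split()[:max_desc_tokens])
    return {"head": head, "desc": desc}

example = make_example(
    "Stocks rally as rates hold",
    "Markets rose on Tuesday, after the central bank held rates steady.",
    max_desc_tokens=8,
)
print(example)
```

Punctuation becomes its own token here, so the comma after "Tuesday" ends up separated by spaces in the description.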

The vocabulary-embedding notebook describes how a dictionary is built for the tokens and how an initial embedding matrix is built from GloVe.

The train notebook describes how a model is trained on the data using Keras.
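The idea can be sketched as follows: count token frequencies, keep the most frequent tokens as the vocabulary, and fill each row of the embedding matrix with the matching GloVe vector when one exists. This is a simplified, dependency-free sketch and not the notebook's code. The helper names, the reserved `<empty>` and `<eos>` entries, and the random initialization for tokens missing from GloVe are all assumptions. A real version would likely use a NumPy array and read the vectors from a GloVe text file, where each line holds a word followed by its vector components.

```python
import random
from collections import Counter

def build_vocab(texts, vocab_size):
    """Map the most frequent tokens to integer indices."""
    counts = Counter(tok for text in texts for tok in text.split())
    # Assumption: indices 0 and 1 are reserved for padding and end-of-sequence.
    idx2word = ["<empty>", "<eos>"]
    idx2word += [w for w, _ in counts.most_common(vocab_size - 2)]
    word2idx = {w: i for i, w in enumerate(idx2word)}
    return word2idx, idx2word

def parse_glove(lines):
    """Parse GloVe text lines of the form 'word v1 v2 ... vN'."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

def init_embedding(idx2word, glove, dim, scale=0.1, seed=0):
    """Use the GloVe vector when available, otherwise a small random vector."""
    rng = random.Random(seed)
    matrix = []
    for word in idx2word:
        vec = glove.get(word)
        if vec is None:
            vec = [rng.uniform(-scale, scale) for _ in range(dim)]
        matrix.append(list(vec))
    return matrix

texts = ["the cat sat", "the dog sat down"]
word2idx, idx2word = build_vocab(texts, vocab_size=5)
# Tiny in-memory stand-in for a GloVe file (made-up 3-dimensional vectors).
glove = parse_glove(["the 0.1 0.2 0.3", "sat 0.4 0.5 0.6"])
embedding = init_embedding(idx2word, glove, dim=3)
```

The resulting matrix has one row per vocabulary entry and can serve as the initial weights of an embedding layer.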

The predict notebook generates headlines with the trained model and shows the attention weights used to pick words from the description. The text generation includes a feature that was not described in the original paper: it allows words that are outside the training vocabulary to be copied from the description into the generated headline.
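One way such copying could work is sketched below: whenever the decoder emits an out-of-vocabulary placeholder, substitute the description token that received the highest attention weight at that step. This only illustrates the general idea and may differ from what the predict notebook actually does. The function name `copy_oov_words`, the `<unk>` placeholder, and the hand-written attention weights are all assumptions.

```python
def copy_oov_words(generated, desc_tokens, attention, unk="<unk>"):
    """Replace each placeholder with the most-attended description token.

    generated:   list of output tokens, one per decoding step
    desc_tokens: list of description tokens
    attention:   one list of weights per decoding step, aligned with desc_tokens
    """
    headline = []
    for token, weights in zip(generated, attention):
        if token == unk:
            # Index of the description token with the largest weight.
            best = max(range(len(desc_tokens)), key=lambda i: weights[i])
            token = desc_tokens[best]
        headline.append(token)
    return headline

desc = "zyxcorp shares jump after earnings".split()
generated = ["<unk>", "shares", "jump"]
# Made-up attention weights: one row per generated token.
attention = [
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.10, 0.70, 0.10, 0.05, 0.05],
    [0.10, 0.10, 0.70, 0.05, 0.05],
]
headline = copy_oov_words(generated, desc, attention)
print(" ".join(headline))
```

In this toy case the rare word "zyxcorp" is not in the vocabulary, so it is recovered from the description rather than generated.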

Examples of headlines generated

Good (cherry-picked) examples of generated headlines:

[Images: cherry-picked generated headlines]

Examples of attention weights

[Image: attention weights]

About

License: MIT License

Languages

Language: Jupyter Notebook 100.0%