amitvpatel06 / Twitter-Deep-Learning

An NLP model that uses deep learning to analyze tweet sentiment

Twitter-Deep-Learning

3 NLP models that use deep learning to analyze tweet sentiment!

I wrote 3 deep learning models (for comparison and ensembling) and trained them all on the same dataset: Sentiment140's set of over 1.5 million tweets. However, they are set up so that they can easily be applied to any text classification task!

Model Details:

The first is a simple RNN (Recurrent Neural Network) that reads a fixed number of words from each tweet and then outputs a sentiment probability vector. (I kept the number of steps uniform so that I could batch the computations easily.)
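Keeping the number of steps uniform usually means truncating long tweets and padding short ones. The snippet below is a minimal sketch of that idea, not the repository's own code: `PAD_ID`, `pad_or_truncate`, and `make_batch` are hypothetical names, and it assumes words have already been mapped to integer ids.

```python
PAD_ID = 0  # hypothetical id reserved for padding; not taken from the repo


def pad_or_truncate(token_ids, num_steps, pad_id=PAD_ID):
    """Return exactly num_steps ids: cut long tweets, right-pad short ones."""
    clipped = list(token_ids[:num_steps])
    return clipped + [pad_id] * (num_steps - len(clipped))


def make_batch(tweets, num_steps):
    """Turn variable-length tweets into equal-length rows that can be batched."""
    return [pad_or_truncate(tweet, num_steps) for tweet in tweets]


batch = make_batch([[5, 8, 2], [7, 1, 4, 9, 3, 6]], num_steps=4)
print(batch)  # [[5, 8, 2, 0], [7, 1, 4, 9]]
```

Because every row has the same length, the whole batch can be processed one time step at a time.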

The second is a bidirectional LSTM RNN: it uses long short-term memory cells and makes both a forward pass and a reverse pass over the input word vectors.
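To illustrate the two passes, here is a toy sketch in plain Python. It swaps the LSTM cell for a much simpler scalar tanh recurrence so the example stays short, so it shows only the bidirectional wiring, not the gating of a real LSTM. The function names and weight values are made up for illustration.

```python
import math


def rnn_pass(vectors, weight_in, weight_state):
    """Run a toy tanh recurrence and return the hidden state at every step.

    This is a stand-in for an LSTM cell: one scalar state, no gates.
    """
    state = 0.0
    states = []
    for vec in vectors:
        state = math.tanh(weight_in * sum(vec) + weight_state * state)
        states.append(state)
    return states


def bidirectional(vectors, fwd=(0.5, 0.3), bwd=(0.4, 0.2)):
    """Pair each word's forward state with its backward state."""
    forward = rnn_pass(vectors, *fwd)
    # Read the tweet right to left, then flip the states back into word order.
    backward = rnn_pass(vectors[::-1], *bwd)[::-1]
    return list(zip(forward, backward))


word_vectors = [[1.0, 0.0], [0.0, -1.0], [0.5, 0.5]]
states = bidirectional(word_vectors)
print(len(states))  # 3, one (forward, backward) pair per word
```

Each word ends up with one state that has seen everything to its left and one that has seen everything to its right, which is the point of reading the tweet in both directions.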

The last is a CNN (Convolutional Neural Network) that applies principles from image processing to a tweet's two-dimensional sentence matrix (each row is a word's vector!). This is based on Yoon Kim's paper: http://arxiv.org/abs/1408.5882.
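The core operation in that style of CNN is a filter that spans a few consecutive word rows, followed by max-over-time pooling. Below is a minimal plain-Python sketch of one such filter. It assumes a ReLU nonlinearity and no bias term, uses hand-picked toy values, and is not the repository's implementation.

```python
def conv_max_over_time(sentence, filt):
    """Slide a filter covering len(filt) words down the sentence matrix,
    apply ReLU to each response, and keep the largest one."""
    height = len(filt)
    activations = []
    for start in range(len(sentence) - height + 1):
        window = sentence[start:start + height]
        # Element-wise product of the filter and the window, summed up.
        total = sum(
            w * x
            for filt_row, word_row in zip(filt, window)
            for w, x in zip(filt_row, word_row)
        )
        activations.append(max(0.0, total))  # ReLU (an assumption here)
    return max(activations)  # max-over-time pooling


# Toy 4-word tweet with 2-dimensional word vectors (one row per word).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]  # spans two words at a time
feature = conv_max_over_time(sentence, bigram_filter)
print(feature)  # 2.0
```

Each filter yields a single number per tweet regardless of tweet length. A real model would use many filters of several heights and feed the pooled features to a classifier. The sketch assumes the tweet has at least as many words as the filter is tall.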

Usage:

Using these models is easy! To apply them to your own dataset, write your own code for parsing your input files and feed the result to my dataset constructor (see the utils folder for more information!). You can set the model hyperparameters using the config class, and you can set the pooling and filter layers (for the CNN) in the Filters class. I have also left sample files for my dataset and the pretrained word vectors, so you can see what the input files look like when reading through my parsing code.
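As a rough idea of what such parsing code might look like, here is a sketch that reads rows in the Sentiment140 CSV layout (polarity first, tweet text last, with 0 meaning negative and 4 meaning positive) and produces (tokens, label) pairs. The `parse_rows` helper, the whitespace tokenizer, and the pair format are assumptions for illustration; check the utils folder for what the dataset constructor actually expects. The two sample rows are invented.

```python
import csv
import io

# Sentiment140 polarity codes: 0 = negative, 4 = positive.
POLARITY_TO_LABEL = {"0": 0, "4": 1}


def parse_rows(handle):
    """Read CSV rows and return a list of (tokens, label) pairs."""
    examples = []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        polarity, text = row[0], row[-1]
        if polarity not in POLARITY_TO_LABEL:
            continue  # skip neutral or malformed rows
        tokens = text.lower().split()  # naive whitespace tokenizer
        examples.append((tokens, POLARITY_TO_LABEL[polarity]))
    return examples


# Two made-up rows in the six-column layout, standing in for a real file.
sample = io.StringIO(
    '"0","1","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","user_a","so sad today"\n'
    '"4","2","Mon Apr 06 22:19:49 PDT 2009","NO_QUERY","user_b","Loving this weather"\n'
)
examples = parse_rows(sample)
print(examples)
```

For a different dataset, only the column positions and the label mapping should need to change.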

The LSTM performs best when given extra layers and trained on the full training set, reaching 91% accuracy. For detailed results, see: http://www.amitpatel.me/post/3

About

License: MIT License


Languages

Language: Python 100.0%