data-science deep-learning gradio keras lstm machine-learning natural-language-processing neural-network nlp nlp-machine-learning prediction python sentiment-analysis tweet twitter

Disaster Tweet Prediction

Twitter has become an important communication channel in times of emergency. The ubiquity of smartphones enables people to announce an emergency they’re observing in real time. Because of this, more agencies, such as disaster relief organizations and news agencies, are interested in programmatically monitoring Twitter. In this task, I predict whether a given tweet is about a real disaster or not: 1 if it is, 0 if it is not.

Installation

Downloading the Data

Clone this repository to your computer
Navigate to the project directory from your terminal: cd twitter-sentiment-analysis
Run mkdir inputs
Use cd inputs to go into the directory where the data should be stored
Download the data files from Kaggle
- Data can be found here
- If you don't have a Kaggle account, you will need to create one
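Before training, it can help to confirm that the downloaded files actually landed in the inputs directory. The following is a minimal sketch of such a check, not part of this repository. The file names train.csv and test.csv and the helper missing_data_files are assumptions made for illustration; adjust them to match what the Kaggle download actually contains.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed file names (not confirmed by this README); edit to match the download.
EXPECTED_FILES = ("train.csv", "test.csv")


def missing_data_files(
    inputs_dir: str = "inputs",
    expected: Iterable[str] = EXPECTED_FILES,
) -> List[str]:
    """Return the expected file names that are not present in inputs_dir."""
    root = Path(inputs_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Called from the project directory, missing_data_files() returns an empty list when every expected file is in place, and otherwise lists the names that still need to be downloaded.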

Installing the requirements

Install the requirements using pip install -r requirements
- The project uses Python 3.8
- Using a virtual environment is recommended

Usage

Navigate to the src directory in the project folder using cd src
- Then run python train.py
- This trains an LSTM and creates a directory called PRETRAIN_WORD2VEC_LSTM inside the models directory, containing the serialized LSTM and tokenizer.
- Once you've trained the model, you can try your own examples by running the user_interface.py script in the top-level directory. This will provide you with a private link. Open it and enter some text to see whether it is predicted to be about a disaster or not.
View all explorations in the notebook directory
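For readers curious how a prediction function can be exposed through Gradio, here is a rough sketch. It is not the contents of user_interface.py. The names make_predict_fn, build_interface and keyword_score are hypothetical, and keyword_score is a stand-in scorer used only so the example runs without the trained model; in practice the scorer would wrap the saved tokenizer and LSTM.

```python
from typing import Callable, Dict


def make_predict_fn(
    score_tweet: Callable[[str], float],
) -> Callable[[str], Dict[str, float]]:
    """Wrap a scorer returning P(disaster) in [0, 1] for display as a label."""

    def predict(text: str) -> Dict[str, float]:
        p = float(score_tweet(text))
        # Gradio's "label" output accepts a dict of class name -> confidence.
        return {"disaster (1)": p, "not a disaster (0)": 1.0 - p}

    return predict


def build_interface(score_tweet: Callable[[str], float]):
    """Build a text-in, label-out Gradio interface around the scorer."""
    import gradio as gr  # imported lazily so make_predict_fn works without Gradio

    return gr.Interface(
        fn=make_predict_fn(score_tweet),
        inputs="text",
        outputs="label",
    )


def keyword_score(text: str) -> float:
    """Stand-in scorer (NOT the trained LSTM): flags a few disaster keywords."""
    keywords = ("earthquake", "wildfire", "flood")
    return 0.9 if any(k in text.lower() for k in keywords) else 0.1
```

With Gradio installed, build_interface(keyword_score).launch() starts a local app and prints a link to open in the browser. Swapping keyword_score for a function that tokenizes the text and calls the trained model would give predictions from the LSTM instead.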

Extending This Work

Some ideas to extend this work:

Explore methods to reduce inference time
Use different word embeddings
Try LSTM with attention (See Attention in Long Short-Term Memory Recurrent Neural Networks)
Use a transformer model
Correct misspelled words
Deal with overfitting
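As one illustration of the misspelling idea above, the sketch below replaces out-of-vocabulary words with their closest match from a known vocabulary, using the standard library's difflib. This is only one possible approach and is not implemented in this project. The function correct_tweet, the example vocabulary, and the 0.8 similarity cutoff are assumptions chosen for illustration.

```python
import difflib
from typing import Iterable


def correct_tweet(text: str, vocabulary: Iterable[str], cutoff: float = 0.8) -> str:
    """Replace unknown words with the closest vocabulary word, if one is close enough."""
    vocab = sorted(set(vocabulary))
    known = set(vocab)
    corrected = []
    for word in text.lower().split():
        if word in known:
            corrected.append(word)
            continue
        # get_close_matches returns at most n candidates scoring >= cutoff.
        matches = difflib.get_close_matches(word, vocab, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


vocabulary = ["huge", "earthquake", "near", "the", "coast", "flood"]
print(correct_tweet("Huge eartquake near the coast", vocabulary))
```

A natural choice for the vocabulary would be the words known to the embedding or tokenizer, so that corrected words map onto vectors the model has already seen. Words with no sufficiently close match are left unchanged.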

Write Ups about This Project

About

Creating a Gradio user interface to predict the sentiment of a tweet

Languages

Jupyter Notebook 99.8%, Python 0.2%