Part 1 Sentiment Analysis with a Common Twitter Dataset
- using a Twitter dataset of just under 45,000 tweets related to COVID-19, taken from a fairly recent Kaggle competition.
- training classifiers to predict whether a tweet is positive, negative, or neutral, based only on the text of the tweet itself.
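
As a rough illustration of the three-class setup, here is a minimal sketch of one possible text classifier: a multinomial Naive Bayes model written with the Python standard library only. The choice of Naive Bayes, the `tokenize` helper, the `NaiveBayes` class, and the toy tweets are all assumptions made for this example; they are not the classifiers or the data used in this project.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy examples, not rows from the Kaggle dataset.
train_texts = [
    "so grateful for the amazing hospital staff",
    "great news the vaccine trial looks promising",
    "this lockdown is terrible and I hate it",
    "awful panic buying empty shelves everywhere",
    "the store opens at nine tomorrow",
    "press briefing scheduled for this afternoon",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("amazing news, so grateful")
```

On real data the same fit/predict pattern would be applied to a train/test split of the labelled tweets, with accuracy or per-class F1 measured on the held-out portion.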
Part 2 NLP with the Twitter API: Next Word Prediction
- data extraction: tweets containing the word "lockdown"; a total of 100,000 tweets are used for analysis
- data pre-processing: tokenization, removal of special characters (URLs, usernames), lowercasing, lemmatization
- building the machine learning model: an LSTM implemented using Keras.
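
The pre-processing steps and the way cleaned tweets can be turned into next-word training examples might look roughly like the sketch below. It uses only the standard library. The function names (`clean_tweet`, `next_word_pairs`, `build_vocab`, `encode`), the context size of 3, the sample tweet, and the decision to drop `#` and punctuation are assumptions for illustration. Lemmatization is left out because it needs an external library (for example NLTK's `WordNetLemmatizer`).

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_tweet(text):
    """Strip URLs and usernames, lowercase, and split into tokens."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text.lower())
    return text.split()


def next_word_pairs(tokens, context_size=3):
    """Return (context, next_word) pairs with up to context_size prior tokens."""
    pairs = []
    for i in range(1, len(tokens)):
        context = tokens[max(0, i - context_size):i]
        pairs.append((context, tokens[i]))
    return pairs


def build_vocab(token_lists):
    """Map each word to an integer id; 0 is reserved for padding."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def encode(context, vocab, context_size=3):
    """Left-pad a context with zeros and convert its words to ids."""
    ids = [vocab[w] for w in context]
    return [0] * (context_size - len(ids)) + ids


# Hypothetical sample tweet, not taken from the collected data.
tweet = "@user Lockdown extended again! Details: https://example.com #StayHome"
tokens = clean_tweet(tweet)
pairs = next_word_pairs(tokens)
vocab = build_vocab([tokens])
encoded = [(encode(ctx, vocab), vocab[word]) for ctx, word in pairs]
```

Fixed-length integer sequences paired with a target word id, as in `encoded`, are the kind of input commonly fed to an embedding layer followed by an LSTM and a softmax over the vocabulary.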