aws-ec2 aws-lambda bert bert-fine-tuning data-analysis deep-learning dvc gradio hugginface machine-learning natural-language-processing nlp preprocessing pytorch tokenizer transformers

Tweet Positivity Analyzer

Twitter is a social media network on which users post and interact with messages known as "tweets". It allows a user to post, like, and retweet tweets.

Twitter is also known for the negativity and harsh criticism that many of its users post. With that in mind, this application classifies a tweet according to its positivity.

⭐ Solution

The application uses a pre-trained BERT model fine-tuned on the Coronavirus tweets NLP dataset. This dataset contains 48,000 tweets classified into the following five categories:

Extremely negative
Negative
Neutral
Positive
Extremely positive
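
To illustrate how a five-class classifier output maps onto these categories, here is a minimal sketch in plain Python. It assumes the model emits one raw score (logit) per category and that the label order matches the list above. Both are assumptions, since the actual order depends on how the labels were encoded during fine-tuning. The names `LABELS`, `softmax`, and `classify` are hypothetical and not taken from the repository.

```python
import math

# Assumed label order; the real order depends on how the dataset
# labels were encoded during fine-tuning.
LABELS = [
    "Extremely negative",
    "Negative",
    "Neutral",
    "Positive",
    "Extremely positive",
]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return the most likely label and its probability."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Hypothetical logits, for illustration only.
label, confidence = classify([-1.2, 0.3, 0.1, 2.4, 0.9])
print(label, round(confidence, 3))
```

In the real application the logits would come from the fine-tuned BERT model rather than a hard-coded list.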

The application is currently deployed on AWS and is accessible at this URL.

🎥 Deployment

The following diagram represents the CD pipeline, which currently runs on GitHub Actions.

The following diagram, in turn, represents a typical inference workflow. The user enters the tweet's URL into the Gradio frontend, the tweet's text is scraped, and the Lambda function is invoked.
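
The workflow above can be sketched roughly as follows. This is not the repository's code: the URL pattern, the `{"text": ...}` request shape, the response shape, and the `fake_classifier` stand-in are all assumptions made for illustration. Only the `(event, context)` handler signature follows the standard AWS Lambda convention for Python.

```python
import json
import re

# Assumed shape of a tweet URL: .../<user>/status/<numeric id>
TWEET_URL = re.compile(
    r"^https?://(?:www\.)?(?:twitter|x)\.com/[^/]+/status/(\d+)"
)


def tweet_id_from_url(url):
    """Extract the numeric tweet ID that the frontend would use for scraping."""
    match = TWEET_URL.match(url.strip())
    if match is None:
        raise ValueError(f"not a tweet URL: {url!r}")
    return match.group(1)


def fake_classifier(text):
    """Placeholder for the fine-tuned model; always answers 'Neutral'."""
    return "Neutral"


def lambda_handler(event, context, classify=fake_classifier):
    """Hypothetical backend entry point.

    Expects a JSON body such as {"text": "..."} (assumed format) and
    returns the predicted label as JSON.
    """
    body = json.loads(event["body"])
    text = body.get("text", "").strip()
    if not text:
        return {"statusCode": 400, "body": json.dumps({"error": "empty tweet text"})}
    return {"statusCode": 200, "body": json.dumps({"label": classify(text)})}


# Simulate one request locally, without AWS.
event = {"body": json.dumps({"text": "Stay safe everyone!"})}
response = lambda_handler(event, None)
print(response["statusCode"], response["body"])
```

Passing the classifier in as a parameter keeps the handler testable locally. A deployed function would load the model once at import time instead.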

🚗 Roadmap

Implement a CI pipeline
Implement a CD pipeline
Implement a CT pipeline
Deploy the backend on AWS Lambda
Deploy the Gradio app on an EC2 instance
Implement a monitoring solution to detect data drift

About

Application to analyze a tweet's positivity using deep learning.


MIT License

Languages

Jupyter Notebook 66.5%, Python 28.9%, HCL 2.9%, Dockerfile 1.7%