Arsh2k01/UTrack

tweepy streamlit deep-learning nltk-library seaborn python bert-model natual-language-processing sentiment-analysis regex mental-health mental-health-awareness tensorflow social-media twitter-sentiment-analysis twitter-api

A project under the Consulting and Analytics Club, IITG.

1. Technologies Used

Tweepy API
NLTK
BERT Model
TensorFlow
Seaborn
Streamlit

2. Project Description

2.1 Data Extraction and Preprocessing

We collected tweets for each illness with Tweepy, a Python client for the Twitter API, searching on keywords and phrases for each category. We also collected tweets that did not contain these keywords; these served as the ‘neutral’ data. The data was cleaned using regular expressions and NLTK: links, emojis, emoticons, and symbols were removed.
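The cleaning step described above can be sketched roughly as follows. This is a minimal illustration using only Python's standard `re` module, not the project's actual `Cleaning Tweets.py`; the patterns and the `clean_tweet` name are assumptions, and any NLTK-based steps (for example stopword removal) are left out.

```python
import re

# Hypothetical patterns: the project's real cleaning rules are not shown in this README.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Rough match for common ASCII emoticons such as :) ;-( =D, only when not glued to a word.
EMOTICON_RE = re.compile(r"(?<!\w)[:;=8][\-o\*']?[\)\(\]\[dDpP/\\|](?!\w)")
# Anything that is not a letter, digit, whitespace, or apostrophe (covers emojis and symbols).
NON_TEXT_RE = re.compile(r"[^A-Za-z0-9\s']")


def clean_tweet(text: str) -> str:
    """Remove links, emoticons, emojis, and symbols, then lowercase the text."""
    text = URL_RE.sub(" ", text)        # links first, so their ':' and '/' are not seen as emoticons
    text = EMOTICON_RE.sub(" ", text)
    text = NON_TEXT_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip().lower()


if __name__ == "__main__":
    print(clean_tweet("Feeling so alone today :( https://t.co/abc123 😔 #lonely"))
```

Removing links before emoticons keeps URL punctuation from being misread as emoticons. Apostrophes are kept here so that contractions such as "can't" survive, which is a design choice of this sketch.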

2.2 DL Model

We explored Transformer models and found BERT (Bidirectional Encoder Representations from Transformers) to be better suited to sentiment analysis. We fine-tuned a pretrained BERT model on our training data, training a separate model for each class (Anxiety, Loneliness, and Stress).
The output of the final layer was not passed through an activation function; instead, it was given as input to a custom function that normalizes and standardizes the scores. The function is given below:
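The project's own function is not reproduced in this text. The sketch below is therefore not that function; it is only a hypothetical illustration of what "standardize, then normalize" could look like for raw final-layer outputs. It assumes a z-score step followed by clipping and linear rescaling into [0, 1]. The name `standardize_and_normalize` and the `clip` parameter are invented for this example.

```python
from statistics import mean, pstdev


def standardize_and_normalize(raw_scores, clip=3.0):
    """Hypothetical helper, NOT the project's function.

    1. Standardize: convert each raw output to a z-score.
    2. Normalize: clip z-scores to [-clip, clip] and rescale linearly to [0, 1].
    """
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores) or 1.0  # fall back to 1.0 when all scores are identical
    normalized = []
    for score in raw_scores:
        z = (score - mu) / sigma
        z = max(-clip, min(clip, z))              # limit the influence of outliers
        normalized.append((z + clip) / (2 * clip))  # map [-clip, clip] onto [0, 1]
    return normalized


if __name__ == "__main__":
    print(standardize_and_normalize([-2.0, 0.0, 2.0]))
```

With this scheme a score equal to the batch mean maps to 0.5, and scores more than `clip` standard deviations from the mean saturate at 0 or 1.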

2.3 Visualisation and Deployment

We used Seaborn to plot the calculated levels of Loneliness, Stress, and Anxiety for each user over time, which shows how the user's mental state varied. We also compute a weighted average of the scores for each category over the user's previous tweets, on a scale from 0 (low) to 1 (high). Each individual tweet and its scores can be viewed as well. Deployment was done using Streamlit.
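The README does not say how the weights in the weighted average are chosen. As a minimal sketch, assuming that more recent tweets should count more, one option is exponentially decaying weights. The `weighted_average` name and the `decay` parameter are hypothetical.

```python
def weighted_average(scores, decay=0.8):
    """Weighted average of per-tweet scores in [0, 1], ordered oldest first.

    Assumption of this sketch: the newest tweet gets weight 1 and each older
    tweet is down-weighted by a further factor of `decay`.
    """
    if not scores:
        raise ValueError("need at least one score")
    n = len(scores)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


if __name__ == "__main__":
    # Scores for one category, oldest tweet first.
    print(weighted_average([0.2, 0.4, 0.9]))
```

Because the result is a convex combination of the inputs, it stays within the same 0 to 1 range as the per-tweet scores.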

3. Files

Cleaning Tweets.py - Script to clean scraped tweets
Extracting Targeted Tweets.py - Script to scrape a user's Twitter information
Streamlit Deployment.py - Script to deploy the project
Streamlit Deployment.ipynb - Jupyter Notebook to deploy the project
Extracted Tweets - Training Data
Training Models:
- Anxiety Model.py
- Lonely Model.py
- Stress Model.py

4. Usage

To use UTrack, first add this folder to your Google Drive.
Then run Streamlit Deployment.ipynb on Google Colab and click the ngrok link that the notebook produces.

Once the app has opened in your browser, use the following video as a reference:

5. Team

6. References

7. License

MIT

About

UTrack analyses the user's tweets and finds the level of Loneliness, Stress, and Anxiety, and their trends over time
