Python project
Created a Python application that classifies tweets as racist/sexist or not.
- Gathering and cleaning: Scraped tweets with the tweepy library in Python, which communicates with the Twitter API, and built functions to clean, parse, and tokenize the tweets.
- Visualization and extraction: Visualized the cleaned data as a word cloud and extracted features from it using natural language processing.
- Classification: Built a Naïve Bayes classifier with the scikit-learn library in Python to detect the nature of tweets from the extracted features.
*For training, used the Analytics Vidhya Twitter dataset, which contains 31,962 labeled tweets. Also built a classifier using linear SVC.
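
The clean, extract, and classify steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `clean_tweet` helper, the bag-of-words features from `CountVectorizer`, and the six sample tweets are all assumptions made for the example (the project description does not say which features were extracted, and the sample text is made up and deliberately mild rather than taken from the Analytics Vidhya dataset). Scraping with tweepy is left out because it needs Twitter API credentials.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def clean_tweet(text):
    """Hypothetical cleaner: lowercase, then strip URLs, @mentions, and non-letters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop @mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters only (also removes '#')
    return " ".join(text.split())              # collapse repeated whitespace


# Made-up sample for illustration only: 1 = flagged as hostile, 0 = not flagged.
tweets = [
    "@user I hate those people, they ruin everything!!",
    "those people are awful and should go away http://example.com/x",
    "I hate them, awful awful people",
    "What a lovely sunny day at the beach #happy",
    "@user thanks for the great coffee this morning",
    "Lovely dinner with great friends tonight",
]
labels = [1, 1, 1, 0, 0, 0]

# Each pipeline cleans the raw tweet, builds word-count features, then classifies.
nb_model = make_pipeline(CountVectorizer(preprocessor=clean_tweet), MultinomialNB())
svc_model = make_pipeline(CountVectorizer(preprocessor=clean_tweet), LinearSVC())

nb_model.fit(tweets, labels)
svc_model.fit(tweets, labels)

new_tweets = ["such a lovely day with great friends", "awful people, I hate them"]
print("Naive Bayes:", nb_model.predict(new_tweets))
print("Linear SVC: ", svc_model.predict(new_tweets))
```

Passing the cleaner as `preprocessor` keeps cleaning and feature extraction inside one pipeline, so the same steps run at training and prediction time. With a real labeled dataset, the two models would normally be compared on a held-out split rather than on a handful of hand-written examples.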