Cyberbullying-Detection-using-Machine-Learning

Cyberbullying Detection: Identifying Hate Speech using Machine Learning

Description

Bullying has been prevalent since the beginning of time; only its forms have changed over the years, from physical bullying to cyberbullying. Due to the massive rise of user-generated web content, particularly on social media networks, the amount of hate speech is steadily increasing. Hate speech online has been linked to a global increase in violence toward minorities, including mass shootings, lynchings, and ethnic cleansing.

This project presents a systematic review of published research on cyberbullying detection approaches and examines methods for detecting hate speech on social media while distinguishing it from general profanity. It also presents a comparative study of several supervised learning algorithms, covering both standard and ensemble methods.

Dataset

Tweets Dataset for Detection of Cyber-Trolls, obtained from DataTurks.
Data cleaning, preprocessing (word tokenization, stemming, TF-IDF), and resampling were performed before applying any of the machine learning algorithms.
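
The preprocessing steps named above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's actual notebook code: the sample tweets and labels are made up, `crude_stem` is a toy stand-in for a real stemmer such as Porter's, the TF-IDF weighting uses the plain `tf * log(N / df)` form, and random oversampling is only one possible reading of "resampling", since the README does not say which strategy was used.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Toy suffix stripper; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def oversample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_label = {}
    for row, label in zip(rows, labels):
        by_label.setdefault(label, []).append(row)
    target = max(len(group) for group in by_label.values())
    out_rows, out_labels = [], []
    for label, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Hypothetical sample data: 1 = troll/bullying tweet, 0 = benign tweet.
tweets = [
    "You are so stupid, stop posting",
    "Thanks for sharing this",
    "Had a great time hiking today",
    "Loved the hiking photos",
]
labels = [1, 0, 0, 0]

docs = [[crude_stem(tok) for tok in tokenize(tweet)] for tweet in tweets]
vectors = tfidf(docs)
balanced_vectors, balanced_labels = oversample(vectors, labels)
```

In a real experiment, resampling would normally be applied to the training split only, so that duplicated rows do not leak into the evaluation data.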

Methods Used

Gaussian Naive Bayes
Logistic Regression
Decision Tree Classifier
AdaBoost Classifier
Random Forest Classifier
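
A comparison of these five classifiers could look roughly like the sketch below, which assumes scikit-learn is installed. It is not the project's evaluation code: the feature matrix is synthetic (generated with `make_classification` as a stand-in for the TF-IDF features), and the 5-fold cross-validation, the F1 metric, and all hyperparameters are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a dense TF-IDF feature matrix and binary labels.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, random_state=0
)

# The five methods listed above, with illustrative (assumed) settings.
models = {
    "Gaussian Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Score every model the same way so the numbers are comparable.
scores = {}
for name, model in models.items():
    fold_scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    scores[name] = fold_scores.mean()
    print(f"{name}: mean F1 = {scores[name]:.3f}")
```

Note that `GaussianNB` expects a dense array, so a sparse TF-IDF matrix would need to be converted (for example with `.toarray()`) before fitting it.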

Result

The evaluation shows that ensemble supervised methods have the potential to outperform traditional supervised methods. A number of directions for future work are also discussed.

Documentation

Project Report

Authors

Kirtik Singh

Prakhar Bhasin

Dev Kathuria

Ishank Nijhawan
