mikasenghaas / twitter-hatespeech-detection

Bag-of-words tweet hate-speech detection

FYP 04: Natural Language Processing

Sentiment Analysis of Tweet Data using Machine Learning in NLP


Group 9: Aidan Stocks, Hugo Reinicke, Nicola Clark, Jonas-Mika Senghaas

Project Description

In this project, you will learn how to work with natural language data. You will learn...

  • ...what makes natural language different from other types of data,

  • ...how to prepare text data for automatic processing,

  • ...how to annotate data for supervised classification, and

  • ...how to train and run a classifier for a basic NLP task.
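As a rough illustration of the text-preparation step, here is a minimal bag-of-words sketch using only the Python standard library. It is not taken from the project notebooks: the helper names `tokenize`, `build_vocabulary` and `bag_of_words`, the cleaning rules (dropping URLs and @mentions) and the two toy tweets are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(tweet):
    """Lowercase a tweet and split it into word tokens.

    Assumed cleaning rules: URLs and @mentions are dropped,
    and the leading '#' of a hashtag is stripped.
    """
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    return re.findall(r"[a-z0-9']+", text)     # keep letters, digits, apostrophes


def build_vocabulary(tweets):
    """Map every token seen in the corpus to a fixed column index."""
    tokens = sorted({tok for tweet in tweets for tok in tokenize(tweet)})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(tweet, vocabulary):
    """Return a count vector for one tweet; unknown tokens are ignored."""
    counts = Counter(tokenize(tweet))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Hypothetical toy corpus, not taken from the project's dataset.
tweets = [
    "I love this community! #grateful",
    "@user I hate hate this https://example.com",
]
vocab = build_vocabulary(tweets)
vectors = [bag_of_words(t, vocab) for t in tweets]
```

Each tweet becomes a fixed-length vector of token counts, and word order is discarded. Vectors like these, paired with annotated labels, could then be passed to a supervised classifier.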

Background and Motivation

Social media is omnipresent in today's world. We use messengers to communicate and to share pictures, music and thoughts (in short, our lives) on the internet with people who are close to us, and perhaps with some who are not. Twitter is one of those social networks. The American social networking service allows its users to post and interact with short messages known as tweets: postings of up to 280 characters that can be liked, commented on, threaded and shared. Since its launch in 2006, Twitter has grown massively and now reports hundreds of millions of users. Besides its many other uses, Twitter is especially known as a platform for political discussion. Both politicians and members of the public use Twitter as a channel to take positions in political debates and express opinions. While this is desirable and embraces the idea of free speech on the internet, the question of whether Twitter should use tools to automatically detect unwanted content on its platform, such as racism, sexism, false information or hate speech, is a subject of ongoing public debate. For now, this project sets aside the ethical challenges and questions that arise and focuses solely on the technical details of how such a solution might work. The goal of this project is to optimise a machine learning model to automatically detect unwanted content.

License: MIT License

Languages

Jupyter Notebook 100.0%