prashant-kikani / toxic-comment-classifier

To classify toxic and abusive comments in large volumes of text.


toxic-comment-classifier

This project classifies toxic and abusive comments in large volumes of text.
The model is trained on these 6 types of toxic comments:

  1. toxic
  2. severe_toxic
  3. obscene
  4. threat
  5. insult
  6. identity_hate
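A minimal multi-label setup for these six classes could look like the sketch below, using scikit-learn's `OneVsRestClassifier` over TF-IDF features. The toy comments and labels here are invented for illustration, and the actual model in the notebook may differ:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Toy training data; real training uses the Kaggle dataset linked below.
comments = [
    "you are an idiot",
    "i will hurt you",
    "what the heck is this garbage",
    "go back to your country",
    "you absolute moron",
    "have a nice day",
]
# One binary indicator per label; a row can have several 1s (multi-label).
y = np.array([
    [1, 0, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
])

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(comments)

# One binary logistic-regression classifier per label.
clf = OneVsRestClassifier(LogisticRegression())
clf.fit(X, y)

# predict_proba gives one toxicity probability per class for each comment.
probs = clf.predict_proba(vectorizer.transform(["you idiot"]))
print(dict(zip(LABELS, probs[0].round(3))))
```

Because each label gets its own classifier, the six probabilities are independent and do not need to sum to 1, which is exactly what a multi-label problem calls for.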

It predicts the probability of toxicity for each of the classes defined above.
A comment may be classified into one or more classes. For example, a comment may be "insulting" to someone but not "threatening".

Download the train/test data from here: https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/data

Data cleaning is done before training and testing: only alphabetic letters and some punctuation marks are kept, with no numbers or other symbols.
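That cleaning step can be sketched with a regular expression that keeps only letters, whitespace, and a few punctuation marks. The exact pattern below is my guess at the rule; the notebook's actual preprocessing may differ:

```python
import re

def clean_comment(text):
    # Lowercase, then keep only ASCII letters, whitespace, and basic punctuation.
    text = text.lower()
    text = re.sub(r"[^a-z\s.,!?']", " ", text)
    # Collapse runs of whitespace left behind by the removed characters.
    return re.sub(r"\s+", " ", text).strip()

print(clean_comment("You're SO dumb!!! 123 @#$%"))  # -> you're so dumb!!!
```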

This task can be applied to any given chunk of text, and the highest-probability class can be picked as the dominant type of toxicity.
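Picking the dominant type from the per-class probabilities is then a simple argmax. The helper below and its 0.5 threshold are illustrative assumptions, not code from the notebook:

```python
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def dominant_toxicity(probs, threshold=0.5):
    """Return the most probable toxicity type, or None if no class crosses the threshold."""
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best] if probs[best] >= threshold else None

print(dominant_toxicity([0.9, 0.1, 0.2, 0.05, 0.7, 0.02]))  # -> toxic
print(dominant_toxicity([0.1, 0.0, 0.1, 0.0, 0.2, 0.0]))    # -> None
```

The threshold keeps clearly non-toxic comments from being forced into a toxicity class just because some class must have the largest probability.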

Have a look at https://github.com/prashant-kikani/toxic-comment-classifier/blob/master/test.ipynb to get a clear idea.


Languages

Language: Jupyter Notebook 84.9%, Python 15.1%