vzhou842 / profanity-check

A fast, robust Python library to check for offensive language in strings.

Home Page: https://pypi.org/project/profanity-check

Doesn't understand context

hwsamuel opened this issue

The library seems to work more like a dictionary lookup for swear words. For example, it correctly tags "fucking idiot" as negative, but it also tags "fucking awesome!" as negative. Maybe the training set's features were unigrams?

From my point of view, this happens because of the learning algorithm the library uses. Because the text is tokenized into individual words, "fucking" gets a very high probability of being profane, since it is profane in any context. For example, you cannot say "fucking awesome!" in a professional environment. If you were to add "fucking awesome!" to clean_data.csv, you would label it 1 (profane), not 0 (not profane).
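
To illustrate the point about per-word tokenization, here is a minimal sketch of a unigram scorer. It is not the library's actual model or training code: the tiny `TRAIN` set, the function names, and the smoothed log-odds weighting are all assumptions made up for this example. It only shows how a model that scores each token independently ends up flagging both phrases from the issue.

```python
import math
from collections import Counter

# Hypothetical toy training data (NOT the library's dataset):
# (text, label) pairs where 1 = profane and 0 = not profane.
TRAIN = [
    ("fucking idiot", 1),
    ("you fucking moron", 1),
    ("what an idiot", 1),
    ("that was awesome", 0),
    ("awesome work today", 0),
    ("have a nice day", 0),
]


def tokenize(text):
    """Lowercase, drop exclamation marks, split on whitespace."""
    return text.lower().replace("!", "").split()


def train_unigram_weights(examples):
    """Return per-token log-odds of the profane class (add-one smoothing)."""
    profane, clean = Counter(), Counter()
    for text, label in examples:
        (profane if label == 1 else clean).update(tokenize(text))
    vocab = set(profane) | set(clean)
    profane_total = sum(profane.values()) + len(vocab)
    clean_total = sum(clean.values()) + len(vocab)
    return {
        tok: math.log((profane[tok] + 1) / profane_total)
        - math.log((clean[tok] + 1) / clean_total)
        for tok in vocab
    }


def score(text, weights):
    """Sum the independent token weights; unseen tokens contribute 0."""
    return sum(weights.get(tok, 0.0) for tok in tokenize(text))


def is_profane(text, weights):
    """Flag the text when the summed log-odds are positive."""
    return score(text, weights) > 0


weights = train_unigram_weights(TRAIN)
for phrase in ("fucking idiot", "fucking awesome!", "awesome work today"):
    print(phrase, "->", is_profane(phrase, weights))
```

In this toy setup the large positive weight on the swear word outweighs the negative weight on "awesome", so both phrases from the issue are flagged while "awesome work today" is not. A model with bigram features could in principle weight the pair "fucking awesome" separately, but whether it should be labeled 0 is still a labeling decision, as noted above.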