Doesn't understand context

Question

hwsamuel opened this issue 4 years ago · comments

The library seems to work more like a dictionary lookup for swear words. For example, it correctly tags "fucking idiot" as negative, but it also tags "fucking awesome!" as negative. Maybe the training set's features were unigrams?
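To illustrate the unigram guess, here is a minimal sketch in plain Python. It is not the library's actual feature extraction; the `ngrams` helper is a hypothetical name written only for this example. It shows that the two phrases share a feature when split into single words, but share nothing when split into word pairs.

```python
def ngrams(text, n):
    """Return the word n-grams of text, lowercased, with edge punctuation stripped."""
    words = [w.strip("!?.,").lower() for w in text.split()]
    words = [w for w in words if w]
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


insult = "fucking idiot"
praise = "fucking awesome!"

# With unigram features, both phrases contain the same profane token.
shared_unigrams = set(ngrams(insult, 1)) & set(ngrams(praise, 1))

# With bigram features, the two phrases have no feature in common,
# so a model could in principle score them differently.
shared_bigrams = set(ngrams(insult, 2)) & set(ngrams(praise, 2))

print("shared unigrams:", shared_unigrams)
print("shared bigrams:", shared_bigrams)
```

If the features really are unigrams, the model has no way to tell which word the swear word is attached to.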

Menelaos Kotoglou · Answer 1 · Mon Nov 30 2020 20:06:39 GMT+0800 (China Standard Time)

From my point of view, this happens because of the learning algorithm the library uses. Because the text is tokenized into individual words, "fucking" gets a very high probability of being profane, since it counts as profane in any context. For example, you cannot say "fucking awesome!" in a professional environment. If you were to add "fucking awesome!" to clean_data.csv, you would label it 1 (profane), not 0 (not profane).
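The effect of per-word probabilities and of the labeling choice can be sketched with a toy unigram Naive Bayes classifier. This is only an assumption-laden illustration: the training rows in `TOY_DATA` are made up, the functions `train` and `profane_probability` are hypothetical names, and the library's real model and the contents of clean_data.csv may differ.

```python
import math
from collections import Counter

# Hypothetical training rows in (text, label) form; 1 = profane, 0 = not profane.
TOY_DATA = [
    ("fucking idiot", 1),
    ("you fucking moron", 1),
    ("fucking hell", 1),
    ("fucking useless", 1),
    ("awesome work", 0),
    ("you are awesome", 0),
    ("what a nice day", 0),
    ("thank you", 0),
]


def tokenize(text):
    """Split on whitespace, lowercase, and strip edge punctuation."""
    words = [w.strip("!?.,").lower() for w in text.split()]
    return [w for w in words if w]


def train(examples):
    """Collect per-class token counts and document counts."""
    token_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        token_counts[label].update(tokenize(text))
        doc_counts[label] += 1
    vocab = set(token_counts[0]) | set(token_counts[1])
    return token_counts, doc_counts, vocab


def profane_probability(model, text):
    """Posterior probability of label 1 under unigram Naive Bayes."""
    token_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in (0, 1):
        total_tokens = sum(token_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(text):
            # Laplace (add-one) smoothing so unseen tokens do not zero out a class.
            count = token_counts[label][token] + 1
            score += math.log(count / (total_tokens + len(vocab)))
        log_scores[label] = score
    # Convert the two log scores into a normalized probability.
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    return weights[1] / (weights[0] + weights[1])


model = train(TOY_DATA)
print(profane_probability(model, "fucking idiot"))
print(profane_probability(model, "fucking awesome!"))

# Same data plus one row that labels the phrase as not profane.
relabeled = train(TOY_DATA + [("fucking awesome", 0)])
print(profane_probability(relabeled, "fucking awesome!"))
```

On this toy data, "fucking awesome!" comes out above 0.5 because "fucking" appears only in profane rows. Adding a single row that labels the phrase 0 pulls it below 0.5, so the outcome follows the labeling policy as much as the algorithm.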