vzhou842 / profanity-check

A fast, robust Python library to check for offensive language in strings.

Home Page: https://pypi.org/project/profanity-check


Model must be upgraded

Cafeepy opened this issue · comments

The sklearn model bundled with this package was built with sklearn 0.20.2. The latest stable version of sklearn is 0.23.2, and it is not compatible with the 0.20.2 model. Running this package on sklearn 0.23 or later fails with an unpickling error. Installing an earlier version of sklearn is possible, but versions before 0.22 are not compatible with Python 3.8. As others have noted, there is also a significant performance decrease under sklearn 0.22.2, along with several runtime warnings about the lack of backwards compatibility.
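To make the mismatch concrete, here is a minimal sketch of a version guard that could run before the model is loaded. It is not part of profanity-check: the names `MODEL_SKLEARN_VERSION`, `major_minor`, `same_release_series`, and `installed_sklearn_version` are hypothetical, and treating a matching major.minor series as "compatible" is only an assumption, since scikit-learn does not guarantee that pickles load across versions.

```python
import re
from importlib import metadata

# Version the bundled model was built with, as reported in this issue.
MODEL_SKLEARN_VERSION = "0.20.2"


def major_minor(version):
    """Return the (major, minor) pair of a version string such as '0.23.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def same_release_series(installed, trained=MODEL_SKLEARN_VERSION):
    """True when both versions share the same major.minor series.

    Assumption: the same series is used as a rough proxy for pickle
    compatibility; scikit-learn itself makes no such promise.
    """
    return major_minor(installed) == major_minor(trained)


def installed_sklearn_version():
    """Return the installed scikit-learn version, or None if it is absent."""
    try:
        return metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_sklearn_version()
    if found is None:
        print("scikit-learn is not installed")
    elif not same_release_series(found):
        print(f"model built with {MODEL_SKLEARN_VERSION}, found {found}")
```

A check like this would only turn the unpickling error into a clearer message; it does not remove the need for a retrained model.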

If this library is to be maintained, all I ask is that the model be upgraded or retrained to be compatible with sklearn 0.23.2 and Python 3.8, or at least that the code and data used to train the original model be provided so that others can retrain it themselves. Otherwise, as Python 3.9 and later versions arrive, this library will unfortunately become obsolete.

I really admire this library; honestly, all it needs now is a bit of polishing. Thanks for your time!

I found an article written by the author. It describes the datasets he used:
https://towardsdatascience.com/building-a-better-profanity-detection-library-with-scikit-learn-3638b2f2c4c2


@Cafeepy this article also contains the script that was used to train the model.
I found the link on the PyPI page for profanity-check.