areebbeigh / profanityfilter

A universal Python library for detecting and filtering profanity

Home Page: https://pypi.python.org/pypi/profanityfilter

Provide a way to know which words were censored

OddBloke opened this issue

I would like to be able to collect the words that ProfanityFilter.censor censors into my logs, so that I can decide whether to whitelist them as they come up.

This is something that I'm interested in contributing. There are two ways we could do this to work for my project:

  1. Return the censored words from the .censor() call in some way
    To avoid modifying the signature of .censor() I would propose creating a str subclass with an additional censored_words attribute; callers who don't care about it would not need to know about it.

  2. Store the censored words on the ProfanityFilter object so that they can be queried after the .censor() call
    I would propose .get_words_censored_in_session().
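
A minimal sketch of option 1, to make the idea concrete. The names `CensoredStr` and the module-level `censor()` function are hypothetical stand-ins, not part of profanityfilter, and the word matching here is deliberately naive (whole-word, case-insensitive) rather than whatever the library actually does:

```python
import re


class CensoredStr(str):
    """Hypothetical str subclass that also carries the censored words."""

    def __new__(cls, text, censored_words=()):
        obj = super().__new__(cls, text)
        # Extra attribute; callers who ignore it just see a normal str.
        obj.censored_words = list(censored_words)
        return obj


def censor(text, profane_words, censor_char="*"):
    """Stand-in for ProfanityFilter.censor(); not the library's implementation."""
    found = []

    def _mask(match):
        word = match.group(0)
        if word.lower() in profane_words:
            found.append(word)
            return censor_char * len(word)
        return word

    return CensoredStr(re.sub(r"\w+", _mask, text), found)


result = censor("That darn cat", {"darn"})
print(result)                 # usable anywhere a plain str is expected
print(result.censored_words)  # the words that were replaced
```

One caveat with this approach: str methods such as `.upper()` or slicing return a plain `str`, so the attribute has to be read from the value that `.censor()` returned, before any further string operations.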

(Of course, there's no reason we couldn't support both.)
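
For the variant that stores the words on the filter object, something along these lines might work. `TrackingFilter` is a hypothetical stand-in class written only for illustration; just the method name `get_words_censored_in_session()` comes from the proposal, and the matching logic is an assumption rather than the library's own:

```python
import re


class TrackingFilter:
    """Hypothetical stand-in for ProfanityFilter that remembers what it censored."""

    def __init__(self, profane_words, censor_char="*"):
        self._profane_words = {w.lower() for w in profane_words}
        self._censor_char = censor_char
        self._censored_in_session = []

    def censor(self, text):
        """Return text with profane words masked; the signature stays str -> str."""

        def _mask(match):
            word = match.group(0)
            if word.lower() in self._profane_words:
                self._censored_in_session.append(word)
                return self._censor_char * len(word)
            return word

        return re.sub(r"\w+", _mask, text)

    def get_words_censored_in_session(self):
        """Return a copy, so callers cannot mutate the internal record."""
        return list(self._censored_in_session)


pf = TrackingFilter({"darn", "heck"})
first = pf.censor("That darn cat")
second = pf.censor("What the heck")
session_words = pf.get_words_censored_in_session()
print(session_words)
```

In this sketch the record accumulates across calls, so a long-running process would probably also want a way to clear it; whether "session" should mean the object's lifetime or a single call is left open here.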

What do you think?

Hi!
It would be great to be able to get the list of words used by default, and then either extend it as needed (without duplicating words, since the module stores them in lists) or remove (whitelist) some of them.
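
As a rough illustration of the extend-without-duplicates and whitelist behaviour described above, here is a small sketch. `DEFAULT_WORDS` is a placeholder list and `build_word_list()` is a hypothetical helper; neither is taken from profanityfilter:

```python
# Placeholder defaults for illustration only, not the library's real word list.
DEFAULT_WORDS = ["darn", "heck", "drat"]


def build_word_list(defaults, extra=(), whitelist=()):
    """Merge defaults with extra words, skipping duplicates and whitelisted words."""
    allowed = {w.lower() for w in whitelist}
    seen = set()
    merged = []
    for word in list(defaults) + list(extra):
        key = word.lower()
        if key in allowed or key in seen:
            continue
        seen.add(key)
        merged.append(key)  # order of first appearance is preserved
    return merged


words = build_word_list(DEFAULT_WORDS, extra=["heck", "blast"], whitelist=["drat"])
print(words)
```

Keeping a set alongside the list avoids duplicates while preserving order, which matters if the word list is ever written back to a file.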