Find word from document in word list containing word roots
NelsonPython opened this issue · comments
The original detector tried to look up each word from the document in a list of words. But the list often holds only the root of a word. For example, "compet" is in the list, while the word in the document is "competition". "Competition" is longer than "compet", so you cannot find "competition" in "compet", but you can find "compet" in "competition". The check has to run in that direction: look for the list entry inside the document word.
This is a code snippet that worked for me:
found = False
for word, start, stop in token_indices:
    # NELSON - changed loop so word can be found in GENDERED_WORDS
    if word.lower() in GENDERED_WORDS:
        found = True
        report.add_flag(
            Flag(start, stop, Issue(
                "{word}".format(word=word)
            ))
        )
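
For anyone who wants to try the root-matching idea outside the detector, here is a minimal standalone sketch. It assumes the list holds plain lowercase roots and that `token_indices` yields `(word, start, stop)` tuples as in the snippet above. `GENDERED_ROOTS`, `find_root`, and `flag_tokens` are hypothetical names for illustration, not part of the project, and only "compet" is taken from this issue.

```python
# Hypothetical stand-in for the project's word list; only "compet" comes from this issue.
GENDERED_ROOTS = ["compet"]


def find_root(word, roots):
    """Return the first root contained in the word, or None if no root matches."""
    lowered = word.lower()
    for root in roots:
        # Search for the (short) root inside the (longer) document word,
        # not the other way around.
        if root in lowered:
            return root
    return None


def flag_tokens(token_indices, roots):
    """Collect (word, start, stop) for every token that contains a root."""
    flags = []
    for word, start, stop in token_indices:
        if find_root(word, roots) is not None:
            flags.append((word, start, stop))
    return flags


tokens = [("Competition", 0, 11), ("is", 12, 14), ("fun", 15, 18)]
print(flag_tokens(tokens, GENDERED_ROOTS))  # [('Competition', 0, 11)]
```

Note that `root in lowered` matches the root anywhere in the word. If roots should only match at the start of a word, `lowered.startswith(root)` would be a stricter alternative.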