Find word from document in word list containing word roots

Question

NelsonPython opened this issue 5 years ago

The original detector tried to find each word from the document in a list of words. However, the list often contains only a word root, for example "compet", while the document contains the full word, "competition". Because "competition" is longer than "compet", you cannot find "competition" in "compet", but you can find "compet" in "competition".
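To make the direction of the test concrete, here is a minimal sketch in plain Python. The names `roots` and `word` are made up for illustration and are not taken from the detector.

```python
roots = ["compet"]      # hypothetical excerpt of the word list
word = "competition"    # a word taken from the document

# List membership compares whole strings, so the full word is not found.
exact_match = word in roots
# Substring test with the root inside the word: this is the direction that works.
root_in_word = any(root in word for root in roots)
# Reversed substring test: the longer word cannot sit inside the shorter root.
word_in_root = any(word in root for root in roots)

print(exact_match, root_in_word, word_in_root)
```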

This is a code snippet that worked for me:

    found = False
    for word, start, stop in token_indices:
        # NELSON - changed loop so word can be found in GENDERED_WORDS
        if word.lower() in GENDERED_WORDS:
            found = True
            report.add_flag(
                Flag(start, stop, Issue("{word}".format(word=word)))
            )
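
Note that if `GENDERED_WORDS` is a list, `word.lower() in GENDERED_WORDS` only matches whole entries. The following is a rough sketch of one way to test each root against the word instead, as described above. It assumes `token_indices` yields `(word, start, stop)` tuples as in the snippet. The helper name `find_root_matches` and the sample data are hypothetical, and `Flag`, `Issue` and `report` are left out so the example runs on its own.

```python
# Hypothetical sample data; the real GENDERED_WORDS and token_indices
# come from the detector and are not shown in this issue.
GENDERED_WORDS = ["compet", "decisi"]


def find_root_matches(token_indices, roots):
    """Return (word, start, stop, root) for each token that contains a root."""
    matches = []
    for word, start, stop in token_indices:
        lowered = word.lower()
        for root in roots:
            if root in lowered:  # the root is a substring of the document word
                matches.append((word, start, stop, root))
                break  # one match per token is enough
    return matches


token_indices = [("Competition", 0, 11), ("is", 12, 14), ("healthy", 15, 22)]
print(find_root_matches(token_indices, GENDERED_WORDS))
```

Each returned tuple carries the `start` and `stop` offsets that a `Flag` would need. A plain substring test can also match a root in the middle of an unrelated word, so `lowered.startswith(root)` may be a stricter alternative if the list holds only prefixes.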